Navigating the Fortran Standard document

I would be interested in hearing from people who often refer to the Fortran Standard docs (or their drafts) on how they navigate the documents. I periodically have to search through the Fortran Standard to check how I should implement feature X in the fortls parser, and for me it is always a rather painful process.

My workflow uses the PDF version of the draft:

  • Opening two windows of the PDF in a PDF viewer (I use Okular)
  • Keeping one window open at the Index page to look up definitions (with the TOC shown on the side)
  • Using the second window to manually jump to the definitions and browse the Standard

That does not seem like a very efficient way to navigate the Fortran Standard, but I haven't thought of anything better. Do others navigate/search through the Standard in a different way? I would be interested to hear about it. This thread could hopefully be of use to people who want to contribute to open source compilers like gfortran and LFortran but do not know where to get started in terms of the missing functionality they want to implement.

2 Likes

A useful tool for what you want to do is, perhaps, PDFGrep. I used it some months ago to obtain the number of "Constraints" in the Fortran standard. Give it a try!
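For instance, a rough count might look something like this (the Cnnn/Cnnnn pattern for constraint labels and the file name are my assumptions, not necessarily what was actually used):

$ pdfgrep -P '^\s*C\d{3,4}\b' 18-007r1.pdf | wc -l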

2 Likes

Thanks @mecej4 for pointing this out. Will give it a go.

Unfortunately, Okular and evince do not allow regex search.

Maybe we could build a script using pdfgrep with its -n option, and okular with its -p option to open the right page:

$ pdfgrep "cosh\(" 18-007r1.pdf -n
375:10   5 Result Value. The result has a value equal to a processor-dependent approximation to cosh(X). If X is of type
$ okular -p 375 18-007r1.pdf
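A minimal sketch of such a wrapper, just to illustrate the idea (the hard-coded PDF name is a placeholder):

#!/bin/sh
# Search the Fortran standard and open Okular at the first matching page.
# pdfgrep -n prefixes each match with its page number, so the field before
# the first colon is the page to open.
pdf=18-007r1.pdf
page=$(pdfgrep -n "$1" "$pdf" | head -n 1 | cut -d: -f1)
[ -n "$page" ] && okular -p "$page" "$pdf"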
2 Likes

On a tangential note, let me know what it would take for fortls to use LFortran as the parser. :slight_smile: As far as I know we should be able to parse any free-form to AST ("beta" quality), and a lot of fixed-form ("alpha" quality), all of SciPy for example.

I am happy to add features, make it installable separately so that you could (if you want to) vendor the files into your Python package. It will be more efficient if we can collaborate on the same parser.

2 Likes

Thank you all for sharing your workflows.

We had a few chats about this earlier this summer and I did a little viability study. I identified 2 bottlenecks at the time:

  • fortls is all Python, which makes it very portable; adding LFortran as a dependency would make it harder to deploy via PyPI and pip/pipx
  • the fortls parser is designed to be error resilient. If it fails to parse certain bits of code it will just skip them and continue parsing the rest of the file. I don't remember LFortran's parser working like that, but I might be wrong. I remember the solution we came up with was to save at least one valid state of the ASR from when the file was valid and revert to that.

We would also need 1 or 2 devs to then write the C++ ↔ Python glue.

1 Like

I have just tried xpdf 3.04 (in Ubuntu 22.04), but CTRL+F does not seem to offer regex search, just a simple text search like Okular and evince. Have I missed something?

I generally just ctrl+f for the keyword of interest. It's probably not the most efficient, and sometimes the chosen word provides too many results, but at that point it's kind of like figuring out how to use Google efficiently: pick the right related words. Using the hyperlinks, index and table of contents also works out pretty well most of the time.

3 Likes

I have just pushed a small shell script to search a string in the Fortran standard and open the page(s) in Okular:

It is based on pdfgrep, cut and okular. It is just a prototype but it already does a decent job:

$ ./fss co_sum
373:37     16.9.50      CO_SUM (A [, RESULT_IMAGE, STAT, ERRMSG])

The regex will detect strings like "12.5 blabla", "12.5.1 blabla", "12.5.1.2 blabla", etc. See the README.md for more information, and tell me what you think.
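For reference, a heading pattern along those lines might look like this (just an illustration, not necessarily the exact regex used by fss):

$ pdfgrep -nP '^\s*\d+(\.\d+)*\s+.*CO_SUM' 18-007r1.pdf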

6 Likes

I have added some improvements proposed by @msz59. And in the coming days, I will add some options to the fss command to modify the regex behavior.

I have also discovered that pdfgrep has an option --cache that dramatically speeds up the search when the command is used several times. The first time it takes several seconds, and the next time the search is instantaneous :rocket:.
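For example, any of the previous searches can be run with the cache enabled like this:

$ pdfgrep --cache -n "co_sum" 18-007r1.pdf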

4 Likes

I thought just using the -A (--after-context) option of pdfgrep could be a simple alternative. With for example -A 30 we could print the 30 lines following the matching line, so that we don't need to open the PDF. But if the result is at the bottom of a page, pdfgrep stops printing at the end of the page. In the example below, we therefore miss the interesting information:

$ pdfgrep -A 30 -iP "\d+\s+abs " 18-007r1.pdf
15     16.9.2        ABS (A)
16   1 Description. Absolute value.

17   2 Class. Elemental function.

18   3 Argument. A shall be of type integer, real, or complex.



                                                            J3/18-007r1                                               339


$

Updated on 2022-10-30:

I have tested a simple solution in Kubuntu (KDE):

  • Create a launcher on your desktop that executes the command okular -p 616 18-007r1.pdf (look in the properties of the icon; a sample launcher file is sketched after this list).
  • When you click on that icon, the Fortran standard is now opened at the beginning of the final index (page 616),
  • so when you type CTRL+F, the search begins in the index instead of at the top of the document,
  • and if you find what you want in the index, you just have to click on the page number to go to the section. If it is not in the index, the search continues at the beginning of the document.
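On KDE such a launcher is just a .desktop file, roughly like this (the name and the PDF path are of course placeholders):

[Desktop Entry]
Type=Application
Name=Fortran 2018 standard (index)
# Open the PDF directly at the index page
Exec=okular -p 616 /path/to/18-007r1.pdf
Icon=okular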

Of course the standard PDF is not totally representable as flat text, and the cross-links (somewhat reminiscent of GOTOs :>) in the PDF are great for finding definitions of terms; but I find the search capabilities of vim with text files so useful, I often do something like

 pdfgrep .  $FILENAME.pdf  > $FILENAME.txt

and look through the text version with vim(1) as a supplement when trying to navigate the PDF. I find I can locate what I want much more easily in the text version, or find where I want to look in the PDF far more easily than using just the PDF.

vim $FILENAME.txt

:g/REGX/#
/REGX/
n
p

and so on.
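For anyone not fluent in vim: the :g/REGX/# command lists every line matching REGX together with its line number, /REGX/ searches forward for the pattern, and n jumps to the next match.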

2 Likes

Thanks, that's really a beautiful way to use pdfgrep! I tried the command pdf2txt but the layout was not good. Here pdfgrep does a really good job.

The only drawback is that there is an empty line between each line of text, but the problem is solved by:

$ pdfgrep '^.*$' 18-007r1.pdf > 18-007r1.txt

The following command can now give a good result:

$ grep -iP "\d+\s+co_sum \(" -A 30 18-007r1.txt

Well, I have a little bias about how to look up intrinsics, but I am still working on it (especially the co_* procedures):

https://raw.githubusercontent.com/urbanjost/M_intrinsics/master/standalone/fman.f90

but you might use sed with the intrinsics and look for

^ *16\..*co_sum

to the next line starting with '^ *16.' for the intrinsics.
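A minimal sed sketch of that idea (assuming the 18-007r1.txt dump produced earlier in the thread; the exact pattern may need tweaking depending on how the margin line numbers come out in the text):

$ sed -n '/^ *16\..*CO_SUM/,/^ *16\./p' 18-007r1.txt

This prints from the first line matching the CO_SUM heading pattern up to and including the next line that starts with a clause-16 section number.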

Can't they make the .tex available? Has anyone tried emailing them?
We could make a web browsable version ourselves.

4 Likes

I think the tex version is available to members of the committee (I think I've seen it somewhere). But the problem is that we need permission from J3, WG5, INCITS and possibly ISO to post it. Here is an issue for this: Put the standard on GitHub · Issue #48 · j3-fortran/fortran_proposals · GitHub.

2 Likes

The command pdftotext (not to be confused with pdf2txt) does a good job with its -layout option:

$ pdftotext -layout 18-007r1.pdf

Under Ubuntu, that command is in the poppler-utils package.

1 Like