Interest in refactoring/replacing FORD parser

We have now integrated a C preprocessor into LFortran and plan to add other preprocessors, such as fypp, as well.

The location information should now be fixed (it also works with the preprocessor, even within a line that contains expanded macros). If you discover any incorrect location information, please let us know.

We have also implemented Rust-style error messages; see here for an example: Add Rust style error messages (!1490) · Merge requests · lfortran / lfortran · GitLab.

We are now focusing on the --symtab-only mode. Once we can do all of Fortran in this mode, we’ll let you know.


Are you using the standard C preprocessor or Sun’s modified version that deals with such idiosyncrasies as // being used for comments in C and for string concatenation in Fortran?

Neither. We wrote our own in order to obtain accurate location information for good error reporting. However, the C preprocessor is separate: it runs before the prescanner and is not part of it. It could be extracted into a separate project, or into fpm. As long as a preprocessor hands over a few remapping arrays, LFortran can use them to give back accurate locations for error messages and other uses. Most external preprocessors do not provide such accurate remapping arrays, so you can only locate an error to the line, not to the column.
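To make the remapping idea concrete, here is a minimal Python sketch. It is not LFortran's implementation: the function names and the three arrays (`out_start`, `in_start`, `interval_type`) are hypothetical, and it only handles object-like macros in a single string. The point is just that each interval of the output records where it came from in the input, so an offset in the preprocessed text can be mapped back to an exact column in the original.

```python
import bisect
import re


def expand_macros(src, macros):
    """Expand object-like macros and record remapping intervals.

    Returns (output, out_start, in_start, interval_type). These arrays are
    a hypothetical layout chosen for this sketch:
      out_start[i]     offset in the output where interval i begins
      in_start[i]      offset in the input that interval i came from
      interval_type[i] 0 = copied 1:1, 1 = produced by a macro expansion
    """
    parts, out_start, in_start, interval_type = [], [], [], []
    out_pos = in_pos = 0
    for m in re.finditer(r"[A-Za-z_]\w*", src):
        if m.group(0) not in macros:
            continue
        if m.start() > in_pos:
            # Text before the macro is copied unchanged.
            chunk = src[in_pos:m.start()]
            out_start.append(out_pos)
            in_start.append(in_pos)
            interval_type.append(0)
            parts.append(chunk)
            out_pos += len(chunk)
        # The macro name is replaced by its body.
        body = macros[m.group(0)]
        out_start.append(out_pos)
        in_start.append(m.start())
        interval_type.append(1)
        parts.append(body)
        out_pos += len(body)
        in_pos = m.end()
    if in_pos < len(src):
        # Remaining tail is copied unchanged.
        out_start.append(out_pos)
        in_start.append(in_pos)
        interval_type.append(0)
        parts.append(src[in_pos:])
    return "".join(parts), out_start, in_start, interval_type


def remap(out_offset, out_start, in_start, interval_type):
    """Map an offset in the preprocessed text back to the original text."""
    i = bisect.bisect_right(out_start, out_offset) - 1
    if interval_type[i] == 0:
        return in_start[i] + (out_offset - out_start[i])
    # Inside an expansion: point at the start of the macro name.
    return in_start[i]


src = "x = N + y"
out, o, s, t = expand_macros(src, {"N": "(100)"})
print(out)                  # x = (100) + y
print(remap(12, o, s, t))   # 8, the column of "y" in the original
print(remap(6, o, s, t))    # 4, the column of the macro name "N"
```

An external preprocessor that only emits `#line`-style markers cannot supply the equivalent of `in_start` for intervals inside a line, which is why only line-level locations are recoverable in that case.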

From my recent experiments with getting a C preprocessor running for a project with very specific constraints, a standalone project that installs just a preprocessor would have been quite useful. Something like GitHub - ned14/pcpp: A C99 preprocessor written in pure Python (this one unfortunately failed me because of a change in Python that broke opening files, due to encoding or something similar; I didn’t really investigate it further).

It’s implemented in these two files:

I don’t think there is any other hard dependency. It fills these 3 arrays with remapping information:

But doesn’t need anything else.

One thing that we should still do is make the location information recursive: currently it only remaps the one file that is being preprocessed. So if that file includes another file, the location info currently points to the #include "something.h" line instead of into something.h itself. Fortunately, for Fortran this is not needed as often as in C++. But it would be nice to do it recursively. If anybody is interested in helping, let me know.
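As a rough illustration of what "recursive" could mean here, the following Python sketch expands includes and keeps, for every output line, the whole chain of (file, line) frames that led to it. Everything in it is an assumption made for the example: the `expand_includes` function, the in-memory `files` dictionary standing in for the file system, and the line-level (not column-level) granularity. It is one possible shape for the data, not a description of how LFortran does or will do it.

```python
import re

INCLUDE = re.compile(r'\s*#include\s+"([^"]+)"')


def expand_includes(name, files, stack=()):
    """Expand #include lines recursively.

    `files` is a hypothetical in-memory mapping of file name to contents.
    Returns a list of (text, origin) pairs, where origin is a tuple of
    (file, line) frames, outermost file first.
    """
    if any(f == name for f, _ in stack):
        raise ValueError(f"recursive include of {name}")
    out = []
    for lineno, text in enumerate(files[name].splitlines(), start=1):
        frame = stack + ((name, lineno),)
        m = INCLUDE.match(text)
        if m:
            # Descend into the included file, remembering where we came from.
            out.extend(expand_includes(m.group(1), files, frame))
        else:
            out.append((text, frame))
    return out


files = {
    "main.f90": 'program p\n#include "something.h"\nend program\n',
    "something.h": "integer :: n\nreal :: x\n",
}
for text, origin in expand_includes("main.f90", files):
    print(text, "<-", " -> ".join(f"{f}:{n}" for f, n in origin))
```

With the chain available, an error on `real :: x` can be reported at `something.h:2` and still mention that it was included from `main.f90:2`, rather than pointing only at the `#include` line.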

We can do something similar for fypp, and we can include both in LFortran, enabled with options like --cpp and --fypp (or both!). We can also include both in fpm itself.