I just published the first release of fpx, an extended Fortran preprocessor.
This project aims to provide a simple, embeddable, open-source preprocessor written in modern Fortran. fpx is mostly compliant with the C preprocessor, fine-tuned for the specifics of the Fortran language. It can be used as a command-line tool or embedded directly into any solution.
fpx supports:

- conditional compilation with `#if`, `#ifdef`, `#ifndef`, `#elif`, `#else`, `#endif`
- simple macros and function-like macros with `#define`, `#undef`, `defined`, and `!defined`
- simple arithmetic and bitwise operations with `+`, `-`, `*`, `**`, `/`, `>`, `<`, `>=`, `<=`, `||`, `&&`, `|`, `^`, `&`, `!`, and `~`
- include files with `#include`
- variadic macros with `__VA_ARGS__` and `__VA_OPT__`
- built-in macros such as `__LINE__`, `__FILE__`, `__FILENAME__`, `__TIME__`, `__DATE__`, `__TIMESTAMP__`
- stringification with `#` and concatenation with `##`
- and more…
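As a quick illustration of these features, here is a hypothetical input file (macro names and values are invented for this example, not taken from the fpx test suite), assuming the cpp-style semantics listed above:

```fortran
! Hypothetical input for fpx (macro names invented for illustration).
! Stringification (#), concatenation (##), and a variadic macro:
#define STR(x) #x
#define CAT(a, b) a##b
#define LOG(fmt, ...) print fmt __VA_OPT__(,) __VA_ARGS__
#define DEBUG

program demo
   ! CAT(count_, 1) expands to the identifier count_1
   integer :: CAT(count_, 1) = 42
! conditional compilation, with arithmetic in the condition
#if defined(DEBUG) && (1 + 2 > 2)
   print *, STR(hello)
   LOG('(a, i0)', 'count: ', count_1)
#endif
end program
```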
## Command line
The preprocessor fpx can be used from the command line using your favorite shell. The following options are available:
| Option | Definition |
|---|---|
| `-D<name>` | Define `<name>` with no value. |
| `-D<name>=<value>` | Define `<name>` with `<value>` as its value. |
| `-U<name>` | Undefine `<name>`. |
| `-I<path>` | Add `<path>` to the end of the global include paths. |
| `-h`, `-?` | Display this help. |
| `-o<file>` | Output file path with name and extension. |
| `-v` | Display the version of the program. |
Using the file preprocessor could not be easier: the `preprocess` function simply takes the input and output file paths as arguments.
## Embedded
```fortran
program test
   use fpx_parser
   implicit none
   call preprocess('tests/input.f90', 'tests/output.f90')
end program
```
For more examples, please refer to the Documentation.
At the moment, fpx passes all the tests from LFortran and the tests of flang targeting modern Fortran. Some of the flang tests might differ since they are quite opinionated, especially concerning the rules for line continuation in and after macro substitution.
Before proceeding with the development of fpx, I would like to propose this survey to determine which functionalities are the most useful. I collected some ideas of my own, but if you have more, do not hesitate to share.
- Support a `#foreach(a,b,c,…)` keyword, to build multiple versions of the same function with various kinds
- Look for `#include` files in `PATH`
- Read response files (`.rsp`) to substitute Intel fpp with fpx
- Read global macro definitions from a configuration file
- Evaluate string intrinsic functions at preprocessing time (e.g. `__FILE__(index('/', back=.true.):index('.', back=.true.))` to return the filename)
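To make the first idea concrete, a `#foreach` directive (the syntax here is invented for illustration and not implemented in fpx) could stamp out one copy of a body per kind in the list, assuming the loop variable is substituted both in types and in identifiers:

```fortran
! Hypothetical #foreach syntax (invented for illustration, not implemented):
! the body below would be emitted once per kind, with T substituted,
! producing norm_real32, norm_real64, and norm_real128.
#foreach(T, real32, real64, real128)
   pure function norm_T(x) result(r)
      real(T), intent(in) :: x(:)
      real(T) :: r
      r = sqrt(sum(x*x))
   end function
#endforeach
```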
Another feature that would be nice would be having the preprocessor invoked automatically by the Fortran compiler, i.e. without a separate preprocessing step and without a separate intermediate output file. Requiring a separate output file makes debugging more difficult, because line numbers will refer to the intermediate file rather than to the original source file, and it complicates the build process because of the extra step in the `*.f90 → *.tmp → *.o` chain.
I think the preprocessed files offer hints for the correct next line number in the form `# <number>`, don't they? (At least ifort/ifx do.)
As @rwmsu posted (while I started writing this), there’s already an effort to standardize the preprocessor, and also an effort to obviate the need of #foreach (...) through the auto-generic subprograms.
So, how would an external preprocessor fit in, when it is supposed to always be run by the compiler, regardless of flag or file extension?
That’s the idea. I would like to offer the possibility of substituting fpp or cpp with fpx without any additional step. I started to collect some information about this; the results of my research are available in the Documentation. Intel, for instance, proposes the `fpp-name` flag to pass the name of another preprocessor. All I am missing is the ability to read the response files used to pass the command-line arguments.
Besides this, fpx already supports reading from stdin.
Finally, as mentioned by @jwmwalrus, the `#line` directive is used to keep track of the original file line number. Both fpp and cpp work this way. It is not supported in fpx yet, though. Maybe this should be the next thing to do.
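For reference, the line markers in typical cpp-style output look like this (a sketch with invented file names, not actual fpx output):

```fortran
! Preprocessed output: each marker tells the compiler which file and
! line the following source lines originally came from.
# 1 "main.f90"
program main
# 1 "params.inc"
integer, parameter :: n = 10   ! these lines map back to params.inc
# 3 "main.f90"
print *, n                     ! this maps back to main.f90, line 3
end program
```

With these markers, compiler diagnostics and debuggers can point at the original source rather than the intermediate file.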
Features I find useful include:

- an option to recognize environment variables as defining a macro
- a block-text feature that can optionally export the text to a file for post-processing
- an option to name a block of text and reuse it for templating
- allowing `#include` to fetch files via curl/wget options
- allowing `#include` to filter plain-text input into a character-array definition, or into base64 data (allowing the inclusion of PNG, PNM, GIF, … files, for example)

I find I use these more than the basic if/else/elseif/endif selection options.
Thanks for bringing this to my attention. I was not aware of this. I hope it can finally make it to the standard.
That said, history has shown that standardizing the preprocessor is not that easy (see the coco case). The big difference I see with the coco attempt, though, is that this time the proposal is much closer to a C-compliant preprocessor, with tweaks, which I am perfectly aligned with.
(@jwmwalrus, I try to answer your comment in the same post.)
I do not see fpx as competing with that proposal: 202Y is not coming soon, and from what I understand there is no guarantee that the proposal will make it into the final standard. In the meantime, there is room for fpx.
Moreover, as @certik often advocates, there is a need for prototyping new features. So if fpx can be used as a prototype and finally gets retired in a decade or so, it will have served its purpose.
From what I can tell, fpx is quite compatible with the J3 paper, so it can really be used as a prototype, and code compatible with it will remain compatible with a standard preprocessor.
Finally, I have seen recently that @FedericoPerini added some preprocessing support to fpm. This shows that there is a need for an embeddable preprocessor, which is not possible if it is part of the compiler.
I would be more than happy to work with the authors of the proposal to see how fpx can get closer to their proposal.
Something I have never provided in a preprocessor, but found intriguing in coco, was reversibility: all data in the original file was retained in comments and could be recovered.
Originally, a small cpp-like processor (based on stripping down prep) that did not include macros was proposed for inclusion with fpm, with fpm.toml automatically run through it, since TOML does not directly provide conditionals. At the time, a standard preprocessor was considered relatively imminent and fpm itself was in its infancy, so it was decided to put off such a feature; fpm.toml also looked like it was going to have profile definitions, and the main need at the time appeared to be specifying different compiler options. Since then, the need for other selections, such as dependencies being looked for locally versus fetched via curl, has created more reasons why running fpm.toml through a preprocessor could be useful.
Thanks to those who answered the poll. From the results, it seems that the #foreach and #include directives are on top of the list.
In addition, I have been discussing preprocessing with @bonachea and @pjfasano from the J3 subgroup. It looks like we have a lot in common, and they suggested that fpx could become a 'reference' implementation of the current J3 papers.
After reviewing the papers, it does not look too hard to make fpx 'standard' compliant. It already supports most directives (only missing `#line_number`, `#pragma`, `#error`, and `#warning`), function-like macros, variadic macros, and most arithmetic operations (missing the ternary operator `x ? y : z`). There is a notable difference regarding line continuation, but that should be easy to standardize.
I wanted to start working on the `#include` directive, until I noticed that the papers do not clearly mention the order in which the paths are resolved for `<filename>` and `"filename"`.
My approach would be to stick to what Intel's fpp is doing:

For `#include "filename"`, file names are searched for in the following order:
1. In the directory in which the source file resides
2. In the directories specified by the `I` or `Y` preprocessor option
3. In the default directory

For `#include <filename>`, file names are searched for in the following order:
1. In the directories specified by the `I` or `Y` preprocessor option
2. In the default directory
There is still an open question regarding the 'default' directory. Should it be the directories specified in the `INCLUDE` environment variable? Or something else?
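The quoted-form lookup above can be sketched as follows; the function name, its arguments, and the empty-string convention for "not found" are all assumptions made for illustration, not fpx's actual API:

```fortran
! Sketch of the quoted-form search order discussed above.
! resolve_include and its signature are invented for illustration.
function resolve_include(filename, source_dir, include_paths, default_dir) result(path)
   character(*), intent(in) :: filename, source_dir, default_dir
   character(*), intent(in) :: include_paths(:)
   character(:), allocatable :: path
   logical :: exists
   integer :: i

   ! 1. the directory in which the including source file resides
   path = trim(source_dir)//'/'//filename
   inquire(file=path, exist=exists)
   if (exists) return

   ! 2. the directories given on the command line, in order
   do i = 1, size(include_paths)
      path = trim(include_paths(i))//'/'//filename
      inquire(file=path, exist=exists)
      if (exists) return
   end do

   ! 3. the 'default' directory, whatever that ends up meaning
   path = trim(default_dir)//'/'//filename
   inquire(file=path, exist=exists)
   if (.not. exists) path = ''   ! not found
end function
```

The angle-bracket form would simply skip step 1.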
Having looked at that issue for prep: one behavior that varies between fpp and cpp variants is, when you `#include "filename"` in a nested set of files that are not all in the same directory, whether the name is read relative to the directory the including file was found in, or relative to one of the previous levels. It seems intuitive to read relative to where the file is found, but last time I looked, some preprocessors used the initial directory where everything started, and did not consider that a bug at the time.
In recent years, the question of whether to do anything with the input files regarding encoding has come up. So far I do not think any cpp-like processor does, but if not doing so either, be careful about requiring a fixed line width in bytes, as UTF-8 strings can easily exceed the old 132- or 72-character limits. Fortran has moved to allowing very long lines, so an option to split lines at a certain column number is less desirable anyway.
A "super-include" might convert non-ASCII-7 characters in strings to C-style escape sequences, which more and more Fortran compilers support as an extension, or convert binary files into base64 strings as an option. There are very few restrictions on what the Fortran INCLUDE statement can take as an argument, but `#include` is usually limited to just filenames; I think the Godbolt site and a few others will resolve a URL or curl statement. That is handy in web applications, but probably not going to make it into a standard; INCLUDE, on the other hand, is surprisingly free to allow almost any syntax.
I noticed that there are variants in the search order, and even in how relative paths are resolved. If you look at the Windows doc, it states that:

> **Quoted form.** The preprocessor searches for include files in this order:
> 1. In the same directory as the file that contains the `#include` statement.
> 2. In the directories of the currently opened include files, in the reverse order in which they were opened. The search begins in the directory of the parent include file and continues upward through the directories of any grandparent include files.
> 3. Along the path that's specified by each `/I` compiler option.
> 4. Along the paths that are specified by the `INCLUDE` environment variable.
>
> **Angle-bracket form.** The preprocessor searches for include files in this order:
> 1. Along the path that's specified by each `/I` compiler option.
> 2. When compiling occurs on the command line, along the paths that are specified by the `INCLUDE` environment variable.
Point 2 of the “quoted form” is a variant of what you were discussing, from what I understand.
I'm not going to support URLs because I would like to keep a pure Fortran solution, but that's an interesting idea.
I must admit that I sidestepped the line-length problem simply by excluding support for fixed form. If you check the papers, this is also what the J3 group is suggesting. Supporting legacy fixed form is a whole new challenge that I am not ready to take on.