Specifying command line arguments via the PROGRAM statement

A proposal for the PROGRAM statement to take arguments that allow for an automatic interface to the OS command line (and perhaps optionally a GUI interface as well):

Suppose the PROGRAM statement took arguments like a procedure, with an option to specify a formatting style, like so:

program(style=getopts_long) :: myprogram(a,b,c,title)
real :: a=30.4
character(len=80) :: title=''
integer :: b=4, c=1

This would allow you to call the program like

myprogram -a 10 --title 'this is my title'

The styles ‘getopts_long’, ‘getopts’, ‘namelist’, and ‘posix_shell’ would be predefined, but compilers could supply others, or custom filters could be specified. Would that be useful? Requiring an MS Windows-like /keyword=value format as well might also be desirable.

So if the style posix_shell were specified you would call the same program with

./myprogram  a=10  title='my title'

Perhaps using kind= instead of style= is more “Fortranic”.

Many of the intrinsics are trivial to write, yet they are still provided. Why have everyone write a parser when it is an extremely common task, and converting arguments to the correct types is tedious and repetitive? A standard interface provides standard behavior, which makes program usage easier, and so on. There are many obvious reasons to want an interface to the command line beyond a trivial list of strings, which is all that is currently provided. It is at least thirty years overdue. Fortran cannot hide behind “I just crunch numbers” to the extent of not providing simple interfaces needed by virtually all programs. I have seen people completely ignore Fortran, even though it would have been a very good solution for their primarily computational programs, as soon as they cannot find a command line parser, date and time procedures, regular expressions, basic plotting utilities, … all things many proprietary Fortran compilers supplied thirty years ago.

I think this is best left for libraries to implement. There are already a few available, even your own it seems :wink: By having compilers implement this we end up with n different implementations (for a rather large n), each with its own bugs or inconsistencies and with a slow update cycle.

Command line parsing is also a complex enough task, with needs that vary from application to application, that trying to make the “one” solution in the standard would be difficult. Take Python for example: the Python standard library has a module for this, but there’s still a plethora of libraries that implement it with slightly different approaches.

I think the standard should rather focus on specifying language constructs that make it possible for the community to build robust and easy to use libraries. Then the “best practice” way of doing a specific task can emerge from iteration and experimentation instead of upfront specification.

Easy enough to define a syntax or pick one regarding the long syntax. Not sure what the first question is asking.

The vast majority of programs’ needs would be covered by such an option. If you want a custom solution you can still write your own, as is done now. Most programs use intrinsics like SUM(). Since the standard does not require SUM() to condition the input, it can return very inaccurate values, so some programmers have to write their own; but that does not mean SUM() should not be an intrinsic, as one simple example.

Using the proposed syntax, subcommands would not be supported, as all values would need to be associated with a keyword. If that is considered important, and to be more “Fortranic”, I would propose that, just like a procedure call, you can start out with positional association until the keyword style you have selected is used, so a subcommand would just be a string associated with the first parameter. Just as Fortran I/O allows for variations on allowed separators and padding options for list-directed and NAMELIST I/O, it could allow for a colon separator; but to help encourage standardization I think multiple numeric values should be required to follow NAMELIST rules, so A:B would not be supported, but A,B or ,B would be, as a first thought.
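To make the A,B and ,B forms concrete, here is a minimal sketch of how existing NAMELIST rules already treat them when the text is read from a character string. The module name nml_null_values, the function apply_pair, the group name args, the defaults and the -1 error convention are all made up for illustration; none of them come from the proposal.

```fortran
module nml_null_values
  implicit none
contains

  ! Apply NAMELIST-style text such as "pair=,7" on top of the defaults [1, 2].
  ! On a read error both elements are set to -1 (a convention for this
  ! sketch only).
  function apply_pair(text) result(res)
    character(len=*), intent(in) :: text
    integer :: res(2)
    integer :: pair(2), ios
    character(len=:), allocatable :: buffer
    namelist /args/ pair

    pair = [1, 2]                                ! hypothetical defaults
    buffer = '&args ' // trim(text) // ' /'      ! wrap as a NAMELIST group
    read (buffer, nml=args, iostat=ios)          ! internal-file NAMELIST read
    if (ios /= 0) pair = -1
    res = pair
  end function apply_pair

end module nml_null_values
```

With this, apply_pair('pair=3,4') sets both elements, while apply_pair('pair=,7') should leave the first element at its default, since a leading comma is a null value under NAMELIST rules.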

The topic of “reinventing the wheel” comes up very often with Fortran primarily because there has not been a large standard library. I see command line parsers written over and over again; and indeed, have several public (and non-public) versions of my own. As the number of OSes has reduced and time has passed it seemed like going back to the early Fortran days when the PROGRAM directive took options might make sense.

Early de-facto Fortran almost always allowed for options on the PROGRAM directive for pre-assigning files to unit numbers. I was just looking at a number of programs in the old COSMIC collection that still contain PROGRAM options and thought it might be better used for CLI parameters.

I don’t think this is due to a small standard library, but rather because handling of external dependencies has been difficult. External dependencies are easy now due to efforts in fpm and CMake.

To compare with programming languages of a later generation than Fortran:

  • Rust has a very small standard library and instead relies on external dependencies developed by the community. As far as I understand, this is a very conscious decision by the language developers. There’s no command line parsing in the standard library, but it’s provided by third party packages in the ecosystem.
  • Golang probably has a somewhat larger standard library than Rust. It provides some more functionality than just the bare essentials for the language. Its standard library does include a basic flag package for command line parsing, although third party packages are commonly used for richer interfaces.
  • The Julia standard library seems similar to Golang in size, in that it’s slightly more than the bare minimum. Command line parsing is handled by third party packages.
  • Python has a really large standard library that includes command line parsing. There are also lots of third party packages that implement command line parsing. If the goal of a standard library is to provide the de facto “standard” way of doing something, one could argue that the Python standard library has failed in that regard.

As a side note, I would really welcome a robust and user friendly Fortran library for command line parsing. I think that would be highly beneficial for many Fortran users. I tried FLAP a while back and found the interface to be really nice, but I ran into some issues (don’t remember exactly what though). Also it doesn’t seem to be actively maintained any more. I haven’t tried M_CLI yet, but that could be an alternative.

We’ve had issues open on command line parsers in both fpm and stdlib:

Here’s a list of what’s available:

You could also try interfacing with one of the C or C++ libraries:

I suppose the suggestion of the OP could be partially fulfilled with some kind of preprocessor or code generator. Given the description

program(style=getopts_long) :: myprogram(a,b,c,title)
real :: a=30.4
character(len=80) :: title=''
integer :: b=4, c=1

it would generate a subroutine used as the main entry point or a derived type storing the options. At this point we are probably reaching something very close to the docopt command-line interface description language, but tailored to Fortran users. I often wish such meta-programming tools would “live” in the browser, so I could easily analyze their output. If the tool caught on it could also be offered as an fpm sub-package.

The best way to stop the cycle of re-inventing command line parsers is probably to put more effort into advertising the available libraries via Discourse, Twitter, blog-posts, fortran-lang tutorials, etc. One way to get a library which stands out from the rest is simply to provide better documentation and distribution mechanisms.

Thanks for the list, which I augmented and placed in the Command Line Parsing section of my list of Fortran Tools:

Command Line Parsing

argv-fortran: better get_command_argument for Fortran that returns the argument in an allocatable character string, by Jacob Williams

cmdff: makes nicer command lines for Fortran codes, by Brad Richardson

command_args: automatically handles the command-line arguments that are passed to the program, by Arjen Markus

f90getopt: getopt()- and getopt_long()-like functionality (similar to the C-functions) for Fortran 90, by Hani Andreas Ibrahim, based on code by Mark Gates

fArgParse: command line argument parsing for Fortran, part of the Goddard Fortran Ecosystem

FLAP: library designed to simplify the (repetitive) construction of complicated CLI in Fortran 2003, by Stefano Zaghi. FLAP has been inspired by the python module argparse and tries to mimic it.

FTN_Getopt: supplies a method for handling command arguments in a manner similar to the getopt facility in C, by Reinhold Bader

M_CLI: cracks the command line when given a NAMELIST and a prototype string that looks very much like an invocation of the program, by urbanjost and Laurence Kedward. Using the NAMELIST group has the benefit that there is no requirement to convert the strings to their required types or to duplicate the type declarations.

M_CLI2: cracks the command line when given a prototype string that looks very much like an invocation of the program, by urbanjost et al. A call to get_args(3f) or one of its variants is then made for each parameter name to set the variables appropriately in the program.

M_kracken95: Fortran 95 version of the kracken(3f) procedure (and related routines) for command line parsing, by urbanjost

optionsf90: module for defining and parsing command-line options and input parameters for Fortran programs, by Christopher N. Gilbreth. Its design is inspired by Python’s optparse module.

paramcard: command-line parameter input made simple, by Takahiro Ueda

The interface I miss the most in the non-proprietary world is based on concepts that I think originated with the NOS CCL and TDU facilities on CDC machines a long time ago, and that were much more powerful than Unix shell capabilities.

Essentially you type a description of each parameter, including its type and range if numeric, or, if it is a string and that is preferred, a list of allowed values.

Then you provide a block of help divided by lines like “.help,PARAM_NAME”, where there is a period in front of help.

You can then call the program positionally, as in “myprogram 10,"title",3.4”, or by keyword but still very much like a Fortran procedure, as in “myprogram title='title',a=10”, in command line mode. By adding a question mark you go into interactive prompting, where the help for the parameter is shown; or add two ? and you go into TLI mode, where an automatic help panel is built in a screen panel, and you can click or hit the help button to get that block of help text for each parameter.

So the user can define a command in a few lines but optionally include a description of each parameter. With just that input the user is provided an interface that can be used in command line mode, but also in interactive mode. The interactive mode can be a terminal line interface or a screen panel. Would take a while to describe in detail but that old CDC NOS style has been carried forward in some environments to modern platforms because it was far more user friendly and programmer friendly than anything I have seen emerge since (it is programmer friendly in the sense that you could specify things like range and value type and the interface would validate that without you adding code to do so). Linux has still not caught up to that, which dates back to at least the 1980’s.

I see some of the list already has fpm interfaces, which is encouraging.

Sounds interesting. While googling earlier I found this nice write-up on command line interfaces,

I would be curious to learn how closely their philosophy matches your experience with the CDC machines.

Like with C++ and python there are standard parsers and predefined simple rules that apply to 90% of usage, and then anyone is free to write a parser if they want with features like subcommands, exclusive options and so on. I would say using SYNTAX=getopt would just follow a specific version of getopts, including one specifically defined for the purpose of the standard if need be. If syntax=namelist it would follow the rules for NAMELIST syntax. Allowing just NAMELIST syntax, which is already part of the Fortran standard would be better than the current state, and make it trivial to enter small arrays, numeric values using engineering syntax, and so on. I do exactly that on some programs now, by reading all the arguments and wrapping them in a NAMELIST group.

As far as what I personally like to do to handle subcommands, lists and ranges of values, exclusive arguments and such, see the M_CLI2 module referenced above, which allows for all that plus user-defined abbreviations for complex parameter combinations and so on. But I think that goes too far for what I am proposing; I am picturing something very intentionally much simpler and trivially easy to use, as described above (just add the variable names to the program directive and, maybe, insist they have default values).

Regarding some languages having a standard but other parsers still being created …
It might be that you could say python failed because there are other parsers available but it is even more likely the simple standard they provide is used a large percentage of the time, and like anything else there are special cases or variants that others write additional parsers for. It would be nice to see usage counts of the standard interface versus the custom libraries but I don’t know of a trivial way to garner them.

I have one I use a lot that, much like the old CDC NOS CCL format, allows a line mode program to automatically invoke a screen mode. Unfortunately it is in a proprietary environment, but I think that gives me a good feel for how useful that can be compared to the current GNU/Linux CLI state of affairs.

A simple mockup giving a glimpse of how that old NOS CCL might look today with GNU/Linux is shown below for the cp(1) command. The idea is you would use cp(1) exactly like you do now, but if you specified a special character (NOS used a ?) an interactive mode is invoked. If your screen type is not “dumb” or if you did not ask for terminal mode a screen mode comes up much like curses/Pcurses/ncurses programs use; I would like to see that available on Linux as a standard feature all languages could use but I am not proposing that for the Fortran header syntax.

A calculator is not useless if it cannot solve a PDE problem; likewise, a parser that only allows getopt_long-like parsing using one additional line in the code is not a bad idea just because it does not automatically generate a CANVAS or X11 interface or allow for subcommands. That logic would lead to saying all intrinsics in Fortran are useless because they do not support arbitrary precision or units or do not return error ranges on answers.

Regarding the article on CLI interfaces …

I threw together an interface using fixedform(1) from the GPF site (which has not been worked on in a while – caveat emptor) to simulate what a CDC NOS screen would look like for an automatically generated interface to a script, “unix-izing” it a bit.

I think this is something reminiscent of the old auto-generated NOS CCL script interfaces. A screen interface with context help was auto-generated for any script nearly 40 years ago!

The article about Unix CLI scripts I almost entirely agree with, but the state of things described years ago was the state of Unix; other OSes like NOS, NOS/VE, VMS, Aegis and IBM systems had far more elaborate scripting interface tools, editors far friendlier than vi(1) (but which I think vim has now surpassed), and many other good things that went away with near-standardization on Unix-like interfaces. I could permit a file to a specific user and say I wanted to give him append access only on Tues. from 5 to 7 PM on those old machines. ACLs are still not common in a lot of environments today. You could ask for last access and access counts to be kept on a file so you could tell who loaded a library file and how often, and when they last accessed it. That was really nice. We threw a few babies out with the bathwater by going away from proprietary OSes, even though that was a worthy goal. I can remember having to remember a dozen commands just to list files in a directory (CATLIST, DIR, …) and different commands for the same thing on CDC, PRIME, VAX, IBM, HP, AEGIS, … what a nightmare that was. I do miss the system-language extensions Fortran had, though. VAX/VMS had Fortran long before C, and the languages were extended to give you full access to the OS, which I remember as being better than most of the C/C++ interfaces available today, but they locked you into one OS. Fortran “owned” graphics, and the first relational database and regular expression parser I ever used or even heard about was in Fortran (RIM0; I have heard it still exists in a few places, but I have not seen it in years).

If such features had gone into the standard and not been fought against because of the work required or because it gave a vendor advantage or reduced vendor lock-in there would have been no need for dozens of the languages that came afterward.

To define it more formally I would suggest that placing names in the PROGRAM statement defines a special NAMELIST group called ARGS; for which I/O rules are already defined. The input for this group, instead of being read from a file would be read from the command line as if preceded by the line “&ARGS” and followed by the line “/”.

So the default behavior would be to assume namelist input syntax. Optionally, filters can be provided to provide support for different syntax. A getopts-like and getopts_long-like filter would be predefined. For the default supported syntax parameter values would support NAMELIST-style value specifications; so an array would be specified using a comma-separated list, for example. Only intrinsic types would be allowed, unlike NAMELIST. The specific rules for any filter could vary, but for the predefined filters the rules would provide a getopts-like interface as much as possible, with compromises regarding case and a few other details possibly modeled on solutions such as used by the M_CLI module. This would not prevent the command line from being accessed by the current intrinsics so anyone can make any custom parser they like.

So the default behavior is basically already defined in the standard and would allow usage such as “myprogram a=10 b=20” with a one-line change to the program, without the user having to deal with reading strings from the command line and converting them to types. The filters would allow for customizations more like OS “standards” (although those are weakly defined in common systems like GNU/Linux, Unix, and MS Windows).

What I want is to hear other people’s thoughts; that is the purpose of the post. The easiest thing to implement I can think of is the namelist model, which is already defined in the Fortran standard. I would prefer an option to map that to other syntax such as getopt_long syntax, TUI/GUI interfaces, and so on. But you could not get any simpler than having the first line of a program list all the command line parameters, in a fashion very reminiscent of, if not identical to, a SUBROUTINE statement, and have a self-describing keyword-value interface defined. I have had the NAMELIST syntax idea in mind for value specification since before posting, but am curious about the interest level and what options others desire. If I am primarily working on one program I might not care if there even is a CLI interface, as it might very well be using a file for input already. But for those that create hundreds of small programs using Fortran, as others create perl or bash or lua or python scripts, having a one-line interface definition is probably more valuable. The NAMELIST syntax is very similar to the “bash -k” or “key=value key2=value2 command” syntax supported by the POSIX shell standard, although books rarely emphasize that, so more recent users of the sh/ksh/bash shells often seem unaware of it and write or call getopt-like parsers.

So that is less “what you really want” and more like what would be trivial to implement but still widely useful. So the only major issue in implementing that is handling shell processing of special characters such as quotes and asterisks, and whether the options for handling NAMELIST options available via an OPEN statement would be available in some manner. You would want something like

mycommand   mytitle="this is a multi-word title"  point=10,20,30  

to be supported, but you would have to quote valid NAMELIST syntax like “point=3*0.0” with most OS/shell environments available today, which often use * as a glob character; although that would already be possible using multiple quotes or escape characters in most shells, such as

mycommand  'point=3*0.0d0,title="this is my title" '

Having to use the multiple quoting seems awkward, and resolvable. If NAMELIST is the underlying syntax, then there could be an automatic method of reading input from a config file as well, which could be as simple as an argument list not starting with a keyword, or starting with a = or perhaps a /. So I could specify my values on the command line, but perhaps with a call like

mycommand ./configfile

it could be read from a file, where perhaps a magic string at the top of the file would allow for using JSON/TOML/INI/… files as well; and so on. So there are lots of possibilities but I am curious how many want it.
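As a rough sketch of the "command line or config file" idea above, the following chooses the input source from the argument text itself. The rule used here (no = sign anywhere means "treat it as a file name") is just one possible assumption, and the module name args_source, the routine load_args, and the variables a and title are hypothetical. Only plain NAMELIST files are handled; the magic-string idea for JSON/TOML/INI is not attempted.

```fortran
module args_source
  implicit none
  ! Hypothetical program parameters with their defaults
  real :: a = 30.4
  character(len=80) :: title = ''
  namelist /args/ a, title
contains

  ! text is everything after the program name,
  ! e.g. 'a=10 title="x"' or './configfile'.
  subroutine load_args(text, ios)
    character(len=*), intent(in) :: text
    integer, intent(out) :: ios
    character(len=:), allocatable :: buffer
    integer :: lun

    ios = 0
    if (len_trim(text) == 0) return        ! nothing given: keep the defaults
    if (index(text, '=') == 0) then
      ! Assumption of this sketch: no keyword=value present, so the text
      ! names a file that holds an &args ... / group.
      open (newunit=lun, file=trim(text), status='old', action='read', iostat=ios)
      if (ios /= 0) return
      read (lun, nml=args, iostat=ios)
      close (lun)
    else
      ! Otherwise wrap the text and read it as NAMELIST input directly.
      buffer = '&args ' // trim(text) // ' /'
      read (buffer, nml=args, iostat=ios)
    end if
  end subroutine load_args

end module args_source
```

A nonzero ios leaves it to the caller to decide whether to print usage or stop; values not mentioned in either source keep their defaults.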

PS:

If I am running in a bash shell, and I have a little shell called “tst” in my path that contains

#!/bin/bash
A=${A:-10}
B=${TITLE:-'X-axis'}
echo "B is $B and A is $A"

and I enter

what happens? If you are an old-time Bourne shell user you probably know that is a totally standard way to call tst(1).

Sometimes reinventing the wheel can be as simple as including an external library, and limiting the number of libraries that need to be installed on every cluster one uses for computations (with old compilers, MPI libraries…) can itself save a lot of work in the future.

This

      mycommand   mytitle="this is a multi-word title"  point=10,20,30

can be as simple as

     implicit none
     
     ! Defaults, used when a name is not given on the command line
     real :: point(3) = 0.0
     character(256) :: mytitle = ''
     character(1024) :: command_line
     
     call read_command_line
     
     call parse_command_line
     
     print *, point
     print *, "'",trim(mytitle),"'"
     
contains

     ! Get the raw command line, drop the executable name, and wrap the
     ! remainder as NAMELIST input: "&cmd ... /"
     subroutine read_command_line
       integer :: exenamelength
       integer :: io, io2
       
       command_line = ""
       call get_command(command = command_line,status = io)
       if (io==0) then
         call get_command_argument(0,length = exenamelength,status = io2)
         if (io2==0) then
           command_line = "&cmd "//adjustl(trim(command_line(exenamelength+1:)))//" /"
         else
           command_line = "&cmd "//adjustl(trim(command_line))//" /"
         end if
       else
         write(*,*) io,"Error getting command line."
       end if
     end subroutine
     
     ! Read the wrapped command line as the NAMELIST group cmd
     subroutine parse_command_line
       character(256) :: msg
       namelist /cmd/ mytitle, point
       integer :: io

       if (len_trim(command_line)>0) then
         msg = ''
         read(command_line,nml = cmd,iostat = io,iomsg = msg)
         if (io/=0) then
           error stop "Error parsing the command line: " // trim(msg)
         end if
       end if
     end subroutine
end

with the only exception that one needs to escape the quotes on the command line

./mycommand   mytitle=\"this is a multi-word title\"  point=10,20,30

or the multiple quotes you show

./mycommand   "mytitle='this is a multi-word title'  point=10,20,30"

This is what I use and will not change for a third party library (fpm or not) without a very good reason.

Great example! Thanks for sharing.

Upon an error you could deploy the pretty-diagnostics tool by @awvwgk (a discussion is available in the thread Exploring first class error messages for Fortran). I wonder if there is a preprocessing trick, which would help share the namelist arguments between the parser and the error reporting.

I think it’s exactly what @urbanjost is trying to suggest as an enhancement, in order to avoid writing/copying the boiler-plate read/parse subroutines.

Yes, as a minimum I was inquiring about adding arguments to the PROGRAM() directive that define a special namelist that is read from the command line, but allowing in particular for strings to be detectable as such as the compiler could easily know the type of the parameter.
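A compiler would know which parameters are character variables; a library can only be told. As a minimal sketch of what "strings detectable as such" could buy, here is a helper that adds the quotes NAMELIST needs when the keyword is in a caller-supplied list of string parameters. The names quote_filter, quote_if_string and string_keys are hypothetical, the keyword match is case-sensitive (unlike NAMELIST itself), and values that themselves contain double quotes are not handled.

```fortran
module quote_filter
  implicit none
contains

  ! Wrap the value of "key=value" in double quotes when key is listed in
  ! string_keys, so that  title=my title  becomes  title="my title".
  ! Tokens without a keyword, with an unlisted keyword, or with a value
  ! that already starts with a quote are returned unchanged.
  function quote_if_string(token, string_keys) result(fixed)
    character(len=*), intent(in) :: token
    character(len=*), intent(in) :: string_keys(:)
    character(len=:), allocatable :: fixed
    integer :: eq

    fixed = trim(token)
    eq = index(fixed, '=')
    if (eq < 2) return                                 ! no keyword present
    if (.not. any(string_keys == fixed(:eq-1))) return ! not a string parameter
    if (eq < len(fixed)) then
      if (scan(fixed(eq+1:eq+1), '"''') /= 0) return   ! already quoted
    end if
    fixed = fixed(:eq) // '"' // fixed(eq+1:) // '"'
  end function quote_if_string

end module quote_filter
```

Applied to each argument returned by get_command_argument before the pieces are joined into the "&cmd ... /" string, this would let a user type mytitle="this is a multi-word title" in the shell without the extra layer of escaping.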

I have used exactly the method you describe, dating back to before this:

and that is dated 2009, as well as on the Fortran Wiki, this forum, and a couple of github sites, so no argument there :slight_smile:

But the command line and NAMELIST are basically used to define a simple table, typically consisting of at least

NAME VALUE TYPE DEFAULT

often supplemented with a DESCRIPTION, CONSTRAINTS ON VALUES, and so on.

So whether you say “cmd 10,20,"title"” (CDC NOS), “A=10 B=20 TITLE="title" cmd” (bash), “cmd -a 10 -b 20 --title 'title'” (getopts_long) or “cmd /A=10 /B=20 /TITLE='title'”, you are in most common cases just defining that table. I think it would be great and reasonable to implement that using existing NAMELIST rules as you describe, but I think it is plausible to define filters to convert those other formats too, and am wondering what the general interest is. I use a non-public preprocessor to do that now, using that syntax except that you use $PROGRAM instead of PROGRAM, so I like the preprocessor approach, but as a second choice, and one very much dependent on a standard preprocessor being defined; I would prefer it being part of the Fortran specification.
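To give a feel for what such a filter might involve, here is a minimal sketch that rewrites getopts_long-style tokens as NAMELIST text, which can then be read with an ordinary NAMELIST read. It is not the preprocessor mentioned above and not M_CLI; the names getopt_filter and long_opts_to_namelist and the group name args are invented for illustration. The rules are assumptions too: a value is quoted unless it can be read as a number, a keyword with no value becomes T, and tokens that are not attached to a keyword are skipped.

```fortran
module getopt_filter
  implicit none
  private
  public :: long_opts_to_namelist
contains

  ! Convert tokens such as  -a 10 --title "my title"  into the text
  !   &args a=10 title="my title" /
  function long_opts_to_namelist(tokens) result(text)
    character(len=*), intent(in) :: tokens(:)
    character(len=:), allocatable :: text, key, val
    logical :: has_value
    integer :: i

    text = '&args'
    i = 1
    do while (i <= size(tokens))
      if (.not. is_keyword(tokens(i))) then
        i = i + 1                        ! sketch: stray values are skipped
        cycle
      end if
      key = strip_dashes(trim(tokens(i)))
      has_value = .false.
      if (i < size(tokens)) has_value = .not. is_keyword(tokens(i+1))
      if (has_value) then
        val = trim(tokens(i+1))
        if (.not. looks_numeric(val)) val = '"' // val // '"'
        i = i + 2
      else
        val = 'T'                        ! bare switch: treat as logical true
        i = i + 1
      end if
      text = text // ' ' // key // '=' // val
    end do
    text = text // ' /'
  end function long_opts_to_namelist

  ! A keyword starts with "-" or "--" followed by something that is not
  ! a digit or a period, so "-5" and "-.5" are treated as values.
  pure logical function is_keyword(tok)
    character(len=*), intent(in) :: tok
    is_keyword = .false.
    if (len_trim(tok) < 2) return
    if (tok(1:1) /= '-') return
    if (verify(trim(tok), '-') == 0) return      ! only dashes, e.g. "--"
    is_keyword = verify(tok(2:2), '0123456789.') /= 0
  end function is_keyword

  ! Remove the leading dashes from a keyword token.
  pure function strip_dashes(tok) result(name)
    character(len=*), intent(in) :: tok
    character(len=:), allocatable :: name
    name = tok(verify(tok, '-'):)
  end function strip_dashes

  ! Heuristic: the value is numeric if a list-directed read into a real works.
  logical function looks_numeric(val)
    character(len=*), intent(in) :: val
    real :: x
    integer :: ios
    read (val, *, iostat=ios) x
    looks_numeric = ios == 0
  end function looks_numeric

end module getopt_filter
```

The numeric test is only a heuristic, since the filter does not know the declared types; a value such as T meant for a logical would be quoted and then rejected by the NAMELIST read. That is exactly the kind of thing a compiler-supported PROGRAM statement could get right, because it does know the types.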

I generate hundreds of frequently small programs so having a simple accessible way of providing command line options (even using TUI and GUI interfaces) is particularly appealing to me, but I know many cases where someone is working on a single program for whom that may not be nearly so much a glaring hole in the Fortran language (and many other languages as well). I think it would be nice for Fortran to provide such a simple but useful interface as part of the language.

I guess the above feature is very similar to config variables in the Chapel language, which allows the user to change the value of config variable or constant (defined anywhere in the program) directly via a command-line argument. IIRC, it also allows module variables to be defined like ./foo.x --mymod.num=300, or from a file (-f<filename>), which also seems similar to the namelist idea.
https://chapel-lang.org/docs/users-guide/base/configs.html
