Contributors to a CLI library wanted

I’ve been interested in having a command line library that makes it easier to test the command-line handling of an application. I haven’t found a Fortran library that seems to allow this, because they do some or all of the following:

  • report their own errors directly to the user and stop the program themselves (so error conditions can’t be tested in a unit testing framework)
  • interact directly with get_command_argument (so multiple tests can’t be run at once from a single executable)
  • don’t provide a convenient way of defining/describing the desired command line options

The most compelling example of a library I’ve seen that manages to solve these problems is optparse-applicative for Haskell.

Basically, I’d like to have just the following appear in my main program

class(my_custom_args_type), allocatable :: args

args = parse_command_line()
...

where the internals are organized like the following

function parse_command_line() result(args)
    class(my_custom_args_type), allocatable :: args

    type(varying_string), allocatable :: arg_strings(:)
    type(error_t) :: errors
    type(varying_string) :: help_text

    arg_strings = get_command_line_strings()
    args = run_my_parser(my_command_line_parser(), arg_strings, errors, help_text)
    if (errors%has_any()) then
        call put_line(errors%to_string())
        call put_line(help_text)
        error stop
    end if
end function

pure function run_my_parser(parser, strings, errors, help_text) result(args)
    type(cmdff_parser_t), intent(in) :: parser
    type(varying_string), intent(in) :: strings(:)
    type(error_t), intent(out) :: errors
    type(varying_string), intent(out) :: help_text
    class(my_custom_args_type), allocatable :: args

    type(cmdff_parsed_args_t) :: parsed_args

    parsed_args = run_parser(parser, strings, errors, help_text)
    if (.not.errors%has_any()) then
        args = to_my_args(parsed_args, errors)
    end if
end function

pure function to_my_args(parsed_args, errors) result(args)
    type(cmdff_parsed_args_t), intent(in) :: parsed_args
    type(error_t), intent(out) :: errors
    class(my_custom_args_type), allocatable :: args

    ! code to extract data from parsed_args and populate args
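    ! for example (purely illustrative; these cmdff getter names are invented):
    !     allocate (args, source=my_custom_args_type( &
    !             greeting_target=get_string_value(parsed_args, "hello"), &
    !             quiet=get_flag_value(parsed_args, "quiet"), &
    !             enthusiasm=get_integer_value(parsed_args, "enthusiasm", errors)))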
end function

pure function my_command_line_parser() result(parser)
    type(cmdff_parser_t) :: parser

    parser = info( &
            sample() + helper(), &
            full_description() + program_description("Print a greeting for TARGET") + header("hello - a test for cmdff"))
end function

pure function sample()
    type(cmdff_parser_info_t) :: sample

    sample = &
            string_option(long("hello") + metavar("TARGET") + help("Target for greeting")) &
            + switch(long("quiet") + short("q") + help("Whether to be quiet")) &
            + option(INTEGER_PARSER, long("enthusiasm") + help("How enthusiastically to greet") &
                    + show_default() + value(1) + metavar("INT"))
end function

so that I can write tests like the following:

arg_strings = [var_str("some"), var_str("command"), var_str("I'd"), var_str("like"), var_str("to"), var_str("test")]
args = run_my_parser(my_command_line_parser(), arg_strings, errors, help_text)
! make some assertions about errors returned, help_text content, or args values
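
With the nonsense arguments above, the natural assertion is simply that parsing fails; a minimal sketch, using only the error_t interface already shown:

! hypothetical assertion: this nonsense command line should be rejected
if (.not. errors%has_any()) then
    error stop "expected parse errors for an invalid command line"
end if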

In this way, the rest of my code is completely divorced from any command line aspects, let alone any knowledge of the particular command line library I’m using. Even my tests are only minimally aware of that library.

I’m curious to know:

  1. Would there be sufficient interest in having such a library available?
  2. Would anyone like to collaborate on the project with me?

I’m not sure if I’d use such a library. I prefer to be in direct control of argument handling, although I do have some helper routines. For example (from one of my apps):

    parse_args: do i=1,nargs
        call get_command_argument(i,arg,alen)
        actual_arg => arg(1:alen)

        have_arg = get_arg_value(actual_arg,'debug_level=',ivalue=debug_level)
        if (have_arg) cycle parse_args

        have_arg = get_arg_value(actual_arg,'ndims=',ivalue=n_dims)
        if (have_arg) cycle parse_args

        have_arg = get_arg_value(actual_arg,'mu=',r8value=mu)
        if (have_arg) then
            if (mu > 0.5 .or. mu < 1.0d-9) then
                stop '*** mu out of range [1.0d-9,0.5]***'
            end if
            cycle parse_args
        end if
        ...

Of course this has the disadvantage of not providing a semi-automatic way of generating help, but it does give me complete control of argument order and inter-argument interactions. I think it is more likely I’d write a script (in Tcl) to parse a spec file and generate code like the above. Even that is fraught with difficulty, because to make it general the syntax would need to be well defined (bison?), which can be quite tricky once arguments are non-orthogonal.
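
For reference, the essence of such a helper is just a prefix match plus an internal read; a simplified sketch (not the routine itself):

    ! hypothetical sketch of a get_arg_value-style helper: returns .true. when
    ! arg starts with key and the text after it reads cleanly into whichever
    ! optional output was supplied
    function get_arg_value(arg, key, ivalue, r8value) result(found)
        character(*), intent(in) :: arg, key
        integer, intent(out), optional :: ivalue
        double precision, intent(out), optional :: r8value
        logical :: found

        integer :: ios

        found = .false.
        if (len(arg) <= len(key)) return
        if (arg(1:len(key)) /= key) return
        if (present(ivalue)) then
            read (arg(len(key)+1:), *, iostat=ios) ivalue
            found = ios == 0
        else if (present(r8value)) then
            read (arg(len(key)+1:), *, iostat=ios) r8value
            found = ios == 0
        end if
    end function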

I’m curious: what capability are you worried about losing? Handling positional arguments properly would of course be a requirement for the library.

Your example seems not to be concerned with argument order; it’s just doing key-value parsing manually. This would be handled by the library, and making sure all required options are present would be easier.

The other thing your example is doing is handling invalid (e.g. out-of-range) values that would otherwise be valid from the point of view of the parser. There may be some way to specify these to the library, but if not, in my example there is still a well-defined place this logic would go: the to_my_args function.

Have I missed something?

I admit I don’t fully understand your 2nd bullet point. But have you looked at FLAP?

I thought it looked reasonable, and it has an API reminiscent of Python’s argparse, which I do like. But, with regard to my 2nd bullet point, is there a way to provide FLAP with a different command line to parse than the one actually used to run the executable? I.e., for testing the command line behavior within a set of unit tests.

If I understand correctly, something like this?

myprogram --test --args-to-test="-a -b -c --and --more"

Then your CLI library parses the whole thing, extracts the string "-a -b -c --and --more", and passes it to the same CLI library to validate it.

(I don’t know if FLAP can do this).

No, like in a unit test:

! I don't want the following call to look at the actual command line
call my_command_line_routine([character(len=10) :: "my_command", "with", "--these", "arguments"], errors, my_command_line_object)
! now make some assertion about the errors returned, or the command line object returned

I’d like to have several of these in a suite of unit tests, all compiled into a single executable, so that I can actually “unit test” (i.e. in isolation) my command line logic.
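
For instance, the suite could be a single program with several isolated cases; a rough sketch, where the module name, the test names, and the my_command_line_t type are all invented:

program test_cli
    ! hypothetical module providing the routine and types from the snippet above
    use cli_handling, only: my_command_line_routine, error_t, my_command_line_t
    implicit none

    call test_rejects_unknown_flag()
    call test_accepts_valid_arguments()
    print *, "all command line tests passed"
contains
    subroutine test_rejects_unknown_flag()
        type(error_t) :: errors
        type(my_command_line_t) :: cli
        call my_command_line_routine([character(len=14) :: "--no-such-flag"], errors, cli)
        if (.not. errors%has_any()) error stop "expected an unknown-flag error"
    end subroutine

    subroutine test_accepts_valid_arguments()
        type(error_t) :: errors
        type(my_command_line_t) :: cli
        call my_command_line_routine([character(len=10) :: "my_command", "with", "--these", "arguments"], errors, cli)
        if (errors%has_any()) error stop "expected a clean parse"
    end subroutine
end program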


Ah, I see what you mean. No, I don’t think FLAP has that. It probably could be added.

Should I take a look at adding that functionality to FLAP? It could certainly be easier than writing a whole new library.


Replying to your response to my post

That is true of this snippet, but not in general. At some point you have to tell the program what the arguments are, what type they are, what (if any) the dependencies are, which arguments are optional, how I want them denoted (-x, --x, -x=, --x=), whether or not they’re case sensitive, and ideally a description so that help can be generated. There are also arguments that control how later arguments are processed. A library can seemingly do some of that, but whichever library is used, that information will have to be passed into it with some syntax, which could well make the code harder to read.
In your example I would define an overload of the assignment operator for var_str; requiring the application code to use it seems wrong, since it is part of the library’s implementation.

Exactly, you have to define all that information anyway, and so a well-designed library with an ergonomic API can make that information easier to identify and read quickly (or at least that’s the goal). Every example I’ve seen of doing it manually ends up as a large block of procedural code where all of the important information is obscured by the syntax and mechanics. The what gets drowned out by the how.

There actually is an overloaded assignment operator for varying_string, but that doesn’t help inside an array constructor. You’ll also notice the only place I used it explicitly was in the test. I’d expect many users wouldn’t ever notice it, because they’d just have

! define command_line
! the below extracts the command line arguments internally and 
! passes them like command_line%parse(arguments)
args = command_line%parse()
! extract data from the args data structure

I don’t know about varying_string, but if you write your own string type with an overloaded assignment operator declared as elemental, then the array construction works. I’ve just checked some code I wrote in 2013, which builds okay with ifort 19.1.0.166 although not with gfortran 9.1, so perhaps there is a compiler issue there.
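
For illustration, such a type might look like this (a fresh sketch, not the 2013 code; note the array constructor still needs a single length for its elements, which the elemental assignment then trims away):

module string_mod
    implicit none

    type :: string_t
        character(:), allocatable :: chars
    end type

    interface assignment(=)
        module procedure assign_from_char
    end interface
contains
    ! elemental, so assigning a character array to a string_t array
    ! applies this procedure elementwise
    elemental subroutine assign_from_char(lhs, rhs)
        type(string_t), intent(out) :: lhs
        character(*), intent(in) :: rhs
        lhs%chars = trim(rhs)
    end subroutine
end module

program demo
    use string_mod
    implicit none
    type(string_t) :: args(4)
    args = [character(len=7) :: "some", "command", "to", "test"]
end program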

I agree with your goals and accept that my current approach is clunky although my current solution (get_arg_value) is an improvement on what went before! I guess I’ll need to see a specific example when it’s finished to work out how exactly I would use it.

I find this interesting because I personally have been down this road and back up it.

I wrote command parsers designed for use by Fortran-based shells that were not even intended for command line parsing. They auto-built a TLI interface when called with a ?-mark character, and remembered the parameters used from the previous call, so things like a plot command with many arguments could be recalled by a plain “plot” command with no arguments, and they allowed for interactive prompting and so on.

I actually had multiple requests for exactly the opposite, and as a result went in the opposite direction and made a command-line parser that handled 90% of what people in my circle at the time wanted, often with just the use of two routines, each with one parameter. These programmers were happier handling issues like parameter interdependencies themselves, and preferred just providing a manpage-like help text block instead of defining usage text for each parameter. The same went for defining allowed ranges and values, which I really did need for the TLI (screen-mode) interface to make it responsive (so it could check values to some extent independently of the program).

So I understand the urge, and see it all over in things like the Python argparse interface, but my experience has been that many people want something they can use without having to read the documentation twice. Large production codes can justify the effort of calling an elaborate interface (and I have some really elaborate ones I am personally rather fond of that are non-public), but of the stuff I made that is public, most people seem to like the KISS approach. So, ironically, I took OUT all the functionality you are mentioning, made the code stop unconditionally on errors, and made it callable only as a command-line parser, and a good number of people seemed to prefer that. Following the Unix “standard” syntax, or something close to it (I personally hate having to quote multi-word strings and want to easily enter negative numbers), so a program could be used “like every other command” (just use find(1) on a Unix machine to find out how many exceptions to the “rules” there are), also seemed to satisfy a pretty good segment of programmers.

But on the other hand, things like the Python argparse module seem to be quite popular, even though I personally have to re-read the documentation every time I use it.

It might partly be OOP versus procedural and functional styles, but if you are putting a simple utility together, something as simple as two calls like

 call set_args(' --title "this is my default title" -x 20.4 -y 40 -i F') 
 call get_args('title',title,'x',x,'y',y,'i',i)

to define your options and default values, update them from the command line, and convert them to the appropriate type (or doing the same with a NAMELIST group, so you do not even have to deal with types) apparently also has an audience.
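
Those two calls look like the set_args/get_args interface from the M_CLI2 library (if I’ve identified it correctly); assuming that module, a complete toy program would be about this long:

program demo
    ! assumes the M_CLI2 library; the module name and the generics
    ! supported for these argument types are assumptions here
    use M_CLI2, only : set_args, get_args
    implicit none
    character(len=:), allocatable :: title
    real :: x, y
    logical :: i

    ! declare the options and their defaults, then update them from
    ! the actual command line
    call set_args(' --title "this is my default title" -x 20.4 -y 40 -i F')
    ! retrieve the values, converted to the types of the output arguments
    call get_args('title', title, 'x', x, 'y', y, 'i', i)
    print *, title, x, y, i
end program demo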