Introspection in Fortran for generic file I/O libraries

What would it take to be able to write an I/O library (JSON, TOML, etc.) for Fortran that was as easy for a user to use as the built-in namelists (which, frankly, people shouldn’t be using)? Currently, users of the libraries for modern file formats have to write a lot of verbose code, since there is no way for a generic library to know anything about the variables in their custom types. An example:

type :: mytype
  integer, dimension(:), allocatable :: ints
  character(len=10) :: name
  real, dimension(3) :: x
end type mytype

Now, how do we write a variable of type(mytype) to a file? Well, we have to do stuff like this:

type(json_file) :: json
type(mytype) :: t
...
call json%add('t.ints', t%ints)
call json%add('t.name', t%name)
call json%add('t.x', t%x)
call json%print('file.txt')

So, it’s up to the user to type out all the variable names, etc. This is error-prone and very cumbersome for large nested data structures.

What if we could do this:

type(json_file) :: json 
type(mytype) :: t
call json%print(t, 'file.txt')

Here the dummy argument to print is class(*), and the author of the JSON library has facilities available in Fortran to do all the introspection necessary to resolve all the variables in the type. This would have to include variable names, kinds, sizes, ranks, etc. And we’d need the ability to traverse user-defined types.
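To make the gap concrete, here is a minimal sketch of what a library can already do with a class(*) dummy today. The module and function names (kind_probe, describe) are made up for illustration and are not part of any existing JSON library. Intrinsic types can be resolved with select type, but a derived type the library has never seen falls through to class default, where nothing about its components can be queried.

```fortran
module kind_probe
  implicit none
  private
  public :: describe
contains
  ! Hypothetical helper: report what can be learned about an
  ! unlimited polymorphic argument with standard Fortran.
  function describe(x) result(text)
    class(*), intent(in) :: x
    character(len=:), allocatable :: text
    select type (x)
    type is (integer)
      text = 'integer'
    type is (real)
      text = 'real'
    type is (character(len=*))
      text = 'character'
    class default
      ! A user-defined type lands here. Its component names,
      ! kinds and shapes are not reachable from library code.
      text = 'opaque derived type'
    end select
  end function describe
end module kind_probe
```

Passing a type(mytype) variable to describe returns 'opaque derived type', which is exactly the point where the requested introspection would have to take over.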

Wouldn’t this be a good addition to Fortran? I think it would be incredibly useful, would make life a lot easier, and encourage users to move to modern file formats. It seems like some or all of the machinery is already in the compilers, since that’s basically what it is doing with namelists. I just want to be able to use that machinery to read/write other file formats just as easily.


I think it would be great if there were something like an option on READ() and WRITE() when using NAMELIST groups to specify styles like JSON, TOML, CSV, … maybe

  read(unit=LUN,nml=GROUPNAME,style="JSON") 

But I like a lot of things about NAMELIST. I have not understood why, after being standardized for quite a while now, it is not more popular. It is useful for transferring data between programs, checkpointing state, interactive input, debugging, … I suspect it is a bear to implement in a compiled language, but since the machinery is there, it seems it would be easy, at least for output, to support other formats. I am surprised to see no compiler extensions that do so, but most of those formats are not truly standardized, a problem that haunted NAMELIST itself even though it was there in very early IBM Fortran. Some might like JSON, others TOML or YAML or HTML or CSV …

Alternatively, there are procedures for other languages (python, perl,
…) that can read NAMELIST files. You can use the python one to read
the NAMELIST and then it has features to write JSON, for example.

See Also

We use NAMELIST filters a lot. The programs write output as NAMELIST
files and write them into a spooling area. A cron job then picks up the
NAMELIST files and inserts the data into databases. For fixed namelist
groups it is very easy to write; and acts as a buffer so when the database
is overloaded, down for maintenance, down, … the files wait as in a
print spooler queue. Works great.

My biggest problem with NAMELIST is you cannot read a file with variables
defined in it that your program does not need/define.

Addendum

Fortran interfaces

NAMELIST can now write into an internal file. If it could write into an allocatable character variable (my memory is a little dusty on whether you must always have a separate header and terminator line or can use just one line; I think everyone but IBM allows just one line(?)) then you could write a parser procedure to convert to the other formats (some of the above links come close) for at least simple cases. So you could define a NAMELIST, write to an internal file, then convert the internal file and write it out. Less optimally, a Fortran SCRATCH file could be used as the buffer. I played with doing that with command line arguments, so it is feasible for the simpler cases and allows for using the current NAMELIST group syntax from the user’s perspective.
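A minimal sketch of the first half of that idea, writing a NAMELIST group into an internal file. The module, procedure and group names (nml_buffer, dump_state, state) are hypothetical, and a fixed-size character array stands in for the allocatable buffer wished for above. The exact layout of the records is processor-dependent, so a converter to JSON or TOML would have to parse the lines rather than rely on fixed columns.

```fortran
module nml_buffer
  implicit none
  private
  public :: dump_state
contains
  ! Write a small NAMELIST group into a character array.
  ! Each array element receives one output record.
  subroutine dump_state(n_in, x_in, lines, stat)
    integer, intent(in) :: n_in
    real, intent(in) :: x_in(3)
    character(len=*), intent(out) :: lines(:)
    integer, intent(out) :: stat
    ! Local copies, so the group holds ordinary local variables.
    integer :: n
    real :: x(3)
    namelist /state/ n, x

    n = n_in
    x = x_in
    lines = ''   ! records that are never written stay blank
    ! stat is nonzero if the array has too few or too short records
    write(lines, nml=state, iostat=stat)
  end subroutine dump_state
end module nml_buffer
```

The caller supplies an array such as character(len=128) :: lines(16) and checks stat. A second procedure could then walk the non-blank lines and emit another format.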


I think historically one of the biggest knocks against NAMELIST was the memory overhead that it generated. My now fuzzy memory seems to remember that the first DEC VAX compiler I used (around 1981) didn’t support NAMELIST natively, but you could buy an add-on package that provided support. I use namelists occasionally during code development since it’s an easy way to set up test programs etc. For my production-level codes I write my own parser. IMHO, two of the biggest roadblocks to a general I/O library are the lack of a real string class and no native support for dictionaries/hashmaps etc.

You could have used DTIO if the JSON table was defined as a Fortran unit and had a simple linear structure, but that’s not the case:

type(mytype) :: var
write(unit,'(DT"json")') var
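For a flat type, the DTIO route can be sketched as follows. The type and procedure names (point, write_point) are invented for illustration. The type binds a write(formatted) generic, and the procedure checks iotype, which arrives as "DT" followed by the character literal given in the edit descriptor.

```fortran
module point_dtio
  implicit none
  private
  public :: point

  type :: point
    integer :: id = 0
    real :: x = 0.0
  contains
    procedure :: write_point
    generic :: write(formatted) => write_point
  end type point

contains

  ! Defined-output procedure with the interface required by the standard.
  subroutine write_point(dtv, unit, iotype, v_list, iostat, iomsg)
    class(point), intent(in) :: dtv
    integer, intent(in) :: unit
    character(len=*), intent(in) :: iotype
    integer, intent(in) :: v_list(:)
    integer, intent(out) :: iostat
    character(len=*), intent(inout) :: iomsg

    if (iotype == 'DTjson') then
      ! JSON-style output, with every component name typed by hand
      write(unit, '(a,i0,a,es13.6,a)', iostat=iostat, iomsg=iomsg) &
        '{"id": ', dtv%id, ', "x": ', dtv%x, '}'
    else
      ! Fallback for list-directed, namelist, or other DT strings
      write(unit, '(i0,1x,es13.6)', iostat=iostat, iomsg=iomsg) dtv%id, dtv%x
    end if
  end subroutine write_point
end module point_dtio
```

With this in place, write(unit,'(DT"json")') p produces one JSON object per item. Note that the character literal after DT has to be quoted, and that the component names are still spelled out manually, so this does not remove the boilerplate, it only moves it into the type.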

With the current facilities, I think the closest you can get to a decent UX is by defining a parent abstract class:

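type, abstract :: jsonifiable
contains
   ! Deferred bindings: every extending type must supply its own versions.
   ! to_json and from_json are abstract interfaces defined elsewhere.
   procedure(to_json), deferred :: write
   procedure(from_json), deferred :: read
end type jsonifiable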

So that each parent derived type would call it for all children and you get the nested structure. However, Fortran has no multiple inheritance, so basically all derived types in the program would need to extend this “superclass”. No free lunch here.
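A minimal sketch of how that nesting could look, under assumptions of my own: the binding is called to_json and returns a string rather than writing to a unit, and the types vec2 and segment are invented for the example. The parent type builds its JSON by calling to_json on each child.

```fortran
module jsonifiable_sketch
  implicit none
  private
  public :: jsonifiable, vec2, segment

  type, abstract :: jsonifiable
  contains
    procedure(to_json_iface), deferred :: to_json
  end type jsonifiable

  abstract interface
    function to_json_iface(self) result(text)
      import :: jsonifiable
      class(jsonifiable), intent(in) :: self
      character(len=:), allocatable :: text
    end function to_json_iface
  end interface

  ! Leaf type: serialises its own components.
  type, extends(jsonifiable) :: vec2
    integer :: x = 0, y = 0
  contains
    procedure :: to_json => vec2_to_json
  end type vec2

  ! Parent type: delegates to its children.
  type, extends(jsonifiable) :: segment
    type(vec2) :: a, b
  contains
    procedure :: to_json => segment_to_json
  end type segment

contains

  function vec2_to_json(self) result(text)
    class(vec2), intent(in) :: self
    character(len=:), allocatable :: text
    character(len=64) :: buf
    write(buf, '(a,i0,a,i0,a)') '{"x": ', self%x, ', "y": ', self%y, '}'
    text = trim(buf)
  end function vec2_to_json

  function segment_to_json(self) result(text)
    class(segment), intent(in) :: self
    character(len=:), allocatable :: text
    text = '{"a": ' // self%a%to_json() // ', "b": ' // self%b%to_json() // '}'
  end function segment_to_json

end module jsonifiable_sketch
```

Every type still writes its own serialiser by hand, and every type has to extend jsonifiable, which is the cost described above.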


Or, try to emulate traits, as I desired to do recently, which has its downsides: Traits/interfaces in Fortran?

I opened an issue for this introspection proposal: “Introspection in Fortran for generic file I/O libraries (TOML, JSON, NPZ, etc.)”, Issue #331 in j3-fortran/fortran_proposals on GitHub.


Native hash map would really take Fortran from toy language to usable general purpose language when you’re trying to write simple programs that solve the typical coding challenge type problems.

Yes it’s possible to use someone else’s library, or even write your own module for hash maps, but this is a feature expected to have native language support in a “Modern” language (such as Modern Fortran).

It’s not native, but I’ve found that the stdlib hashmap implementation works very well. It seems to be memory efficient, scales up to 1e7 and more entries in my case, and is relatively fast. Hopefully the community can continue to improve and adopt it.


Chapel seems to have the “Reflection” module in the standard library. I’ve never tried it before (so not sure about details…), but it might be useful to create custom I/O libraries in Chapel.
https://chapel-lang.org/docs/modules/standard/Reflection.html

EDIT: I’ve read the page a bit more, and the module still seems to be missing a “setField()” counterpart (something like setfield! in Julia). I hope both “get” and “set” functionality will be provided.