TOML Fortran 0.3.0

I just released a new version of TOML Fortran, which adds a lot of new features for error reporting and for handling data structures. I am especially looking forward to making use of the error reporting in fpm soon.

This release comes with two new tutorials describing the usage of these new features:

  1. building a linter for fpm manifest files: Building a linter — TOML Fortran
  2. creating a custom lexer to parse JSON: Writing a custom lexer — TOML Fortran

Also, there are now a lot of problem-oriented recipes for addressing concrete use cases.

Finally, the documentation is partially localized in German and, thanks to @alozada, in Spanish as well. Open the Read The Docs navigation menu in the left sidebar to access the translated pages.

Feedback is welcome; let me know if you are using TOML Fortran or have suggestions for improvements.


Very interesting. Using your example I added the toml-f parser to my JSON Fortran benchmark code.

Time to parse a large JSON example file:

               json_fortran :   0.1154  seconds
                      rojff :   0.2359  seconds
                      tomlf :   0.4717  seconds
                       fson :   0.8649  seconds

Thanks for testing this. It looks like you kept the pruning step that removes the type annotations of the toml-test JSON format (that format only uses strings plus type annotations). Transforming the whole data structure to remove the type hints, even if none are present, is rather expensive.

Running the repository (c9d72db, GFortran 12, Intel i7-6500U CPU @ 2.50GHz) gives the following baseline for me:

      read file to a string :   0.0011  seconds
               json_fortran :   0.2133  seconds
                       fson :   1.5379  seconds
                      rojff :   0.2757  seconds
                      tomlf :   0.9108  seconds

Without the unneeded transformation the timings look somewhat different:

      read file to a string :   0.0011  seconds
               json_fortran :   0.2107  seconds
                       fson :   1.5152  seconds
                      rojff :   0.2744  seconds
                      tomlf :   0.2704  seconds

Profiling the modified test program with VTune shows that the JSON parsing currently spends about one third of the runtime on the deep copy of the data structure that removes the artificial top-level table.

I think this can be avoided by giving the toml_table type a way to drop a value and hand back its allocation, which can then be returned as a polymorphic toml_value rather than as a toml_table, which is more appropriate for the JSON format.
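To make the idea concrete, here is a minimal sketch of handing back an allocation instead of copying it, using the intrinsic `move_alloc`. The types `value_t`, `array_t` and `table_t` are hypothetical stand-ins invented for this illustration; they are not the toml-f types, and this is not necessarily how the library implements it.

```fortran
program detach_demo
   implicit none

   ! Hypothetical stand-ins for a polymorphic value hierarchy
   ! (assumption for illustration only, NOT the toml-f types).
   type, abstract :: value_t
   end type value_t

   type, extends(value_t) :: array_t
      integer, allocatable :: data(:)
   end type array_t

   type, extends(value_t) :: table_t
      class(value_t), allocatable :: child
   end type table_t

   integer, parameter :: n = 1000
   type(table_t) :: wrapper
   class(value_t), allocatable :: root
   integer :: i

   ! Build an artificial top-level table holding a single child value.
   allocate(array_t :: wrapper%child)
   select type (child => wrapper%child)
   type is (array_t)
      allocate(child%data(n))
      do i = 1, n
         child%data(i) = i
      end do
   end select

   ! Transfer the allocation out of the wrapper. No element is copied;
   ! afterwards wrapper%child is deallocated and root owns the data.
   call move_alloc(wrapper%child, root)

   if (allocated(wrapper%child)) error stop "child should be detached"

   select type (root)
   type is (array_t)
      if (size(root%data) /= n) error stop "unexpected size"
      if (root%data(n) /= n) error stop "unexpected content"
      print '(a,i0)', "detached array size: ", size(root%data)
   class default
      error stop "unexpected dynamic type"
   end select
end program detach_demo
```

The cost of `move_alloc` does not depend on the size of the detached value, whereas an assignment such as `root = wrapper%child` would copy the whole subtree.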

In any case, TOML Fortran is not a JSON parsing library; it just happens to need a JSON parser. Since the JSON grammar is much simpler than the TOML one, it makes for a shorter tutorial for teaching the idea of the lexer in the library.


Edit:

After a bit of tinkering (see pull request #108 in toml-f/toml-f, "Refactor storage structure for tables and arrays") I’m now actually able to barely outperform JSON Fortran on its own benchmark:

      read file to a string :   0.0012  seconds
               json_fortran :   0.2172  seconds
                       fson :   1.5760  seconds
                      rojff :   0.2866  seconds
                      tomlf :   0.2127  seconds

The deep-copy of a data structure representing 2MB of JSON is indeed quite demanding and shouldn’t be done if it can be avoided. Still, I find it amazing that it is actually possible to do a deep-copy with nothing but a simple assignment in Fortran (if the derived types contain only allocatables like in TOML Fortran).
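As a small illustration of that last point, the sketch below shows intrinsic assignment deep-copying a nested structure built only from allocatable components. The types `item_t` and `table_t` are made up for this example and are much simpler than the actual TOML Fortran data structures.

```fortran
program deep_copy_demo
   implicit none

   ! Hypothetical types for illustration only (NOT the toml-f types).
   type :: item_t
      character(len=:), allocatable :: key
      integer, allocatable :: values(:)
   end type item_t

   type :: table_t
      type(item_t), allocatable :: items(:)
   end type table_t

   type(table_t) :: original, copy

   allocate(original%items(2))
   original%items(1)%key = "first"
   original%items(1)%values = [1, 2, 3]
   original%items(2)%key = "second"
   original%items(2)%values = [4, 5]

   ! Intrinsic assignment allocates and copies every allocatable
   ! component recursively, so this one statement is a deep copy.
   copy = original

   ! Modify the copy only.
   copy%items(1)%values(1) = -1
   copy%items(2)%key = "changed"

   ! The original must be unaffected if the copy is truly independent.
   if (original%items(1)%values(1) /= 1) error stop "original values changed"
   if (original%items(2)%key /= "second") error stop "original key changed"
   if (copy%items(1)%values(1) /= -1) error stop "copy not modified"

   print '(a)', "copy is independent of original"
end program deep_copy_demo
```

The convenience comes at a price: the assignment touches every element of the structure, which is why it shows up in a profile once the data reaches a few megabytes.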
