Schema support for IO libraries

This was originally raised in a thread on Writing a linter in Fortran:

Personally, I consider TOML Fortran a low-level library, so providing schema validation would be out of scope for the core project. However, a separate project providing schema support for TOML Fortran and maybe other IO libraries, like JSON-Fortran by @jacobwilliams or rojff by @everythingfunctional, seems like a much more appealing choice.

One crucial point might be defining the storage format of the schema in the first place. All IO libraries currently define and implement their own data structures as well as their access methods. Having a standardized format that works with most IO libraries would be the first step.
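To make the idea of a library-independent storage format a bit more concrete, here is a rough sketch in Python (chosen only for brevity). It assumes the schema is kept as plain nested data, so that it can be written out as JSON or TOML and read back by whichever IO library is in use. The rule keys `type`, `required`, and `properties`, and the manifest entries themselves, are hypothetical choices for illustration and not an existing format.

```python
import json

# Hypothetical schema for a small package manifest, written as plain data.
# The rule keys ("type", "required", "properties") are illustrative only.
SCHEMA = {
    "type": "table",
    "properties": {
        "name": {"type": "string", "required": True},
        "version": {"type": "string", "required": False},
        "build": {
            "type": "table",
            "required": False,
            "properties": {
                "auto-executables": {"type": "boolean", "required": False},
            },
        },
    },
}


def dump_schema(schema):
    """Serialize the schema to JSON text so another tool or library can load it."""
    return json.dumps(schema, indent=2, sort_keys=True)


def load_schema(text):
    """Read the schema back into nested dictionaries."""
    return json.loads(text)
```

Because the schema only uses tables, strings, and booleans, the same content could be stored in a TOML document just as well; nothing in it depends on the data structures of one particular library.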

I was hoping to create a bridge to stdlib at some point which allows loading a TOML document directly into a stdlib hashmap, now available thanks to @wclodius's amazing work. Still missing for this to become possible are list-like data structures in stdlib.

I had a bit of a think about this, and for me the creation of a schema (JSON, TOML, YAML, etc.) makes more sense when it is applied to something specific: fpm.toml, GitHub Actions workflows, package.json, etc. I believe a schema purpose-built for a tool like fpm would be more useful to the end user. Having said that, the option to validate any TOML file is also very appealing.

I also agree that this is probably outside the scope of TOML Fortran; maybe another repo under the same organisation would be a better idea. @awvwgk Do you think the schema generation (not validation) should be done in Fortran or in something like TypeScript?

My original comment came from me misunderstanding what the focus of the post was.

For an actual tool using any IO library, the schema validation will be an integral part of reading the input. If you have validated the input against the schema, you can safely read into the data structures without worrying about any error handling, because the data is always valid.

Making the schema and the input reader match is crucial for this to work. A library which can read or generate a schema should also be able to apply the validation in order to be a valuable addition for users of the IO library. As a user of an IO library, I want to efficiently catch errors made by the users of the program I built with it, and report them in an easy-to-understand way.
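A minimal sketch of the "validate first, then read without error handling" idea, again in Python for brevity. It assumes the document has already been parsed into nested dictionaries and that the schema maps each key to a rule with hypothetical `type`, `required`, and `properties` entries. The validator collects every problem together with the path to the offending entry instead of stopping at the first one.

```python
# Hypothetical mapping from schema type names to Python types.
TYPE_MAP = {"string": str, "integer": int, "boolean": bool, "table": dict}


def validate(doc, schema, path=""):
    """Return a list of human-readable error messages (empty if doc is valid)."""
    errors = []
    for key, rule in schema.items():
        where = f"{path}.{key}" if path else key
        if key not in doc:
            if rule.get("required", False):
                errors.append(f"{where}: missing required entry")
            continue
        value = doc[key]
        expected = TYPE_MAP[rule["type"]]
        # bool is a subclass of int in Python, so reject it explicitly for integers.
        wrong_type = not isinstance(value, expected) or (
            expected is int and isinstance(value, bool)
        )
        if wrong_type:
            errors.append(
                f"{where}: expected {rule['type']}, got {type(value).__name__}"
            )
            continue
        if rule["type"] == "table":
            # Recurse into nested tables, extending the reported path.
            errors.extend(validate(value, rule.get("properties", {}), where))
    # Entries that the schema does not know about are reported as well.
    for key in doc:
        if key not in schema:
            where = f"{path}.{key}" if path else key
            errors.append(f"{where}: unknown entry")
    return errors
```

If `validate` returns an empty list, the reader that fills the actual data structures can assume that every required entry exists and has the expected type. Otherwise all messages can be shown to the user at once.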

It would be cool to compile the schema from the derived types used to store the input data, but this can be hard because not all checks (especially for input ranges) are easy to encode in a derived type declaration.
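To illustrate the difficulty, here is a sketch using a Python dataclass as a stand-in for a derived type. The component names and types can be read off the declaration, but the allowed ranges have to be attached as extra metadata, which a plain type declaration has no place for. The type `SolverInput`, its components, and the `min`/`max` keys are all made up for this example.

```python
from dataclasses import dataclass, field, fields


@dataclass
class SolverInput:
    """Hypothetical input type; the range limits live in the field metadata."""
    method: str = "gmres"
    max_iter: int = field(default=100, metadata={"min": 1, "max": 10000})
    tolerance: float = field(default=1e-6, metadata={"min": 0.0})


def schema_from_type(cls):
    """Build a schema dictionary from the declared fields of a dataclass."""
    schema = {}
    for f in fields(cls):
        # The declared type gives the "type" entry of the rule ...
        rule = {"type": f.type.__name__}
        # ... while range limits only exist because they were added as metadata.
        rule.update(f.metadata)
        schema[f.name] = rule
    return schema
```

Without the metadata, the generated schema could only say that `max_iter` is an integer, not that it has to be positive, so the range checks would need to be supplied from somewhere else.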

Generating the schema for external tools would be a bonus which such a project could provide.

I’ve had musings about writing a schema definition/validator for rojff. I haven’t had the time to fully pursue it yet.