A New JSON Library

Thanks @jacobwilliams! So rojff is unfortunately 16.75x slower than Python’s default parser. JSON-Fortran is 4.35x slower, which is better.

As a user, I just want a library in pure Fortran (fpm installable) that is comparable to Python’s default JSON library. So it doesn’t have to be as fast as the fastest C++ libraries. But it needs to be competitive at least with Python. I would say 50% slower at most, so 0.06 on the above benchmark would be ok I think.

Then we can say that with LFortran in a Jupyter notebook we have an equivalent experience. If we are 4x or 16x slower, then people will think “what is the point of using Fortran if you can’t even match Python, which we all know is slow?” (Yes, I know well that Python is actually quite fast with these libraries, but I also know that we are capable of matching the speed, one way or another.)

3 Likes

Yep, agree.

FYI: my benchmark code is now here: GitHub - jacobwilliams/json-fortran-benchmarks: Benchmarks for JSON Fortran parsers

3 Likes

Are there major differences between the libraries in terms of error handling? I could imagine that error handling generates huge overhead, because you have to check more and ideally you have to track the position while parsing (like the cursor_t type in rojff).

1 Like

JSON-Fortran does have pretty comprehensive error checking. If there’s a parsing error, the caller can retrieve what the error was, what line and character it occurred on, etc.

1 Like

Interesting! It would be nice if someone knowledgeable about both the Python libraries and JSON-Fortran could complete a thorough investigation and provide a summary of the root cause(s) of the nearly 8X slowness compared to Python ujson.

I would really like to be proven wrong, but my hypothesis is that there are 3 reasons for the slowness of such libraries developed in pure Fortran:

  1. The Fortran language standard itself needs significant improvements to enable a vital aspect of scientific and technical computing: pre- and post-processing of data, of which there are now massive amounts. The core number-crunching is important, but processing all the input and program data to get to the number-crunching stage and then, once crunched, processing the results again for all the stakeholders is paramount. The utility in question here, a JSON library for Fortran, is but one part of this. However, circa 2021-22, it’s rather difficult to build a performant library in Fortran compared to the alternatives. The language itself needs to offer a set of facilities to enable such library authoring; I’ve listed my suggestions here. Add the computer science concepts of move semantics and rule of 7 to the list, for this is relevant to how libraries such as JSON-Fortran and rojff tend to be architected.
  2. Fortran compilers need to really up the game on optimization though it’s a very difficult battle. The other paradigms, especially C++, Python, and Julia, attract the sharpest minds and have tons and tons of them to optimize and optimize their language processors. Fortran needs a lot of catching up here.
  3. Library authors themselves will need to put in tons more effort to further optimize their libraries and eke out every ounce of performance, if they are intent on remaining competitive with other alternatives. This may include perhaps replacing critical sections of their “pure Fortran” code with optimized C and/or assembler pieces; this may apply to Fortran stdlib as well.

Or I may be completely off, and maybe it is just one or two low-hanging fruits in the Fortran code for these JSON libraries that affect the performance, and once those fruits are grabbed and the code improved, the Fortran equivalent becomes similarly fast. As I wrote above, I wouldn’t mind at all if I am wrong on this, even as my experience thus far has informed me otherwise.

2 Likes

@jacobwilliams, please see this. I was just about to suggest the same to you regarding 64-bit integers (and 64-bit reals) in your timing measurements with system_clock when @tomohirodegawa made that post in the other thread.

It may not change the gist of your benchmarks you reported thus far all that much but it will be good to ensure your “instrumentation” for timing has no issues. On this, please note your current use of a default integer and real with system_clock will leave a nagging doubt with some folks, it will for me.
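A minimal sketch of the 64-bit timing suggested here, not the actual code in the benchmark repository. The loop is a placeholder standing in for the parse being timed, and the program and variable names are made up for illustration. With many compilers, passing 64-bit integers to system_clock gives a finer tick and a counter that wraps far less often than the default-integer version.

```fortran
program time_with_int64_clock
    use, intrinsic :: iso_fortran_env, only: int64, real64
    implicit none

    integer(int64) :: count_start, count_end, count_rate
    real(real64) :: elapsed, total
    integer :: i

    ! 64-bit arguments for both the tick count and the tick rate.
    call system_clock(count_start, count_rate)

    ! Placeholder workload (assumption: stands in for the JSON parse).
    total = 0.0_real64
    do i = 1, 5000000
        total = total + sqrt(real(i, real64))
    end do

    call system_clock(count_end)

    print '(a, es12.4)', 'workload result  : ', total
    if (count_rate > 0_int64) then
        ! Do the division in 64-bit reals as well.
        elapsed = real(count_end - count_start, real64) / real(count_rate, real64)
        print '(a, i0)', 'ticks per second : ', count_rate
        print '(a, f12.6, a)', 'elapsed          : ', elapsed, ' s'
    else
        print '(a)', 'no system clock available on this processor'
    end if
end program time_with_int64_clock
```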

1 Like

Done! (note: I need to clean up the whole thing…this really is just something I slapped together in 15 minutes). :slight_smile:

1 Like

:+1:

I found the gprof(1) output from a gfortran(1) build interesting:

Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total           
 time   seconds   seconds    calls  ms/call  ms/call  name    
 87.51      0.07     0.07  2473280     0.00     0.00  __json_value_module_MOD_json_value_reverse
 12.50      0.08     0.01   167178     0.00     0.00  __json_value_module_MOD_json_value_add_member
  0.00      0.08     0.00  2473280     0.00     0.00  __json_value_module_MOD_pop_char
  0.00      0.08     0.00   222252     0.00     0.00  __json_value_module_MOD_push_char
  0.00      0.08     0.00   167179     0.00     0.00  __json_value_module_MOD_parse_value
  0.00      0.08     0.00   167178     0.00     0.00  __json_value_module_MOD_json_info
  0.00      0.08     0.00   111126     0.00     0.00  __json_value_module_MOD_parse_number
  0.00      0.08     0.00   111080     0.00     0.00  __json_string_utilities_MOD_string_to_real
  0.00      0.08     0.00   111080     0.00     0.00  __json_value_module_MOD_string_to_dble
  0.00      0.08     0.00   111080     0.00     0.00  __json_value_module_MOD_to_real
  0.00      0.08     0.00    56045     0.00     0.00  __json_value_module_MOD_to_array
  0.00      0.08     0.00       46     0.00     0.00  __json_string_utilities_MOD_string_to_integer
  0.00      0.08     0.00       46     0.00     0.00  __json_value_module_MOD_string_to_int
  0.00      0.08     0.00       46     0.00     0.00  __json_value_module_MOD_to_integer
  0.00      0.08     0.00       12     0.00     0.00  __json_string_utilities_MOD_unescape_string
  0.00      0.08     0.00       12     0.00     0.00  __json_value_module_MOD_parse_string
  0.00      0.08     0.00        4     0.00     0.00  __json_value_module_MOD_to_object
  0.00      0.08     0.00        4     0.00     0.00  __json_value_module_MOD_to_string
  0.00      0.08     0.00        2     0.00    40.00  __json_value_module_MOD_parse_array
  0.00      0.08     0.00        2     0.00     0.00  __json_value_module_MOD_parse_object
  0.00      0.08     0.00        1     0.00     0.00  __json_file_module_MOD_json_file_failed
  0.00      0.08     0.00        1     0.00    80.01  __json_file_module_MOD_json_file_load
  0.00      0.08     0.00        1     0.00     0.00  __json_value_module_MOD_json_clear_exceptions
  0.00      0.08     0.00        1     0.00     0.00  __json_value_module_MOD_json_failed
  0.00      0.08     0.00        1     0.00     0.00  __json_value_module_MOD_json_initialize
  0.00      0.08     0.00        1     0.00     0.00  __json_value_module_MOD_json_parse_end
  0.00      0.08     0.00        1     0.00    80.01  __json_value_module_MOD_json_parse_file
  0.00      0.08     0.00        1     0.00     0.00  __json_value_module_MOD_json_prepare_parser

Wait, what code did you run that generated this? json_value_reverse shouldn’t be called at all for just parsing a file.

I was running various app codes for a quick view of where the time was spent, then got called off onto something else. I was also seeing a bug that seems to have crept into fpm and shows up with your code: with the latest version, which I just rebuilt, “fpm run” only prints “app app app app”. Now that I’m back, I see you probably wanted me to run something like

MYBUILD='--profile release --flag -p'
fpm build $MYBUILD
fpm run json_fortran_test $MYBUILD
gprof $(fpm run json_fortran_test $MYBUILD --runner) >gprof.out
(more||less) <gprof.out
exit

which I still think is more on target now that I have taken a bit of time to look at the new version. Not sure what platform you have or if you use gprof(1), which is a bit of an art as well as a bit of science but if not, give that a try. Will do that a bit more rigorously if you find the results useful.

I started an fpm plug-in a while ago and never finished it; I might use this code as the occasion to polish it off:

NAME
  fpm-time(1) - call fpm(1) with gprof(1) to generate a flat timing profile
SYNOPSIS
  fpm-time [subcommand] [--target] targets
DESCRIPTION
  Run the fpm(1) command with the gfortran(1) compiler and compiler flags
  required to build instrumented programs which will generate gprof(1)
  output files. Run the program and then run a basic gprof(1) command
  on each output.

  IMPORTANT: ONE target program should be selected if multiple targets exist.

  NOTE: 2021-03-21

     This is a prototype plug-in for fpm(1), which is currently in alpha
     release. It may require changes at any time as a result.

OPTIONS
   subcommand  fpm(1) subcommand used to run a program (test,run). If
               no options are specified the default is "test".
               The name "example" will be converted to "run --example"
               internally.
   --targets   which targets to run. The default is "*". ONE target should
               be tested
   --flag      ADDITIONAL flags to add to the compile
   --repeat,R  number of times to execute the program. Typically, this helps
               reduce the effects of I/O buffering and other factors that can
               skew results. Defaults to one execution.
   --help      display this help and exit
   --version   output version information and exit

EXAMPLE
   # in the parent directory of the fpm(1) project
   # (where "fpm.toml" resides).

    fpm-time
    fpm-time run demo1 demo2

SEE ALSO
    gprof(1), gcov(1)

I started that in March. Maybe time to finish it :blush:

If I finish it and your default test is in the test directory, you would just run

fpm time

and get a profile run of your test. I started the same for gcov(1) too, and I also want to extend it to valgrind(1) and other tools supplied with compilers.

2 Likes

Ah interesting. Yes, I can duplicate this. Thanks!

Something is definitely wrong in the Gprof results. The reverse routine isn’t called for parsing. When I just comment it out completely and rerun Gprof, then it says some other uncalled routine is at the top. So, it is getting confused somehow… Is it a bug?

I noticed that this canada.json file is mostly real numbers. It seems most of the time is spent converting the strings to reals. I haven’t checked jsonff, but in JSON-Fortran, I’m just using:

read(str,fmt=*,iostat=ierr) rval

I notice when I just replace this with

rval = 0.0_RK
ierr = 0

Then the parse time goes down to about 0.05 seconds. So clearly, there is room for improvement here. Is there a faster string to real parser out there for Fortran? Hmmm… maybe I’ll make a new post about this so as not to hijack this thread any more.
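For illustration, here is a rough sketch of what a hand-rolled converter could look like. It is not code from JSON-Fortran or any other library; the module and routine names are hypothetical. It accumulates digits directly instead of going through a list-directed read, accepts a slightly wider grammar than JSON numbers (for example a leading "+"), and makes no attempt at correct rounding, so the last few bits may differ from what read returns.

```fortran
module simple_real_parser
    use, intrinsic :: iso_fortran_env, only: real64
    implicit none
    private
    public :: simple_str_to_real

contains

    !> Convert text such as "-12.5e-3" to a real without using READ.
    !> Illustrative only: not correctly rounded, exponent overflow unchecked.
    subroutine simple_str_to_real(str, val, ierr)
        character(len=*), intent(in) :: str
        real(real64), intent(out) :: val
        integer, intent(out) :: ierr   ! 0 on success, 1 on malformed input

        real(real64) :: mantissa, expo
        integer :: i, n, n_int, n_frac, n_exp
        logical :: negative, exp_negative

        val = 0.0_real64
        ierr = 1
        n = len_trim(str)
        i = 1

        call take_sign(negative)
        mantissa = 0.0_real64
        call take_digits(mantissa, n_int)
        n_frac = 0
        if (at_any('.')) then
            i = i + 1
            call take_digits(mantissa, n_frac)
        end if
        if (n_int + n_frac == 0) return      ! no digits at all

        expo = 0.0_real64
        if (at_any('eE')) then
            i = i + 1
            call take_sign(exp_negative)
            call take_digits(expo, n_exp)
            if (n_exp == 0) return           ! "1e" with no exponent digits
            if (exp_negative) expo = -expo
        end if
        if (i <= n) return                   ! trailing characters

        ! Each fractional digit scaled the mantissa up by 10; undo that here.
        val = mantissa * 10.0_real64**(nint(expo) - n_frac)
        if (negative) val = -val
        ierr = 0

    contains

        ! True if the current character is one of the characters in chars.
        logical function at_any(chars)
            character(len=*), intent(in) :: chars
            at_any = .false.
            if (i <= n) at_any = index(chars, str(i:i)) > 0
        end function at_any

        ! Consume an optional leading sign.
        subroutine take_sign(is_negative)
            logical, intent(out) :: is_negative
            is_negative = .false.
            if (at_any('+-')) then
                is_negative = (str(i:i) == '-')
                i = i + 1
            end if
        end subroutine take_sign

        ! Consume a run of decimal digits, folding them into acc.
        subroutine take_digits(acc, count)
            real(real64), intent(inout) :: acc
            integer, intent(out) :: count
            integer :: digit
            count = 0
            do while (i <= n)
                digit = iachar(str(i:i)) - iachar('0')
                if (digit < 0 .or. digit > 9) exit
                acc = acc*10.0_real64 + real(digit, real64)
                count = count + 1
                i = i + 1
            end do
        end subroutine take_digits

    end subroutine simple_str_to_real

end module simple_real_parser

program demo_simple_real_parser
    use, intrinsic :: iso_fortran_env, only: real64
    use simple_real_parser, only: simple_str_to_real
    implicit none
    character(len=32) :: text
    real(real64) :: fast, reference
    integer :: ierr, ios

    text = '-65.613616999999977'
    call simple_str_to_real(text, fast, ierr)
    read(text, fmt=*, iostat=ios) reference   ! the approach quoted above
    print '(a, es24.16, a, i0)', 'hand-rolled   : ', fast, '  ierr   = ', ierr
    print '(a, es24.16, a, i0)', 'list-directed : ', reference, '  iostat = ', ios
end program demo_simple_real_parser
```

Whether something like this is actually faster than read would have to be measured on the benchmark files; the sketch only shows the shape of the idea.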

2 Likes

Awesome, yes, we might need to write our own string to real converter.

Another benchmark. For the 6.5 MB file big.json that isn’t just real numbers (e.g., it has a lot of string data):

Fortran:

rojff        : 1.5498  seconds
fson         : 0.9193  seconds
json_fortran : 0.2063  seconds

Python:

rapidjson    : 0.045112584 seconds
json         : 0.033147166 seconds
ujson        : 0.021337875 seconds

3 Likes

Thanks to all of you for taking a look and running some benchmarks. For some reason I thought I was a bit closer performance-wise. Guess we’ve got some work to do.

As for ideas about where the bottlenecks might be: from what I’ve heard, and to some extent experienced myself, the Fortran library code for reading/writing numeric data is… shall we say not the fastest. And since canada.json is mostly numeric data, I suspect that is taking a lot of the time.

Another thought: my file_cursor_t is reading the file one character at a time. Perhaps implementing some sort of buffering would yield some improvements?
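As a sketch of the simplest form of buffering, the program below reads an entire file into one character variable through unformatted stream access, after which a parser can walk the buffer in memory. This is an assumption about how it could be done, not rojff's or JSON-Fortran's actual code; the file name is hypothetical and the program writes its own small sample so that it is self-contained.

```fortran
program read_file_in_one_go
    use, intrinsic :: iso_fortran_env, only: int64
    implicit none

    character(len=*), parameter :: filename = 'stream_read_demo.json'  ! hypothetical
    character(len=*), parameter :: sample = '{"a": [1.5, 2.5], "b": {"c": "text"}}'
    character(len=:), allocatable :: buffer
    integer(int64) :: file_size
    integer :: unit, ios

    ! Write a small sample file (raw bytes, no record markers).
    open(newunit=unit, file=filename, access='stream', form='unformatted', &
         status='replace', action='write')
    write(unit) sample
    close(unit)

    ! Reopen it and pull the whole content in with a single read.
    open(newunit=unit, file=filename, access='stream', form='unformatted', &
         status='old', action='read', iostat=ios)
    if (ios /= 0) error stop 'could not open the file'

    inquire(unit=unit, size=file_size)
    if (file_size < 0) error stop 'file size could not be determined'

    allocate(character(len=file_size) :: buffer)
    read(unit, iostat=ios) buffer
    close(unit, status='delete')          ! remove the demo file again
    if (ios /= 0) error stop 'read failed'

    print '(a, i0, a)', 'read ', len(buffer), ' characters in one request'
    print '(a)', buffer
end program read_file_in_one_go
```

For files too large to hold in memory, the same read could be issued repeatedly into a fixed-length buffer instead.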

As for the overhead due to error handling, I’d think the branch prediction on modern processors ought to alleviate a lot of that. I’d be curious to know if anybody would know of any way to confirm or deny that though.

I’m happy to take contributions if anybody would be interested. I’d be interested to hear thoughts on the API as well.

That’s very interesting that the builtin json parser in Python can beat rapidjson. My experience has been that rapidjson is one of the fastest.

I think we should experiment with writing a JSON parser that assumes valid JSON and just parses it as quickly as it can. I wouldn’t even worry about representing it at first (nor error handling), just parse it, and perhaps just count how many {} pairs there are. And see if we can get competitive. Then we can add error handling and representing it in Fortran.
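A minimal sketch of that experiment, under the stated assumption that the input is valid JSON: one pass over an in-memory buffer that counts braces and only tracks enough state to ignore braces inside string literals. The module and routine names are hypothetical.

```fortran
module brace_counter
    implicit none
    private
    public :: count_braces

    ! achar(92) avoids writing a literal backslash, which some compilers
    ! treat as an escape character inside character constants.
    character(len=1), parameter :: backslash = achar(92)

contains

    !> Count '{' and '}' that occur outside string literals.
    !> Assumes valid JSON; there is deliberately no error handling.
    pure subroutine count_braces(text, n_open, n_close)
        character(len=*), intent(in) :: text
        integer, intent(out) :: n_open, n_close
        integer :: i
        logical :: in_string

        n_open = 0
        n_close = 0
        in_string = .false.
        i = 1
        do while (i <= len(text))
            select case (text(i:i))
            case (backslash)
                if (in_string) i = i + 1   ! skip the escaped character
            case ('"')
                in_string = .not. in_string
            case ('{')
                if (.not. in_string) n_open = n_open + 1
            case ('}')
                if (.not. in_string) n_close = n_close + 1
            end select
            i = i + 1
        end do
    end subroutine count_braces

end module brace_counter

program demo_brace_counter
    use brace_counter, only: count_braces
    implicit none
    character(len=*), parameter :: text = '{"a": {"b": "}{"}, "c": [1, 2]}'
    integer :: n_open, n_close

    call count_braces(text, n_open, n_close)
    ! The "}{" inside the string value is not counted.
    print '(a, i0, a, i0)', 'open = ', n_open, ', close = ', n_close
end program demo_brace_counter
```

Timing this loop over a buffer holding canada.json or big.json would give a lower bound on what a full parser can achieve before number conversion and tree building are added.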

Yep, take a look at JSON-Fortran. There is some stuff in there to make the file read go faster (e.g., using STREAM, and also reading it in chunks rather than one character at a time). But, I have to ask: why do you not just use JSON-Fortran? :slight_smile:

1 Like

I didn’t look real hard at it, but my first impressions were that the API didn’t seem that friendly, and the documentation/tutorial wasn’t that illuminating. I tried reading through the source code a bit, because I was curious how you implemented the parser, but had a hard time finding my way around. I never did find where the actual logic for the parser started. So I’ll admit that to some extent my library was born out of Not Invented Here, but I was more interested in the usability aspect than performance, at least initially.

And I will say that rojff is fast enough to be usable, if not necessarily the fastest.

Why not provide Fortran “bindings” for UltraJSON aka ujson and “call it a day”?!

After all, “UltraJSON is an ultra fast JSON encoder and decoder written in pure C”. That it has Python bindings is beside the point.
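The binding mechanism itself is small. UltraJSON's C entry points are not shown in this thread, so as a stand-in the sketch below binds strtod from the standard C library, which also happens to target the string-to-real step discussed above. The module and wrapper names are hypothetical; a real ujson binding would need interfaces for its own functions and data structures.

```fortran
module c_strtod_binding
    use, intrinsic :: iso_c_binding, only: c_char, c_double, c_ptr, c_null_char
    implicit none
    private
    public :: str_to_real_via_c

    interface
        ! C prototype: double strtod(const char *nptr, char **endptr);
        function c_strtod(nptr, endptr) bind(C, name='strtod') result(val)
            import :: c_char, c_double, c_ptr
            character(kind=c_char), intent(in) :: nptr(*)
            type(c_ptr), intent(out) :: endptr
            real(c_double) :: val
        end function c_strtod
    end interface

contains

    !> Thin Fortran wrapper: appends the terminating null that C expects.
    !> No error reporting here; endptr is received but not inspected.
    function str_to_real_via_c(str) result(val)
        character(len=*), intent(in) :: str
        real(c_double) :: val
        type(c_ptr) :: endptr

        val = c_strtod(trim(str)//c_null_char, endptr)
    end function str_to_real_via_c

end module c_strtod_binding

program demo_c_strtod_binding
    use, intrinsic :: iso_c_binding, only: c_double
    use c_strtod_binding, only: str_to_real_via_c
    implicit none
    real(c_double) :: val

    val = str_to_real_via_c('1.5e3')
    print '(a, f10.2)', 'strtod result: ', val
end program demo_c_strtod_binding
```

Note that strtod honors the C locale's decimal separator, and that relying on C code gives up the "pure Fortran, fpm installable" property asked for at the top of the thread.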

1 Like