SmartSim from CrayLabs

TL;DR: SmartSim allows sharing data between applications in different languages (Fortran, C, C++, Python) through an in-memory cache based on Redis. It seems geared toward training ML models in Python and applying them on the fly in HPC simulations.

The cache layer is SmartRedis.

Here’s the Twitter thread about it from the developer: https://twitter.com/SamPartee/status/1379506098177146881

SmartRedis is especially interesting to me because it seems to provide a Fortran client to Redis, an in-memory database that I’ve enjoyed using over the past year or so.


Very interesting, thanks for sharing. I bet we will see even more ML + HPC examples in the future.

It would be great if we could wrap the TensorFlow C API. I’ve saved links to a few blog posts (BP1, BP2) about how it can be used to run a model trained in Python from an application in C. I believe the C API also serves as the basis for bindings for other languages.

As a proof of concept, I downloaded the precompiled Linux archive given in the Setup section, extracted it into /usr/local/, and ran sudo ldconfig to configure the linker.
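Roughly, the setup amounted to commands like these (the archive name corresponds to the 2.4.0 CPU build I used; adjust it for your platform):

sudo tar -C /usr/local -xzf libtensorflow-cpu-linux-x86_64-2.4.0.tar.gz
sudo ldconfig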

Then I created a small wrapper module:

module tf_api

  use, intrinsic :: iso_c_binding

  implicit none

  interface
    integer(c_size_t) function c_strlen(str) bind(C,name="strlen")
      import c_size_t, c_ptr
      type(c_ptr), intent(in), value :: str
    end function
  end interface

  interface
    type(c_ptr) function c_tf_version() bind(c,name="TF_Version")
      import c_ptr
    end function
  end interface

contains

  !> tf_version returns a string describing the version of the
  !> TensorFlow library. TensorFlow uses semantic versioning.
  function tf_version() result(res)
    character(len=:), allocatable :: res

    type(c_ptr) :: c_str
    integer(c_size_t) :: n

    ! TF_Version returns a pointer to a C string owned by the library.
    c_str = c_tf_version()
    n = c_strlen(c_str)

    ! Associate a Fortran character pointer of the right length with the
    ! C string, then copy its contents into the allocatable result.
    block
      character(len=n, kind=c_char), pointer :: f_str
      call c_f_pointer(c_str, f_str)
      res = f_str
    end block

  end function

end module

and replicated the original hello_tf.c example:

program hello_tf

  use tf_api
  implicit none

  print *, "Hello from TensorFlow C library version ", tf_version()

end program

Finally I compiled the program (assuming the main program is saved as hello_tf.f90) and ran it:

~/fortran/tensorflow$ gfortran tf_api.f90 hello_tf.f90 -ltensorflow -o hello_tf
~/fortran/tensorflow$ ./hello_tf
 Hello from TensorFlow C library version 2.4.0

Addendum: here are a few more links for anyone who would like to explore calling TensorFlow from Fortran further


There is the Fortran-Keras Bridge (FKB)

This library allows users to convert models built and trained in Keras into ones usable in Fortran. To make this possible, FKB implements a neural network library in Fortran, whose foundations are derived from Milan Curcic’s original work.

Keras is built on top of TensorFlow. Since Keras is higher-level, I think one would want to call it rather than TensorFlow when possible.

Don’t take my word for this, but I think there is an important difference: in the case of FKB, the weights and topology are loaded from a file, and the neural network kernel is a piece of Fortran code.
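To illustrate, the usage pattern looks roughly like this (a sketch based on the original neural-fortran API; FKB’s interface may differ in details, and the file name is just illustrative):

program fkb_sketch
  use mod_network, only: network_type
  implicit none

  type(network_type) :: net
  real, allocatable :: x(:), y(:)

  ! Topology and weights come from a text file exported from Keras;
  ! the forward pass below runs as ordinary Fortran code.
  call net % load('keras_model.txt')

  x = [0.1, 0.2, 0.3]
  y = net % output(x)

  print *, y
end program fkb_sketch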

In the case of calling the TensorFlow primitives directly, one actually launches a (GPU) kernel (either pre-compiled or JIT-compiled) produced by the TensorFlow XLA optimizing compiler for machine learning.

I haven’t used TensorFlow before, so I cannot vouch that this is entirely correct.

Thanks for sharing @milancurcic! You described it well. The important thing to know about SmartSim is that it’s a client-server paradigm. This is very unlike other solutions for ML in HPC, such as the Fortran-Keras Bridge (which largely builds off the work of @milancurcic and neural-fortran), which run inside the simulation itself.

FKB is undoubtedly a cool framework, and we looked into using it for our research. Compiling in C APIs also works (see the TensorFlowFoam project on GitHub for an example of the TF C API with OpenFOAM), but it can be a lot of work to build and maintain.

The reason we built SmartSim like this boils down to flexibility.

  • Diversity of supported ML frameworks (PyTorch, TF, TF-Lite, ONNX, etc.)
  • Diversity of supported languages (C, C++, Fortran, Python)
  • Choice of ML compute type (CPU/GPU)
  • Choice of ML compute location (co-located or distributed on the same network)
  • Pre/post-processing in Python/TorchScript instead of the application language
  • Ability to update model parameters, or the model itself, during the course of the computation (from all supported languages, especially Python)
  • Lightweight: minimal lines of code added to the simulation codebase

The client-server architecture also facilitates online analysis (“live” output of diagnostics for plotting, training, etc.) that bypasses the filesystem. The SmartRedis clients work with OSS Redis cluster, but don’t have the same scaling problem as many OSS Redis clients because we address database shards directly. The data is distributed evenly across the database cluster nodes, and the distribution is opaque to the user: all you need to know is the key under which you stored the data you want to look at or use (think of a large distributed Python dictionary, but fast and written in C). Similar to how it distributes storage requests evenly, it also distributes processing and inference requests evenly; the ML runtimes execute in the database. Essentially, you can upload trained models from a Jupyter notebook in Python (or from the workload itself), use them in the workload (C/C++/Fortran), update them if you want to, and monitor them the whole time.
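As a rough sketch of what that key-based workflow looks like from the Fortran side (simplified; method names follow the SmartRedis client, but check the API docs mentioned below for the exact signatures, and the model key here is just an example):

program smartredis_sketch
  use smartredis_client, only: client_type
  implicit none

  type(client_type) :: client
  integer :: rc
  real(kind=8), dimension(8,8) :: field, prediction

  field = 1.0d0

  rc = client%initialize(.false.)  ! pass .true. for a clustered database

  ! Store simulation data under a key, visible to every other client
  rc = client%put_tensor("field", field, shape(field))

  ! Run a model previously uploaded under the key "my_model" (e.g. from
  ! a Python notebook), reading "field" and writing "prediction"
  rc = client%run_model("my_model", ["field     "], ["prediction"])

  ! Fetch the result back into a local array
  rc = client%unpack_tensor("prediction", prediction, shape(prediction))
end program smartredis_sketch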

Performance is the natural next question: how does moving the ML/processing/analysis outside the workload scale? We are writing a paper about this and about a collaboration we did with NCAR on MOM6 (a Fortran-based ocean model), and it should be on arXiv very soon. If you can’t wait, we hosted the inference scaling test we ran for the paper, and if you have access to a fairly large allocation on a Slurm system with GPUs it will run with few changes. There’s a Fire CLI for changing the experiment parameters. If you’re interested in running the online inference scaling tests on a different system at scale, feel free to message me on the Slack we created (invite hosted on the GitHub) and I’ll help.

Also, @ivanpribec, you are right that we need to do more tutorials and examples. We have some coming out soon. Let me know if there is an example you’d like to see.

We have some Fortran examples here.

I would post the API link too, but since I am a new user I can only post two links. It should be fairly easy to find from the above link to the docs.

We’re also speaking at PyTorch Ecosystem Day (poster) and RedisConf (pre-recorded presentation) if you want to hear more.

It’s still young and we have a lot to do, but hopefully y’all find it useful! If you’d like to contribute, we would certainly appreciate the help and the Fortran expertise.


Welcome to the forum, @spartee!


Thanks @spartee for explaining SmartSim and for posting here. Welcome!
