Code Generation Using Sympy

I would like to share a short Python snippet I’ve been using very often to generate Fortran code through symbolic simplification and common sub-expression elimination:

# Copyright (c) 2020 Ivan Pribec. All rights reserved.
#
# This work is licensed under the terms of the MIT license.  
# For a copy, see <https://opensource.org/licenses/MIT>.

import sympy as sp
from sympy.printing.fcode import print_fcode

def apply_cse_and_print_fcode(expr,assign_to,source_format='free',standard=95):
    """Apply common subexpression simplification and print Fortran code
    
    Input:
    expr - a list of expressions we would like to evaluate
    assign_to - the symbols we would like to assign them to (either a tuple of strings or sp.symbols)
    
    Output:
        The expressions will be printed to the standard output.
    """
    
    rvar, rexpr = sp.cse(sp.simplify(expr))
    
    print("! Reduced variables")
    for var, var_expr in rvar:
        print_fcode(var_expr,assign_to=var,source_format=source_format,standard=standard)
    
    print("! Reduced expressions")
    for var, var_expr in zip(assign_to,rexpr):
        print_fcode(var_expr,assign_to=var,source_format=source_format,standard=standard)

As an example I will be computing the water activity of a wet material, a_w, which is a function of moisture content X and temperature T, according to the Oswin model equation:

a_w(X,T)=\frac{\left(\frac{X}{a}\right)^{1/b}}{1+\left(\frac{X}{a}\right)^{1/b}},\quad a=a_1+a_2 T,\quad b=b_1+b_2 T

where the parameters (a_1,a_2,b_1,b_2) are empiricial fitting coefficients. In some applications we might also be interested in the derivative of a_w with respect to the variables X and T. Using SymPy we can easily find these using the following snippet (you can run this in a Jupyter notebook):

# Create symbols
X, T, a1, a2, b1, b2 = sp.symbols('X T a1 a2 b1 b2')
# Define symbolic expression
a = a1 + a2*T
b = b1 + b2*T
aw = (X/a)**(1/b)/(1 + (X/a)**(1/b)) 
# Find symbolic derivatives
daw_dX = sp.diff(aw,X)
daw_dT = sp.diff(aw,T)

To generate Fortran code for the expression and derivatives all I need to do now is run the following code:

output = (aw,daw_dX,daw_dT)
output_var = ('aw','daw_dX','daw_dT')
apply_cse_and_print_fcode(output,output_var)

which will print the reduced variables and expressions to standard output:

! Reduced variables
x0 = T*a2 + a1
x1 = 1d0/x0
x2 = X*x1
x3 = T*b2 + b1
x4 = 1d0/x3
x5 = x2**x4
x6 = x5 + 1
x7 = x5/x6**2
! Reduced expressions
aw = x5/x6
daw_dX = x4*x7/X
daw_dT = -x1*x7*(a2*x3 + b2*x0*log(x2))/x3**2

I can now wrap this output into a subroutine block, specifying the precise interface and applying any further changes (i.e. replace the d0 format specifier to my desired precision):

subroutine oswin_model(X,T,a1,a2,b1,b2,aw,daw_dX,daw_dT)
  real(wp), intent(in) :: X, T
  real(wp), intent(in) :: a1, a2, b1, b2,
  real(wp), intent(out) :: aw, daw_dX, daw_dT

  real(wp) :: x0,x1,x2,x3,x4,x5,x6,x7

  ! Reduced variables
  x0 = T*a2 + a1
  x1 = 1.0_wp/x0
  x2 = X*x1
  x3 = T*b2 + b1
  x4 = 1.0_wp/x3
  x5 = x2**x4
  x6 = x5 + 1
  x7 = x5/x6**2
  ! Reduced expressions
  aw = x5/x6
  daw_dX = x4*x7/X
  daw_dT = -x1*x7*(a2*x3 + b2*x0*log(x2))/x3**2
end subroutine

As a second example, say you want to evaluate the Jacobian of a system of ordinary differential equations. I will use the Lorenz attractor as a trivial example:

\frac{dx}{dt} = \sigma(y-x) \\ \frac{dy}{dt} = x (\rho - z) - y \\ \frac{dz}{dt} = xy - \beta z

First we define the symbolic system of differential equations:

x,y,z = sp.symbols('x y z')
s,r,b = sp.symbols('sigma rho beta')

dx_dt = s*(y-x)
dy_dt = x*(r - z) - y
dz_dt = x*y - b*z

system = sp.Matrix([dx_dt,dy_dt,dz_dt])

The SymPy Matrix object has a built-in Jacobian function, that can be called as system.jacobian((x,y,z)) to produce the output:

\left[\begin{matrix}- \sigma & \sigma & 0\\ \rho - z & -1 & - x\\ y & x & - \beta\end{matrix}\right]

To output the Fortran code we again use the function defined above,

output = (system,system.jacobian((x,y,z)))
output_vars = ('F','J')
apply_cse_and_print_fcode(output,output_vars)

which generates the output:

! Reduced variables
x0 = -x
x1 = rho - z
! Reduced expressions
F(1, 1) = sigma*(x0 + y)
F(2, 1) = x*x1 - y
F(3, 1) = -beta*z + x*y
J(1, 1) = -sigma
J(2, 1) = x1
J(3, 1) = y
J(1, 2) = sigma
J(2, 2) = -1
J(3, 2) = x
J(1, 3) = 0
J(2, 3) = x0
J(3, 3) = -beta

We can see in this case, there is not much need to introduce temporary variables. Nevertheless, the symbolic route is much easier than writing the Jacobian by hand. In principle, it is possible to fine-tune the format of the output arrays by passing additional arguments to the Fortran printer in SymPy. Personally, for short functions, I prefer to do it manually using a good text editor.

If you are interested in learning more about code generation using Sympy, I suggest reading the tutorial “Automatic Code Generation with SymPy”. For those who have a MATLAB license, a similar tool is available in the Symbolic Math Toolbox.

Hopefully, someone will find this useful.

15 Likes

Thanks for sharing your code, this can be very useful.
You can use also Mathematica to simplify and export some mathematical expression to fortran.

Another, comment about your applications, if you don’t need the analytical expressions (you just need to be able to compute some values), you can use some automatic differentiation libraries (see http://www.autodiff.org). Some libraries are in fortran:

@ivanpribec thanks for sharing that. I have several additional thoughts:

  • I think it could be sometimes helpful to keep the original expression in Fortran, then extract it into SymPy via LFortran, do some manipulation of it in SymPy, and generate Fortran code out of it. We have a prototype for that in LFortran.

  • For autodifferentiation, one route is the previous bullet point or your post. Another route is directly via a compiler (https://gitlab.com/lfortran/lfortran/-/issues/97).

  • There are initial Fortran wrappers for SymEngine (a reimplementation of SymPy in C++), I started them here: https://github.com/symengine/symengine.f90. That would allow to do such symbolic manipulations in Fortran itself. It might open many possibilities. Essentially, just like Matlab can do symbolics in Matlab itself (see the link you posted above), we should be able to do this in Fortran. If anybody is interested in brainstorming this, designing this, or helping out, please let me know.

3 Likes

I’ve looked at SymEngine a few times but lack the time to get my hands dirty. Probably the low-level C interface could be lifted back up in Fortan to an OO-style interface so users could just write their symbolic expressions normally using operator overloading. As a next step you could use the delayed evaluation technique described by @Arjen Markus in “Modern Fortran in Practice” (page 31). It was also shown in his presentation at FortranCon 2020 - “Experimental Programming in Fortran”.

With LFortran run dynamically from a Jupyter Notebook, I guess it would be interesting to have equivalents of sympify and lambdify to convert between textual representation, symbolic expressions, and Fortran code.

For static compilation however, with existing tools, I don’t really see how I could get a symbolic block replaced with Fortran. I guess it is possible to write a fypp macro to read a Fortran expression into a symbolic representation, perform symbolic differentation or simplification, and then get replaced with Fortran code. But I am not sure what is the advantage compared to doing it separately in a Jupyter notebook (apart from having the code in a single file).

2 Likes

I’ve looked at SymEngine a few times but lack the time to get my hands dirty. Probably the low-level C interface could be lifted back up in Fortan to an OO-style interface so users could just write their symbolic expressions normally using operator overloading. As a next step you could use the delayed evaluation technique described by @Arjen Markus in “Modern Fortran in Practice” (page 31). It was also shown in his presentation at FortranCon 2020 - “Experimental Programming in Fortran”.

With LFortran run dynamically from a Jupyter Notebook, I guess it would be interesting to have equivalents of sympify and lambdify to convert between textual representation, symbolic expressions, and Fortran code.

Great ideas!

For static compilation however, with existing tools, I don’t really see how I could get a symbolic block replaced with Fortran. I guess it is possible to write a fypp macro to read a Fortran expression into a symbolic representation, perform symbolic differentation or simplification, and then get replaced with Fortran code. But I am not sure what is the advantage compared to doing it separately in a Jupyter notebook (apart from having the code in a single file).

This has to be implemented, but you would call it exactly as you do it now (whether from a Jupyter notebook, or a standalone .py file), except that you will use Fortran instead of Python. So you either write a .f90 program (a “script”) that gets compiled (with any Fortran compiler) and executed; or you call it from a Jupyter notebook via LFortran. This “script” will then call SymEngine to do whatever manipulation you want and generate Fortran code.

The overall idea is that in Matlab you don’t need to switch to another language (Python or Mathematica) to do symbolics, you simply use Matlab for everything: numerics, plotting, symbolics and code generation. In the same way, we could use Fortran down the road for all these tasks also. Until we do it, I can’t of course guarantee that it will be easy to use and natural, but if Matlab can do it, then I think Fortran (as a language) can do it too, as easily. (If Matlab couldn’t do it, then that would be different.)

In MATLAB, one can convert a symbolic expression to a numeric function handle for immediate use as follows:

syms x y
f(x,y) = x^3 + y^3;
ht = matlabFunction(f)

It is also possible to put this into a script which generates a separate function file, i.e.

syms x
f = x^2 + log(x^2);
matlabFunction(f,'File','myfile');

produces the output:

function f = myfile(x)
%MYFILE
%    F = MYFILE(X)

%    This function was generated by the Symbolic Math Toolbox version 8.4.
%    01-Sep-2019 00:00:00

t2 = x.^2;
f = t2+log(t2);

Note the creation of a temporary variable. I copied both examples from the MATLAB documentation: https://de.mathworks.com/help/symbolic/matlabfunction.html

Carrying this concept over to Fortran would mean having a driver script along the lines of:

use symengine

type(sym_t) :: x, y
type(expr_t) :: f

call f%set([x,y], x**2 + y**2)
call fortran_function(f,'File','myfile')
end

In the most simple setting this would produce the file myfile.f90 containing something like

elemental function myfile(x,y)
  real(sp) :: myfile
  real(sp), intent(in) :: x, y
  myfile = x*x + y*y
end function

One of the challenges would be how to extend the interface to arbitrary precision and simultaneously provide a common interface block. And second, how to make such kind of symbolic meta-programming tools available inside of a module.

1 Like

@ivanpribec perfect, that is exactly what I had in mind. I opened https://github.com/symengine/symengine.f90/issues/4 for some of the details.

@ivanpribec, @certik, et al.

I’m inclined to make a close connection with this challenge mentioned above @ivanpribec with the other big challenge with Fortran 202Y which is whether a sufficiently full-featured GENERIC programming facility can be incorporated in it. Any investigation by you all with code generation along the lines of this thread and the findings can really help become good use case(s) for generics in Fortran 202Y.

In the meantime, integrating some preprocessing into the mix, say FYPP, might be an option for the generated code to work with any of the supported floating-point kinds in a modern Fortran processor?

3 Likes

Some users might be more interested in the most numerically stable transformation of an algebraic expression.

This is a good place to put a link to Herbie.

2 Likes

Indeed, another idea is to integrate something like Herbie with a Fortran compiler:

Minor correction in @ivanpribec 's post of 28 Sep 2020: in your TeX formatted Oswin model equation, only X is raised to the power (1/b), whereas in your code (X/a) is raised to the power (1/b). In other words, you need to add parentheses around (X/a) in two places, or introduce a new variable for X/a.

2 Likes

Thanks @mecej4 for catching this. Indeed, the TeX formatted model equation should have parentheses around (X/a). I am not able to edit the post any more, I’m not sure if the moderators (ping @lkedward) are able to do it?

1 Like

I think it is fixed now.

1 Like

Thanks for sharing this! Do you know if there is a way to have Sympy produce expressions like x**(n-1) instead of x**n/x, avoiding a NaN for x=0?

I am currently experimenting with generating function overloads for hyper-dual numbers (dual numbers extended to 2nd derivatives) based on the generic overload rules described here: https://arxiv.org/pdf/1801.03614.pdf

But there seems there is no way around manual inspection to catch these kind of things!

On MacOS 12.6.3 Monterey, with Python 3.10.9 and sympy 1.11.1 I get

>>> import sympy as sp
>>> from sympy.printing.fcode import print_fcode
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'sympy.printing.fcode'

:frowning:
Not aliasing sympy as sp (just import sympy) does not help.

It looks like the module has been renamed to sympy.printing.fortran.

It is not related directly to the topic of the post, but I’ve developed a library, AD_dnSVM, to deal up to third order derivatives. All intrinsic Fortran 2003 functions are overloaded (including, dot_product, matmul and order matrix operations).

3 Likes

Thanks for sharing! Here is my approach.

It is a fork of joddlehod´s DNAD repository. I have refactored it into fypp macros that can “inject” interfaces and function overloads directly into the module where they are used. See for instance example/example_hdual_mod.fypp. One advantage of this approach is that functions are likely to be inlined without having to resort to interprocedural optimizations. I also like to think about the dual type in the macro as a kind of “meta-type” where each generated type have a particular meaning - representing derivatives with respect to particular physical variables. Lately I have also introduced a hyper-dual type where the overloaded functions are automatically generated by src/fortran_gen.py.

1 Like

Python/Sympy here generates fypp Macros, so it is actually meta-meta-programming😀