Getting type checking in Python similar to Fortran

I would like to have function argument declarations in my Python code to make it more informative, to catch mismatched arguments, and to ease future automatic translation to Fortran. You can annotate scalar arguments with :int or :float, but I am having trouble annotating NumPy array arguments.

In Fortran one can declare an argument

real(kind=kind(1.0d0)) :: x(:)

and in Python one can write

from __future__ import annotations
import numpy as np
from nptyping import NDArray
def f(x:NDArray[np.float64]): ...

However, CPython does not check type annotations, and mypy, a tool that does, complains about such a declaration. For the code

from __future__ import annotations
import numpy as np
from statsmodels.tsa.arima_process import ArmaProcess
from nptyping import NDArray

def ar1_sim(nobs:int, ar1:float):
    """ simulate an AR(1) autoregressive process """
    ar_coeff = np.array([1, -ar1])
    ma_coeff = np.array([1])
    AR_object1 = ArmaProcess(ar_coeff,ma_coeff)
    return AR_object1.generate_sample(nsample=nobs)

def ar1_sim_mat(nobs:int, ar1:NDArray[np.float64]):
    """ return a matrix each of whose columns follow an AR(1) process """
    ncol = len(ar1)
    xmat = np.zeros(shape=[nobs, ncol])
    for icol in range(ncol):
        xmat[:, icol] = ar1_sim(nobs, ar1[icol])
    return xmat

mypy statsmodels_util.py says

statsmodels_util.py:5: error: Skipping analyzing "statsmodels.tsa.arima_process": module is installed, but missing library stubs or py.typed marker  [import]
statsmodels_util.py:5: note: See https://mypy.readthedocs.io/en/stable/running_mypy.html#missing-imports
statsmodels_util.py:15: error: "ndarray" expects 2 type arguments, but 1 given  [type-arg]
Found 2 errors in 1 file (checked 1 source file)

Mypy does accept a declaration of the form

def ar1_sim_mat(nobs:int, ar1:NDArray):

which leaves the rank of argument ar1 unspecified. My general question is: how close can you get in Python to Fortran's type checking, which considers type, kind, and rank? If the answer is "not close", that is an argument for Fortran for some kinds of programs.
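For comparison: since CPython ignores annotations at runtime, the closest one can get without external tools is an explicit check on entry. The check_array helper below is a hypothetical sketch, not a library function:

```python
import numpy as np

def check_array(x, dtype, ndim):
    """Hypothetical runtime analogue of Fortran's type/kind/rank check."""
    if not isinstance(x, np.ndarray):
        raise TypeError(f"expected ndarray, got {type(x).__name__}")
    if x.dtype != dtype or x.ndim != ndim:
        raise TypeError(f"expected rank-{ndim} {np.dtype(dtype)} array, "
                        f"got rank-{x.ndim} {x.dtype}")
    return x

# like real(kind=kind(1.0d0)) :: ar1(:)
check_array(np.zeros(3), np.float64, 1)         # passes
# check_array(np.zeros((3, 3)), np.float64, 1)  # raises TypeError: rank mismatch
```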

Just out of curiosity, have you checked whether Numba's JIT option "nopython=True" would give you stricter type checking?

You can indeed use numba to enforce argument types. For example, numbalsoda/driver_solve_ivp.py at c9183b5bc75002bac5270339f0e0ca645f102b4d · Nicholaswogan/numbalsoda · GitHub

Another option is Cython, which also enforces argument types. For example, clima/futils.pyx at 9eb31401f6e6a9f101b896107c67e5db6cef35fd · Nicholaswogan/clima · GitHub

To the best of my knowledge, the most complete solution for type-checking NumPy arrays is nptyping. It allows one to specify both the element type (float32, float64, etc.) and the array shape; rank can be specified as well. For example:

from nptyping import NDArray, Shape, Float64, Int32

# rank-2 array of double precision; real(8), dimension(:,:)
NDArray[Shape["*, *"], Float64]

# vector of integer; integer, dimension(:)
NDArray[Shape["*"], Int32]

My favorite package for this is pydantic.


You can also use LPython, a compiler we are developing. Here are some examples:

The advantage of LPython is that it compiles to the same intermediate representation as LFortran, so it gives identical runtime performance. You will also be able to use it to translate Python code to Fortran and vice versa. The above examples work with both LPython and regular CPython, so in your case, if you only want the type annotations, you can use them with CPython. LPython is in alpha stage, so if you need something for production, you might want to wait a little longer. If you have any feedback, definitely let us know.


That sounds great, and I look forward to future announcements. Since Fortran is known by many fewer programmers than Python, has fewer general-purpose libraries, and is more verbose than Python, I wonder what the niche of Fortran would be if there were a Python implementation that equaled it in speed.

Assuming LPython and LFortran reach production quality, some advantages of Fortran over the Python subset that LPython can compile:

  • You can use other Fortran compilers
  • The syntax of Fortran might be better
  • You can use all of Fortran and it will be fast

With LPython we are initially targeting a subset of Python that can be compiled ahead of time. That means most Python libraries will not be compilable unless you rewrite them in this subset. We will allow calling them via CPython, so you at least keep the features, but you don't get Fortran speed. I don't know of any way to make Python fast without the type annotations that LPython needs. One can relax things in many ways (Implement --infer · Issue #305 · lcompilers/lpython · GitHub), but you cannot compile just any Python code. It's nice that LPython will guide you: it will tell you to declare (annotate) this or that variable until the code compiles, and once it compiles, you get Fortran speed. It's not clear to me that it's a threat to Fortran; if anything, it's an enhancement. I am guessing people will continue using both.


Thanks – Codon looks like a similar effort. It does not currently improve the speed of Python code that uses NumPy, but that is being worked on. There is also Pyccel, which

can be viewed as:

  • Python-to-Fortran/C converter
  • a compiler for a Domain Specific Language with Python syntax

There are about 20 Python compilers; we have a fairly complete list at https://lpython.org/. We list Seq-lang, which is a predecessor to Codon.