Fortranlang: a Python package?

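Have you measured the actual error of the different methods? When benchmarking different solvers, it’s generally a mistake to assume that specifying the same rtol and atol leads to comparable error. The tolerances only bound each solver's own local error estimate, so the global error you actually get can differ a lot from one method to another.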
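To make the point concrete, here is a minimal sketch in pure Python, not taken from any of the packages under discussion. It runs two textbook embedded Runge-Kutta pairs (Heun-Euler and Bogacki-Shampine) on a test problem I picked for illustration, y' = -y with y(0) = 1, using identical rtol and atol, and compares each result against the exact solution exp(-t). The function names, the step controller constants, and the tolerance values are all my own assumptions.

```python
import math

def heun_euler(f, t, y, h):
    """Embedded Heun-Euler pair: order-2 result, order-1 error estimate."""
    k1 = f(t, y)
    k2 = f(t + h, y + h * k1)
    y_high = y + 0.5 * h * (k1 + k2)
    y_low = y + h * k1
    return y_high, abs(y_high - y_low)

def bogacki_shampine(f, t, y, h):
    """Embedded Bogacki-Shampine pair: order-3 result, order-2 error estimate."""
    k1 = f(t, y)
    k2 = f(t + 0.5 * h, y + 0.5 * h * k1)
    k3 = f(t + 0.75 * h, y + 0.75 * h * k2)
    y_high = y + h * (2 / 9 * k1 + 1 / 3 * k2 + 4 / 9 * k3)
    k4 = f(t + h, y_high)
    y_low = y + h * (7 / 24 * k1 + 1 / 4 * k2 + 1 / 3 * k3 + 1 / 8 * k4)
    return y_high, abs(y_high - y_low)

def integrate(step, low_order, f, t0, y0, t_end, rtol, atol, h=0.1):
    """Adaptive integration of a scalar ODE; returns (y(t_end), accepted steps)."""
    t, y, n_steps = t0, y0, 0
    while t < t_end:
        last = t + h >= t_end
        if last:
            h = t_end - t  # shorten the step so it lands exactly on t_end
        y_new, err = step(f, t, y, h)
        tol = atol + rtol * max(abs(y), abs(y_new))
        if err <= tol:  # accept the step
            t = t_end if last else t + h
            y = y_new
            n_steps += 1
        # Simple step-size controller (illustrative safety factor and clamps).
        factor = 5.0 if err == 0.0 else 0.9 * (tol / err) ** (1.0 / (low_order + 1))
        h *= min(5.0, max(0.2, factor))
    return y, n_steps

def rhs(t, y):
    return -y

rtol, atol = 1e-6, 1e-9  # same tolerances for both methods
exact = math.exp(-5.0)

results = {}
for name, step, low_order in [("heun_euler", heun_euler, 1),
                              ("bogacki_shampine", bogacki_shampine, 2)]:
    y_end, n_steps = integrate(step, low_order, rhs, 0.0, 1.0, 5.0, rtol, atol)
    results[name] = (abs(y_end - exact), n_steps)
    print(f"{name:18s} steps={n_steps:6d} actual error={results[name][0]:.3e}")
```

Both runs are asked for the same tolerances, but the printed errors and step counts need not match, which is why a fair benchmark should plot achieved error against run time (a work-precision diagram) instead of comparing timings at equal rtol and atol.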