Package for Bayesian inference?

Hi all,

Great to see the Fortran community very much alive and kicking!

My question is whether there are current (modern) Fortran projects for performing Bayesian inference using Markov chain Monte Carlo. I am thinking of packages similar to Stan (R/Python), PyMC3 (Python), or Turing.jl (Julia).

Thanks!

2 Likes

Yes, I lead the development of one such package. Here is the documentation along with extensive examples in different languages. It is available for serial as well as parallel (MPI and Coarray) simulations, and in several other languages besides Fortran. We will soon release a new version of the library that adds a considerable number of new algorithms and significantly improves the existing ones.

There are also other packages in Fortran that are specifically stochastic samplers/integrators, like mcmcf90 and MultiNest.

If you let me know your specific usage, I’d be happy to help more.

4 Likes

Thanks a lot! We have a large numerical code written in Fortran for which I would like to use Bayesian inference. Currently I am using samplers in other languages (mainly PyMC3), but it would be nice to be able to do everything in Fortran. For the sampling, I treat our numerical models as black boxes, since they are rather complex and taking derivatives w.r.t. the variables is not trivial (although not impossible). This makes more advanced samplers (like Hamiltonian MCMC) unsuitable. Do you think ParaMonte is useful for this case?

3 Likes

Yes, that is the primary reason we developed the package in the first place, and why we developed it in Fortran: for generic Bayesian inverse problems. All you need to supply is a function (a likelihood or any other density) that returns the natural logarithm of its scalar value for a given input vector and its length:

function getLogFunc(ndim, point) result(logFunc)
    use iso_fortran_env, only: IK => int32, RK => real64
    implicit none
    integer(IK), intent(in) :: ndim
    real(RK), intent(in) :: point(ndim)
    real(RK) :: logFunc
    logFunc = -sum(point**2) ! replace this with any complex Bayesian inverse problem.
end function getLogFunc
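
For clarity: the sampler treats getLogFunc as a pure black box, so no derivatives of your numerical model are ever needed. The toy random-walk Metropolis loop below is only meant to illustrate that calling pattern; it is not ParaMonte's actual interface (see the documentation for that):

program toyMetropolis
    use iso_fortran_env, only: IK => int32, RK => real64
    implicit none
    integer(IK), parameter :: ndim = 2_IK, nsample = 10000_IK
    real(RK) :: point(ndim), proposal(ndim), step(ndim), logCur, logProp, u
    integer(IK) :: isample
    point = 0._RK
    logCur = getLogFunc(ndim, point)
    do isample = 1_IK, nsample
        call random_number(step)
        proposal = point + 0.5_RK * (2._RK * step - 1._RK) ! symmetric random-walk proposal
        logProp = getLogFunc(ndim, proposal)
        call random_number(u)
        if (log(1._RK - u) < logProp - logCur) then ! Metropolis acceptance; 1 - u avoids log(0)
            point = proposal
            logCur = logProp
        end if
    end do
contains
    function getLogFunc(ndim, point) result(logFunc) ! the user-supplied black box
        integer(IK), intent(in) :: ndim
        real(RK), intent(in) :: point(ndim)
        real(RK) :: logFunc
        logFunc = -sum(point**2)
    end function getLogFunc
end program toyMetropolis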

How many parameters does your model have (what is the value of ndim)?
Welcome to the forum, by the way.

1 Like

How does PyMC3’s M-H sampler perform for your model? I want the same thing in the Stan world, but the fact is that gradient-based samplers are just way superior when the dimension of the parameter space is high. My experience is that NUTS, even with a crude gradient from finite differences, can be more efficient than M-H & Gibbs. I did the experiment with MFEM.
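
To be concrete, by a crude gradient from finite differences I mean nothing fancier than central differences around the current point. A sketch (the fixed step h is an assumption; in practice it needs tuning to your parameter scales):

subroutine getLogFuncGrad(ndim, point, getLogFunc, grad)
    use iso_fortran_env, only: IK => int32, RK => real64
    implicit none
    integer(IK), intent(in) :: ndim
    real(RK), intent(in) :: point(ndim)
    real(RK), intent(out) :: grad(ndim)
    interface
        function getLogFunc(ndim, point) result(logFunc)
            import :: IK, RK
            integer(IK), intent(in) :: ndim
            real(RK), intent(in) :: point(ndim)
            real(RK) :: logFunc
        end function getLogFunc
    end interface
    real(RK), parameter :: h = 1.e-6_RK ! assumed fixed step; tune to parameter scales
    real(RK) :: shifted(ndim)
    integer(IK) :: i
    do i = 1_IK, ndim
        shifted = point
        shifted(i) = point(i) + h
        grad(i) = getLogFunc(ndim, shifted)
        shifted(i) = point(i) - h
        grad(i) = (grad(i) - getLogFunc(ndim, shifted)) / (2._RK * h) ! central difference
    end do
end subroutine getLogFuncGrad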

Stan is essentially C++. The R & Python & (fill in the blank) are just different interfaces.

1 Like

Welcome to the Forum!
May I ask a layman question: in your poster https://www.metrumrg.com/wp-content/uploads/2018/09/stan_pde_poster.pdf, for which equation did you find NUTS to be better?

Thanks. I’ve been a lurker for quite a while.

I only tried NUTS vs. M-H on the Darcy flow model, as the other two solvers provide gradients out of the box, so it is natural to use HMC there. Unless M-H begins very close to the target (or with a very informative prior), it takes many more iterations to get enough effective samples.

1 Like

Thank you, man! I did some Google searching about you and your group, and it looks like what we are doing now is kind of similar. I am developing new Monte Carlo parametric EM methods, and perhaps some non-parametric algorithms. All written in Fortran.
By the way, what ODE solver do you use? Solving ODEs seems to be the most time-consuming part. I am currently chasing all kinds of ODE solvers written in Fortran; I think I have collected many of them.
The only type of ODE solver I am currently not sure how to use is the kind with something called preconditioned Krylov methods. LOL.

Stan uses SUNDIALS & boost::odeint. HMC requires gradients, so Stan relies on CVODES. I just added IDAS support for DAEs.

I happen to be also working on things like SAEM.

1 Like

Oh, I see. Cool! Eh, Fortran also has one for DAEs called DASKR in netlib/ode (the first one, daskr.tgz), by the same author, Alan Hindmarsh, I believe. The latest update was on 2011-06-08. Perhaps a little old, lol.
https://www.netlib.org/ode/

Cool! We could discuss more if possible :slight_smile:
I have tried to make the M-step efficient. Now, if the E-step can be evaluated efficiently with a good importance sampling function, it can greatly boost the computation. It may reduce the number of samples from, say, 1000-2000 to around 300, so basically a speedup by a factor of 3 or so.
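
To be concrete about the E-step: the expectation is estimated with the usual self-normalized importance weights, computed in log space for stability. A minimal sketch (logp, logq, and f here are placeholders for the actual target density, proposal density, and E-step integrand):

function getISEstimate(nsample, logp, logq, f) result(estimate)
    use iso_fortran_env, only: RK => real64
    implicit none
    integer, intent(in) :: nsample
    real(RK), intent(in) :: logp(nsample) ! log target density at the proposal draws
    real(RK), intent(in) :: logq(nsample) ! log proposal density at the same draws
    real(RK), intent(in) :: f(nsample)    ! E-step integrand evaluated at the draws
    real(RK) :: estimate, logw(nsample), w(nsample)
    logw = logp - logq                    ! log importance weights
    w = exp(logw - maxval(logw))          ! stabilize before exponentiating
    estimate = sum(w * f) / sum(w)        ! self-normalized IS estimate of E_p[f]
end function getISEstimate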

DASKR is the ancestor of IDA in SUNDIALS.

Sounds good. Let’s not hijack the thread.

1 Like

Thanks for the input and the nice discussion! The M-H sampler works as expected, that is, with slow convergence, and it especially struggles with very strong anisotropy in the posterior. We have recently started working with emcee with quite some success. It uses an affine-invariant MCMC sampler, which helps tremendously with the anisotropy (a sketch of the move follows at the end of this post). I am looking into ParaMonte now, which seems pretty promising (although, if I am not mistaken, the core of that code is not written in modern Fortran). EDIT: this last statement is incorrect, see the reply from @shahmoradi below.
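
For anyone curious, the heart of emcee is the Goodman-Weare stretch move: each walker proposes along the straight line through a second, randomly chosen walker, which makes the sampler invariant under linear (hence anisotropic) rescalings of the posterior. A bare-bones sketch of a single stretch-move update, with the customary scale a = 2:

subroutine stretchMove(ndim, nwalker, k, walkers, logp, getLogFunc)
    use iso_fortran_env, only: IK => int32, RK => real64
    implicit none
    integer(IK), intent(in) :: ndim, nwalker, k
    real(RK), intent(inout) :: walkers(ndim, nwalker) ! current walker ensemble
    real(RK), intent(inout) :: logp(nwalker)          ! log density at each walker
    interface
        function getLogFunc(ndim, point) result(logFunc)
            import :: IK, RK
            integer(IK), intent(in) :: ndim
            real(RK), intent(in) :: point(ndim)
            real(RK) :: logFunc
        end function getLogFunc
    end interface
    real(RK), parameter :: a = 2._RK ! the customary stretch scale
    real(RK) :: u, z, proposal(ndim), logProp
    integer(IK) :: j
    do ! draw a complementary walker j /= k uniformly
        call random_number(u)
        j = 1_IK + int(u * nwalker, IK)
        if (j /= k) exit
    end do
    call random_number(u)
    z = (1._RK + (a - 1._RK) * u)**2 / a ! z ~ g(z) proportional to 1/sqrt(z) on [1/a, a]
    proposal = walkers(:, j) + z * (walkers(:, k) - walkers(:, j))
    logProp = getLogFunc(ndim, proposal)
    call random_number(u)
    if (log(1._RK - u) < (ndim - 1_IK) * log(z) + logProp - logp(k)) then
        walkers(:, k) = proposal ! accept with probability min(1, z**(ndim-1) * p(Y)/p(X_k))
        logp(k) = logProp
    end if
end subroutine stretchMove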

1 Like

Hi @njaen, ParaMonte is Fortran 2018 compliant (there is no F77, other than isolated external modules, if any, which are irrelevant to the functioning of the library anyway). There is a new sampler, ParaNest, in the development branch of ParaMonte, which is ideally suited for highly degenerate objective functions and outperforms virtually all existing MCMC techniques. It is production-ready but has not been released yet. If your work is time-sensitive and you want to use/try it, let us know and we will be happy to expedite its release.

2 Likes

Thanks! I have edited my previous post about it not being written in modern Fortran to make sure future readers don't get the wrong impression. No need to expedite the release; we are still very much in an exploratory phase. I'll definitely give ParaNest a try in the future. Thanks a lot for your work on this library!

1 Like

Hi @shahmoradi, what is the largest number of parameters (ndim) that ParaMonte can support? I looked at ParaMonte and tested it on a small example, but I am confused about ndim and the size of the correlation matrix.

1 Like

It depends on the problem. The current algorithm can handle target densities in very high dimensions (100D or more) if they are not too complex. It has readily handled complex, non-trivial target densities of up to 32 dimensions in actual research problems. We are in the process of finalizing a massive update to the library, primarily targeting Fortran users but also adding new algorithms that we expect to dramatically improve the sampling efficiency and capabilities of the existing algorithm. Upon releasing the new version, I will update this thread in a few days to weeks.

1 Like

The primary challenge with high-dimensional targets is the vastness of the search landscape and the frequently excessive flatness of the landscape almost everywhere, which makes sampling very difficult.

Thank you @shahmoradi. I will read a bit more about the theory because I was expecting larger dimensions.

Sampling is practically impossible even at moderately high dimensions, particularly with MCMC, unless gradient information is supplied. Alternatives that I know of that may work better are variational inference and gradient-based optimization. But even then there can be challenging problems, like vanishing gradients and similar. How many dimensions does your problem have?

I was thinking about linear mixed-model equations with millions of variables. Gibbs samplers might be more appropriate.
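
What makes Gibbs attractive at that scale is that the single-site update for a Gaussian system A x = b touches only one row of A per coordinate. A sketch of one sweep (A is stored dense here purely for brevity; real mixed-model equations would of course exploit sparsity):

subroutine gibbsSweep(n, A, b, x)
    use iso_fortran_env, only: RK => real64
    implicit none
    integer, intent(in) :: n
    real(RK), intent(in) :: A(n, n) ! mixed-model coefficient (precision) matrix
    real(RK), intent(in) :: b(n)    ! right-hand side
    real(RK), intent(inout) :: x(n) ! current state; one full sweep is applied
    real(RK), parameter :: PI = 3.141592653589793_RK
    real(RK) :: condMean, u1, u2
    integer :: i
    do i = 1, n
        ! full conditional of x(i) is N((b(i) - sum_{j/=i} A(i,j) x(j)) / A(i,i), 1 / A(i,i))
        condMean = (b(i) - dot_product(A(i, :), x) + A(i, i) * x(i)) / A(i, i)
        call random_number(u1)
        call random_number(u2)
        ! Box-Muller draw from the full conditional
        x(i) = condMean + sqrt(-2._RK * log(1._RK - u1) / A(i, i)) * cos(2._RK * PI * u2)
    end do
end subroutine gibbsSweep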

It would have been easy to use your library. Congrats on that!