Questions on variable scope in parallel computing

Dear all,

I recently facing some difficulties in parallelizing a loop using the powell methods in the toolbox.f90 in Fabian Kindermann’s “Introduction to Computational Economics Using Fortran” as attached. I come up with an toy example in the attached main.f90 to illustrate what I encountered.

In the line 48, the loop I want to parallelize looks like,

    !$omp parallel do collapse(2) private(inda, indb, aval, bval, xout, yout)
    do inda = 1, anum
        do indb = 1, bnum
            aval = agrid(inda)
            bval = bgrid(indb)
            call fmintest(aval, bval, xout, yout)
            savemat(indb, inda, :) = [ xout, yout ]
        end do
    end do

If I comment out the !$omp line, then the code runs fine; if I uncomment that line, the code runs pretty randomly, sometime it works sometimes it is far.

Let me know what’s wrong with this way of parallelizing the code, and whether there’s a scope issue here. Thank you so much for your attention!

toolbox.f90 (331.0 KB)
main.f90 (1.6 KB)

1 Like

Hi @fish830911, what catches my eye first here is fmintest as a potential problem. Is this a pure subroutine (having no side effects)? Also, you want to make sure xout and yout are scalars when used in savemat(indb, inda, : ) = [ xout, yout ].

1 Like

Fmintest is just a wrapper that calls fminsearch. The issue here is that the user is calling fminsearch, which is a minimization routine and wants to pass the parameters aval and bval to the function to be minimized.

The code is like a sandwich:
First the main calls fmintest (user-written) which in turn calls fminsearch (this is a toolbox routine) that in turn calls xyfunc (this is a user-written function).

Problem: how do we pass aval and bval to xyfunc?
There are two solutions:
(1) Declare aval and bval as module variables, with the attribute threadprivate
(2) Use an internal function approach, which requires this extra fmintest

There is an example on the fortran-lang callback best practice

Hope this link is useful @fish830911

1 Like

It’s tricky, because xyfunc is not called by its host routine, which means that accessing aval and bval by host association looks unpredictable to me.

I would even tend to question the conformance of this code from a pure Fortran POV, but I won’t dig into the standard to find out.

My understanding is the values of aval and bval remain defined until fmintest exits.

The part I think is worth questioning is the implementation of fminsearch from the toolbox module (a generic interface for the procedure powell). The procedure powell makes use of several of the private module variables prefixed with tbox_:

! should the random tbox_seed be set
logical, private :: tbox_seed = .true.

! Level of tolerance for all routines
real*8,  private  :: tbox_gftol = 1d-12

! Maximum number of iterations
integer, private  :: tbox_itermax_min = 1000

! Boolean value for showing iterations for optimization
logical, private :: tbox_show_min = .false.

! Maximum number of iterations for brent_pow
integer, parameter, private  :: tbox_tbox_itermax_pow_b = 150

! Level of tolerance for all routines
real*8,  private  :: tbox_gftol_root = 1d-8

! Maximum number of iterations for broydn
integer, private  :: itermax_root = 200

! Boolean value for showing iterations for root finding
logical, private :: tbox_show_root = .false.

The procedure powell only references these values so I assume this should not be a problem.

Another potential source of problems is that powell calls brent_pow which uses save’d variables:

        !##########################################################################
        ! FUNCTION brent_pow
        !
        ! Minimizes a one dimensional function.
        !##########################################################################
        function brent_pow(ax, bx, cx, tol, xmin, func)

            implicit none


            !##### INPUT/OUTPUT VARIABLES #########################################

            ! left, middle and right interval points
            real*8, intent(in) :: ax, bx, cx

            ! level of tolerance
            real*8, intent(in) :: tol

            ! minimum value found
            real*8, intent(out) :: xmin

            ! function value at minimum
            real*8 :: brent_pow


            !##### OTHER VARIABLES ################################################

            real*8, parameter :: cgold = 0.3819660d0
            real*8, parameter :: zeps=1.0e-3*epsilon(ax)
            integer :: iter
            real*8 :: a=0d0, b=0d0, d=0d0, e=0d0, etemp=0d0       ! SAVE 
            real*8 :: fu, fv, fw, fx, p, q, r, tol1, tol2, &
                u, v, w, x, xm

This means there is a race condition on the variables a, b, d, e, and etemp. Essentially, the toolbox is not written in a thread-safe way.

In general, the toolbox code is written a bit sloppily (for instance it uses single precision in literals in several places). I’m also a bit uncertain about skipping the dot in real literals. :eyes: The callback function func is passed to internal procedures of powell as an argument, although it could be accessed through host-association.

5 Likes

If there is a d exponent, then it is double precision even without a decimal point. But your general comments are valid, the nonstandard *8 notation and d exponents should all be replaced with standard notation, preferably with parametrized kind values so that they could be easily changed. There are robust tools to do all of that, and it would only take a matter of a few seconds. Of course making them thread safe requires more than just a simple conversion step.

1 Like

To fix the race condition, could we replace

real(8) :: a = 0d0

with

real(8) :: a
a = 0d0

and likewise for the other variables?

1 Like

That would be my recommendation, yes. Without the initialization, the variables will become automatic variables.

(In gfortran, -fopenmp implies -frecursive; this will apply to some other internal procedures which use local arrays.)

In fact, brent_pow already sets some of those variables, so one could just add the missing ones

            a = min(ax, cx)
            b = max(ax, cx)
            v = bx
            w = v
            x = v
            d = 0.0d0        ! <--- added
            e = 0.0d0
            etemp = 0.0d0    ! <--- added
1 Like

Dear @ivanpribec,

Thank you so much for spotting this! I applied the change to both the toy example and the project I am working on and now the loop paralleled perfectly.

To avoid myself from making this mistake, could you explain a little bit what does it mean a variable is save’d? And why such way of defining variables would create race conditions?

Thank you!

Hui-Jun Chen

1 Like

Dear @aledinola

Thank you very much for the link, I will read it after I finish moving!

1 Like

Dear @fxm

In this example, xout is a vector and yout is a scalar. Somehow in gfortran it just compiles and runs without an issue.

1 Like

For historical reasons, initializing a variable has an implicit save behavior:

real(8) :: d = 0.0d0

is equal to

real(8), save :: d = 0.0d0

This is different from a separate declaration and assignment, i.e.

real(8) :: d
d = 0.0d0

The purpose of the save attribute is to save the value between calls of the function, implying the variables is placed in static storage and are persistent for the duration of the program. Another way I personally think of it is like a global variable, but limited to the scope of the function, if that makes sense. Or you could imagine it like a “function with memory”; the value of d is remembered across procedure calls.

Now in a multi-threaded program, each thread calling brent_pow will be referring to the same save’d variables thereby “stepping on each others toes”, i.e. modifying the values that another thread is still using. This is known as a data race. You could solve this issue by putting the function call in a single construct:

    !$omp parallel do collapse(2) private(inda, indb, aval, bval, xout, yout)
    do inda = 1, anum
        do indb = 1, bnum
            aval = agrid(inda)
            bval = bgrid(indb)
            !$omp critical
            call fmintest(aval, bval, xout, yout)
            !$omp end critical
            savemat(indb, inda, :) = [ xout, yout ]
        end do
    end do

but this would force the execution to become sequential (the threads would enter the function exclusively, i.e. one-by-one). Instead, by removing the saved variables, each thread keeps their own copy of d and the data-race is resolved.

Edit: The single construct, has been replaced with critical; this way all threads in the group will call fmintest exclusively. Sorry about the error.

The implicit save is a very common “gotcha” in Fortran. You can find other explanations at the following links:

I’ve also shown some examples related to save here:

3 Likes

Perhaps a fitting saying in this context is

Too many cooks spoil the broth.

Either you let each commis chef prepare their own broth, or the sous chef upgrades one commis into the potager (soup cook). The analogy being between threads and commis’.

3 Likes

I find that your typical exposition of Fortran gives attention to some of the pillars of Fortran (the Expression, the Assignment, the DO loop, the Array) but leaves the Association under-explored.