Why is the performance poorer after using OpenMP?

The update

c(i,j,thread) = c(i,j,thread) + 0.052*lap_c(i,j,thread)

takes lap_c from

lap_c(i,j,thread) = … + c(im,j,thread) + …

but c(im,j,thread) was already updated in the previous iteration (i-1) of do i = 1,nc, and the row c(:,jm,thread) was already updated in the previous iteration of do j = 1,nc.

It may appear acceptable, but the updated values are then not all based on estimates from the previous step: when c(i,j,thread) is updated, c(im,j,thread) and c(i,jm,thread) already hold new values, while c(ip,j,thread) and c(i,jp,thread) (away from the periodic wrap) still hold old ones.

It is necessary to carry forward the unchanged value of “c(i,j,thread)” before it is overwritten, so that the next i iteration can use it as c(im,j,thread), plus the unchanged row c(:,jm,thread) from the previous j iteration, with appropriate consideration of the periodic edge conditions.

C could be declared as allocate ( c(0:nc,0:nc,threads) ), so that c(:,0,threads) can hold the previous row c(:,jm,threads) and c(0,j,threads) the previous value c(im,j,threads) at the appropriate time in the iteration; alternatively, use local work variables such as real(kind=8) :: c_im and c_jm(nc) (a sketch of that variant follows the code below).

The following code suggests a possible solution, using local work arrays c_jm and c_jn to hold the previous-step values:

  real(kind=wp) :: c_jm(nc), c_jn(nc), lap_c

        c_jm(:) = c(:,nc,thread)        ! old last row: the jm neighbour for j = 1 (periodic)
        do j = 1,nc
            jp = j+1
            if (j == nc) jp = 1
!           note: at j = nc, c(i,jp,thread) is row 1, which has already been updated
!           this sweep; save c(:,1,thread) as well if that edge must also use
!           previous-step values
            c_jn(:) = c(:,j,thread)     ! keep the unmodified values of the row being updated
            im = nc                     ! periodic im neighbour for i = 1
            do i = 1,nc
                ip = i+1
                if (i == nc) ip = 1
!old            lap_c(i,j,thread) = c(ip,j,thread) + c(im,j,thread) + c(i,jm,thread) + c(i,jp,thread) - 4.0*c(i,j,thread)
                lap_c             = c_jn(ip)       + c_jn(im)       + c_jm(i)        + c(i,jp,thread) - 4.0*c_jn(i)
                c(i,j,thread) = c(i,j,thread) + 0.052*lap_c
                im = i
            end do
            c_jm(:) = c_jn(:)           ! this row's old values become the jm neighbour for the next row
        end do
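
For the variant using just a scalar c_im and the row c_jm(nc) mentioned above, the c_jn copy can be dropped and c_jm updated element by element as each point is finished. The following is a sketch of the inner loop only, assuming the same surrounding do j loop as above with c_jm(:) = c(:,nc,thread) set before it; c_i1 is introduced here (it is not in the original) to keep the old value of c(1,j,thread) for the periodic wrap at i = nc:

  real(kind=wp) :: c_jm(nc), c_im, c_i1, c_old, lap_c

            c_im = c(nc,j,thread)       ! old value of c(nc,j): the im neighbour for i = 1
            c_i1 = c(1,j,thread)        ! old value of c(1,j): the ip neighbour for i = nc
            do i = 1,nc
                ip = i+1
                if (i == nc) ip = 1
                c_old = c(i,j,thread)   ! unmodified value of the point being updated
                if (i < nc) then
                    lap_c = c(ip,j,thread) + c_im + c_jm(i) + c(i,jp,thread) - 4.0*c_old
                else
                    lap_c = c_i1           + c_im + c_jm(i) + c(i,jp,thread) - 4.0*c_old
                end if
                c(i,j,thread) = c_old + 0.052*lap_c
                c_jm(i) = c_old         ! becomes the jm neighbour when row j+1 is processed
                c_im    = c_old         ! becomes the im neighbour for i+1
            end do

Compared with the c_jn version this saves one array copy per row, at the cost of a branch in the inner loop (which could be removed by peeling the i = nc iteration); the same j = nc caveat for c(i,jp,thread) applies.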

In the OpenMP case, it is the edge rows of each thread's sub-domain that need to be exchanged between the threads at each step, as sketched below.
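
How that exchange is organised depends on how the domain is split; the following is only an illustration, assuming each thread owns a slab c(1:nc,1:nc_local,it) of the domain, adjacent and periodic in j, with shared work arrays edge_lo(nc,nthreads) and edge_hi(nc,nthreads) and a hypothetical update_slab routine that applies the update above to one slab using the copied edge rows. None of these names come from the original code; "use omp_lib" and the integer declarations are assumed.

        !$omp parallel private(it, it_lo, it_hi, step)
            it    = omp_get_thread_num() + 1
            it_lo = it - 1 ; if (it_lo == 0)       it_lo = nthreads   ! neighbour below (periodic)
            it_hi = it + 1 ; if (it_hi > nthreads) it_hi = 1          ! neighbour above (periodic)
            do step = 1, nsteps
                !$omp barrier                           ! every slab has finished the previous step
                edge_lo(:,it) = c(:,nc_local,it_lo)     ! old last row of the lower neighbour
                edge_hi(:,it) = c(:,1,it_hi)            ! old first row of the upper neighbour
                !$omp barrier                           ! all edge copies taken before any slab is updated
                call update_slab ( c(:,:,it), edge_lo(:,it), edge_hi(:,it) )
            end do
        !$omp end parallel

The two barriers keep the copies consistent: no slab is overwritten while a neighbour is still reading its old edge row.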