Are 4 nested do loops suboptimal?

Hi all,
I am writing a loop to generate dimensions for a hollow rectangular cross section however the loop takes very long.
Note most of the functions that are called inside the inner most loop are pure functions.
I am using Gfortran no speed up options or parallelization

How do I speed things up ?

do i=10,99
   do j=10,99
        do k=i+1,100
           do m=j+1,100
                 !calculations
            end do
        end do
    end do
end do

Thanks
Note: Added my code :slight_smile:
The code is basically a beam solver for the best cross-section for a given deflection , beam length , and distributed load.
My Code:

program bmsolve1
  implicit none

  type data
     real,allocatable :: disp(:),moment(:)
     real :: maximum_disp
     real :: L_b,L_c
     real :: I,A,d1,d2,d3,d4,d5,d6
     real :: F_b,F_c
     real :: maximum_normal_stress
     integer :: id
  end type data

  real,parameter :: L_b_i=2,L_c_i=4,q=8*1000,disp_all=5*0.001,pi=3.14159,L=6,E=200*(10**9)
  real :: F_b_i,F_c_i
  real :: x
  integer :: i,j,k,m,n,total
  type(data) :: s
  type(data) :: best_circ_s,best_circ_h,best_arr(5),best

  total=0

  call find_react(F_b_i,F_c_i,q,L,L_b_i,L_c_i)

  print *, ' '
  print *, 'Initial reaction at B: ',F_b_i,'N'
  print *, 'Initial reaction at C: ',F_c_i,'N'
  print *, ' '

  best_arr(1)%maximum_disp=0.
  
 outer1: do i=20,500
     allocate(s%disp(6001))
     allocate(s%moment(6001))

     s%id=1
     s%d1=real(i)/1000.
     s%d2=0.
     s%d3=0.
     s%d4=0.
     s%A=circle_area_s(s%d1)
     s%I=circle_I_s(s%d1)
     s%F_b=F_b_i
     s%F_c=F_c_i
     
     s%moment(1)=moment(0.,L,L_b_i,L_c_i,s%F_b,s%F_c,q)
     s%disp(1)=find_disp(0.,L,L_b_i,L_c_i,s%F_b,s%F_c,q,s%I)
     do j=2,6001
        x=j/1000.
        s%moment(j)=moment(x,L,L_b_i,L_c_i,s%F_b,s%F_c,q)
        s%disp(j)=find_disp(x,L,L_b_i,L_c_i,s%F_b,s%F_c,q,s%I)
        total=total+1
     end do
     s%maximum_normal_stress=(maxval(s%moment,1)*s%d1)/s%I
     s%maximum_disp=maxval(s%disp,1)
     if (s%maximum_disp>best_arr(1)%maximum_disp .and. s%maximum_disp<disp_all) then
        best_arr(1)=s
     end if
     deallocate(s%disp)
     deallocate(s%moment)
  end do outer1

  print *, 'Properties of the best solid circle shape: '
  print *, 'Moment of Inertia is: ',best_arr(1)%I, 'm^4'
  print *, 'Area :',best_arr(1)%A, 'm^2'
  print *,'The best deflection is: ', best_arr(1)%maximum_disp*1000.,'mm'
  print *, 'The maximum normal stress is:',best_arr(1)%maximum_normal_stress/(10.**6),'MPa'
  print *, 'The radius of the solid circle is: ', best_arr(1)%d1*1000.,'mm'
  print *, ' '

  !Now to tackle the hollow circlular section

  best_arr(2)%maximum_disp=0.0045

 outer2: do i=10,499
     do j=i+1,500
        allocate(s%disp(6001))
        allocate(s%moment(6001))

        s%id=2
        s%d1=real(i)/1000.
        s%d2=real(j)/1000.
        s%d3=0.
        s%d4=0.
        s%A=circle_area_h(s%d2,s%d1)
        s%I=circle_I_h(s%d2,s%d1)
        s%F_b=F_b_i
        s%F_c=F_c_i

        s%moment(1)=moment(0.,L,L_b_i,L_c_i,s%F_b,s%F_c,q)
        s%disp(1)=find_disp(0.,L,L_b_i,L_c_i,s%F_b,s%F_c,q,s%I)

        do k=2,6001
           x=k/1000.
           s%moment(k)=moment(x,L,L_b_i,L_c_i,s%F_b,s%F_c,q)
           s%disp(k)=find_disp(x,L,L_b_i,L_c_i,s%F_b,s%F_c,q,s%I)
           total=total+1
        end do
        s%maximum_normal_stress=(maxval(s%moment,1)*s%d2)/s%I
        s%maximum_disp=maxval(s%disp,1)
        if (s%maximum_disp>best_arr(2)%maximum_disp .and. s%maximum_disp<disp_all) then
           best_arr(2)=s
        end if
        
        deallocate(s%disp)
        deallocate(s%moment)
     end do
  end do outer2

  print *, 'Properties of the best hollow circle shape: '
  print *, 'Moment of Inertia is: ',best_arr(2)%I, 'm^4'
  print *, 'Area :',best_arr(2)%A, 'm^2'
  print *,'The best deflection is: ', best_arr(2)%maximum_disp*1000.,'mm'
  print *, 'The maximum normal stress is:',best_arr(2)%maximum_normal_stress/(10.**6),'MPa'
  print *, 'The inner radius of the hollow circle is: ', best_arr(2)%d1*1000.,'mm'
  print *, 'The outer radius of the hollow circle is:', best_arr(2)%d2*1000.,'mm'
  print *, ' '

  !Now to tackle the solid square profile

  best_arr(3)%maximum_disp=0.

 outer3: do i=20,500
     do j=20,500
        allocate(s%disp(6001))
        allocate(s%moment(6001))

        s%id=3
        s%d1=real(i)/1000.
        s%d2=real(j)/1000.
        s%d3=0.
        s%d4=0.
        s%A=sq_area_s(s%d1,s%d2)
        s%I=sq_I_s(s%d1,s%d2)
        s%F_b=F_b_i
        s%F_c=F_c_i

        s%moment(1)=moment(0.,L,L_b_i,L_c_i,s%F_b,s%F_c,q)
        s%disp(1)=find_disp(0.,L,L_b_i,L_c_i,s%F_b,s%F_c,q,s%I)

        do k=2,6001
           x=k/1000.
           s%moment(k)=moment(x,L,L_b_i,L_c_i,s%F_b,s%F_c,q)
           s%disp(k)=find_disp(x,L,L_b_i,L_c_i,s%F_b,s%F_c,q,s%I)
           total=total+1
        end do
        s%maximum_normal_stress=(maxval(s%moment,1)*(s%d2/2.))/s%I
        s%maximum_disp=maxval(s%disp,1)
        if (s%maximum_disp>best_arr(3)%maximum_disp .and. s%maximum_disp<disp_all) then
           best_arr(3)=s
        end if
        deallocate(s%disp)
        deallocate(s%moment)
     end do
  end do outer3

  print *, 'Properties of the best solid rectangle shape: '
  print *, 'Moment of Inertia is: ',best_arr(3)%I, 'm^4'
  print *, 'Area :',best_arr(3)%A, 'm^2'
  print *,'The best deflection is: ', best_arr(3)%maximum_disp*1000.,'mm'
  print *, 'The maximum normal stress is:',best_arr(3)%maximum_normal_stress/(10.**6),'MPa'
  print *, 'The breadth of the solid rectangle is: ', best_arr(3)%d1*1000.,'mm'
  print *, 'The height of the solid rectangle is:', best_arr(3)%d2*1000.,'mm'
  print *, ' '

  !Now to tackle the hollow rectangular profile

  best_arr(4)%maximum_disp=0.

 outer4: do i=10,99
     do j=10,99
        do k=i+1,100
           do m=j+1,100
              allocate(s%disp(6001))
              allocate(s%moment(6001))

              s%id=4
              
              !inner dimensions
              s%d1=real(i)/1000.
              s%d2=real(j)/1000.

              !outer dimension
              s%d3=real(k)/1000.
              s%d4=real(m)/1000.

              s%A=sq_area_h(s%d1,s%d2,s%d3,s%d4)
              s%I=sq_I_h(s%d1,S%d2,s%d3,s%d4)

              s%F_b=F_b_i
              s%F_c=F_c_i

              s%moment(1)=moment(0.,L,L_b_i,L_c_i,s%F_b,s%F_c,q)
              s%disp(1)=find_disp(0.,L,L_b_i,L_c_i,s%F_b,s%F_c,q,s%I)

                do n=2,6001
                   x=n/1000.
                   s%moment(n)=moment(x,L,L_b_i,L_c_i,s%F_b,s%F_c,q)
                   s%disp(n)=find_disp(x,L,L_b_i,L_c_i,s%F_b,s%F_c,q,s%I)
                   total=total+1
                end do

                s%maximum_normal_stress=(maxval(s%moment,1)*(s%d4/2.))/s%I
                s%maximum_disp=maxval(s%disp,1)

                if (s%maximum_disp>best_arr(4)%maximum_disp .and. s%maximum_disp<disp_all) then
                   best_arr(4)=s
                end if
              deallocate(s%disp)
              deallocate(s%moment)
           end do
        end do
     end do
  end do outer4

  print *, 'Properties of the best hollow rectangle shape: '
  print *, 'Moment of Inertia is: ',best_arr(4)%I, 'm^4'
  print *, 'Area :',best_arr(4)%A, 'm^2'
  print *,'The best deflection is: ', best_arr(4)%maximum_disp*1000.,'mm'
  print *, 'The maximum normal stress is:',best_arr(4)%maximum_normal_stress/(10.**6),'MPa'
  print *, 'The inner breadth of the solid rectangle is: ', best_arr(4)%d1*1000.,'mm'
  print *, 'The inner height of the solid rectangle is:', best_arr(4)%d2*1000.,'mm'
  print *, 'The outer breadth of the solid rectangle is: ', best_arr(4)%d3*1000.,'mm'
  print *, 'The outer height of the solid rectangle is:', best_arr(4)%d4*1000.,'mm'
  print *, ' '
  print *, total,'Calculations done'

contains
  elemental subroutine find_react(F_b,F_c,q,L,L_b,L_c)
    real,intent(in) :: q,L,L_b,L_c
    real,intent(inout) :: F_b,F_c
    real :: detA
    real :: A(2,2),A_v(2,2),x(2,1),B(2,1)

    A(1,1)=((L_b**2)*((3*L_b)-L_b))/6.
    A(1,2)=((L_b**2)*((3*L_c)-L_b))/6.
    A(2,1)=((L_b**2)*((3*L_c)-L_b))/6.
    A(2,2)=((L_c**2)*((3*L_c)-L_c))/6.

    B(1,1)=(q*(L_b**2)*((6*(L**2))-(4*L*L_b)+(L_b**2)))/24.
    B(2,1)=(q*(L_c**2)*((6*(L**2))-(4*L*L_c)+(L_c**2)))/24.

    detA=1/(A(1,1)*A(2,2)-A(1,2)*A(2,1))

    A_v(1,1)=+detA*A(2,2)
    A_v(2,1)=-detA*A(2,1)
    A_v(1,2)=-detA*A(1,2)
    A_v(2,2)=+detA*A(1,1)
    
    x=matmul(A_v,B)

    F_b=x(1,1)
    F_c=x(2,1)
    
  end subroutine find_react
  
  pure elemental real function find_disp(x,L,L_b,L_c,F_b,F_c,q,I) result(disp)
    real,intent(in) :: x,L,L_b,L_c,F_b,F_c,q,I
    real diq,dib,dic

    if (x<=L_b) then
       dib=((F_b*(x**2))*((3*L_b)-x))/(6*E*I)
    else if (x>L_b) then
       dib=((F_b*(L_b**2))*((3*x)-L_b))/(6*E*I)
    end if

    if (x<=L_c) then
       dic=((F_c*(x**2))*((3*L_c)-x))/(6*E*I)
    else if (x>L_c) then
       dic=((F_c*(L_c**2))*((3*x)-L_c))/(6*E*I)
    end if
    diq=-((q*(x**2))*((x**2)+(6*(L**2))-(4*L*x)))/(24.*E*I)

    disp=diq+dib+dic
    
  end function find_disp
  
  pure real function moment(x,L,L_b,L_c,F_b,F_c,q) result(total_moment)
    real,intent(in) :: x,L,L_b,L_c,F_b,F_c,q
    real :: mb,mc,mq
    
    if (x<=L_b) then
       mb=F_b*(L_b-x)
    else if (x>L_b) then
       mb=0
    end if
    if (x<=L_c) then
       mc=F_c*(L_c-x)
    else if (x>L_c) then
       mc=0
    end if
    mq=-(q*((L-x)**2))/2

    total_moment=mq+mb+mc
  end function moment

  pure real function circle_area_s(d1) result(area)
    real,intent(in) :: d1

    area=pi*(d1**2)
  end function circle_area_s

  pure real function circle_area_h(d2,d1) result(area)
    real, intent(in) ::d1,d2

    area=pi*((d2**2)-(d1**2))
  end function circle_area_h

  pure real function sq_area_s(d1,d2) result(area)
    real,intent(in) :: d1,d2
    
    area=d1*d2
  end function sq_area_s

  pure real function sq_area_h(d1,d2,d3,d4) result(area)
    real,intent(in) :: d1,d2,d3,d4

    area=(d3*d4)-(d1*d2)
  end function sq_area_h
  
  pure real function circle_I_s(d1) result(I)
    real,intent(in) :: d1
    
    I=(pi/4.)*(d1**4)
  end function circle_I_s

  pure real function circle_I_h(d2,d1) result(I)
    real,intent(in) :: d1,d2

    I=(pi/4)*((d2**4)-(d1**4))
  end function circle_I_h

  pure real function sq_I_s(d1,d2) result(I)
    real,intent(in) :: d1,d2

    I=(1/12.)*d1*(d2**3)
  end function sq_I_s

  pure real function sq_I_h(d1,d2,d3,d4) result(I)
    real,intent(in) :: d1,d2,d3,d4

    I=((1/12.)*d3*(d4**3))-((1/12.)*d1*(d2**3))
  end function sq_I_h
  

end program bmsolve1

The whole code would be helpful.

Can you use DO CONCURRENT? The 4 nested loops could become a single do concurrent construct, as shown here.

1 Like

@kargl @Beliavsky added my code :slight_smile:

The calculations are executed 16769025 times. Loops themselves, when I put a counter into the inner one (cnt=cnt+1) take 0.1 sec with all optimizations off (-O0 option), so it is not loops what is slow.

1 Like

So you’re saying I should only allocate and deallocate once out of the loops ?

So I switched to the ifort compiler, and did the allocate/deallocate relocation which gave me a good speed boost :smiley:
I am gonna try the division bit and get back to you.

I will try this as well and get back to you :+1:

This would work, except I get an error saying

error #6695: The subscript or stride of a FORALL triplet specification must not reference a subscript name.   [I]
  do concurrent(i=10:499,j=10:499,k=i+1:500,m=j+1:500,n=2:6001)              
------------------------------------^

The DO CONCURRENT construct has an optional MASK. Here is an example:

program do_concurrent
implicit none
integer, parameter :: n = 3
integer :: i,j,m(n,n)
m = 0
do concurrent (i=1:n,j=1:n,j<=i) 
! fill the lower triangular part of m with i+j
   m(i,j) = i+j
end do
do i=1,n
   print "(*(i3))",m(i,:)
end do
end program do_concurrent
! output:
!   2  0  0
!   3  4  0
!   4  5  6

I think you can replace

do concurrent(i=10:499,j=10:499,k=i+1:500,m=j+1:500,n=2:6001)

with

do concurrent(i=10:499,j=10:499,k=11:500,m=11:500,n=2:6001,k>i .and. m>j)

3 Likes

Thank you so much !

Do you really want all these values calculated or are you looking to minimize/maximize something?

1 Like

I basically want to find out the best cross-section under given load conditions. I have my deflection equations through super position, I am given a max deflection, beam length and location of supports. I just need to find the best cross-section, with a max deflection closest to 5 mm.

I think that you should look for an optimization problem solver that will zero in to the optimum solution.

2 Likes

oh ? wow ! :smiley: I didn’t know these existed !
Even so, I wanted to make something on my own by taking inspiration from the modern Fortran book by Milan Curcic

Thanks for these though.

Those can be used but are mostly Fortran 77 codes, as is much Netlib software. When looking for code for a numerical problem I look at Alan Miller’s collection, John Burkhart’s collection, or GitHub, where I have listed many codes by category here.

4 Likes

In a typical Non-Linear optimization Problem (NLP), you supply a Fortran function that evaluates the objective to minimize/maximize and any constraints (they may be numeric functions as well) and the solver does the work. You would still aim to make your Fortran evaluation function as efficient as possible as it gets called a lot of times (not as many as an exhaustive grid search though). Using derivative information is crucial to efficiency.

1 Like

I will try looking for something in the links posted above and will then post the updated code. Generally though I have to calculate the deflection curve for the entire beam and then find the max in that curve, then find the curve with a max closest to the allowed max.

@Aurelius_Nero, please consider restating the problem as an optimization problem with constraints, or as a CSAT problem (one where there are only constraints to satisfy, no objective function to minimize). You may find the thread titled Variable Number of Nested Loops, in the Intel Fortran Forum, worth reading.

1 Like

For a given beam with given loading conditions - the maximum deflection will be at the same place. Why are you looping over the whole length and finding moment and deflection at every cross section ?
If it is a simply supported beam with UDL - it will be the center. Looping over the whole length seems unnecessary.
For all the cases the program shows that - the maximum moment is at 1357mm and maximum deflection is at 3292 - 3296 mm. If it is a simply supported beam with UDL - both should be at the center.
I think your beam is 6 meters in length. It is having a UDL - what else is the loading and boundary condition ?
If you want to formulate this as a optimization problem - go with Section modulus approach.
Section modulus = I/y.
Once you optimize this - you can get the optimal cross section.

1 Like

You may be right. I’ll check out the link, thanks :+1: