# Are 4 nested do loops suboptimal?

Hi all,
I am writing a loop to generate dimensions for a hollow rectangular cross section however the loop takes very long.
Note most of the functions that are called inside the inner most loop are pure functions.
I am using Gfortran no speed up options or parallelization

How do I speed things up ?

``````do i=10,99
do j=10,99
do k=i+1,100
do m=j+1,100
!calculations
end do
end do
end do
end do
``````

Thanks
The code is basically a beam solver for the best cross-section for a given deflection , beam length , and distributed load.
My Code:

``````program bmsolve1
implicit none

type data
real,allocatable :: disp(:),moment(:)
real :: maximum_disp
real :: L_b,L_c
real :: I,A,d1,d2,d3,d4,d5,d6
real :: F_b,F_c
real :: maximum_normal_stress
integer :: id
end type data

real,parameter :: L_b_i=2,L_c_i=4,q=8*1000,disp_all=5*0.001,pi=3.14159,L=6,E=200*(10**9)
real :: F_b_i,F_c_i
real :: x
integer :: i,j,k,m,n,total
type(data) :: s
type(data) :: best_circ_s,best_circ_h,best_arr(5),best

total=0

call find_react(F_b_i,F_c_i,q,L,L_b_i,L_c_i)

print *, ' '
print *, 'Initial reaction at B: ',F_b_i,'N'
print *, 'Initial reaction at C: ',F_c_i,'N'
print *, ' '

best_arr(1)%maximum_disp=0.

outer1: do i=20,500
allocate(s%disp(6001))
allocate(s%moment(6001))

s%id=1
s%d1=real(i)/1000.
s%d2=0.
s%d3=0.
s%d4=0.
s%A=circle_area_s(s%d1)
s%I=circle_I_s(s%d1)
s%F_b=F_b_i
s%F_c=F_c_i

s%moment(1)=moment(0.,L,L_b_i,L_c_i,s%F_b,s%F_c,q)
s%disp(1)=find_disp(0.,L,L_b_i,L_c_i,s%F_b,s%F_c,q,s%I)
do j=2,6001
x=j/1000.
s%moment(j)=moment(x,L,L_b_i,L_c_i,s%F_b,s%F_c,q)
s%disp(j)=find_disp(x,L,L_b_i,L_c_i,s%F_b,s%F_c,q,s%I)
total=total+1
end do
s%maximum_normal_stress=(maxval(s%moment,1)*s%d1)/s%I
s%maximum_disp=maxval(s%disp,1)
if (s%maximum_disp>best_arr(1)%maximum_disp .and. s%maximum_disp<disp_all) then
best_arr(1)=s
end if
deallocate(s%disp)
deallocate(s%moment)
end do outer1

print *, 'Properties of the best solid circle shape: '
print *, 'Moment of Inertia is: ',best_arr(1)%I, 'm^4'
print *, 'Area :',best_arr(1)%A, 'm^2'
print *,'The best deflection is: ', best_arr(1)%maximum_disp*1000.,'mm'
print *, 'The maximum normal stress is:',best_arr(1)%maximum_normal_stress/(10.**6),'MPa'
print *, 'The radius of the solid circle is: ', best_arr(1)%d1*1000.,'mm'
print *, ' '

!Now to tackle the hollow circlular section

best_arr(2)%maximum_disp=0.0045

outer2: do i=10,499
do j=i+1,500
allocate(s%disp(6001))
allocate(s%moment(6001))

s%id=2
s%d1=real(i)/1000.
s%d2=real(j)/1000.
s%d3=0.
s%d4=0.
s%A=circle_area_h(s%d2,s%d1)
s%I=circle_I_h(s%d2,s%d1)
s%F_b=F_b_i
s%F_c=F_c_i

s%moment(1)=moment(0.,L,L_b_i,L_c_i,s%F_b,s%F_c,q)
s%disp(1)=find_disp(0.,L,L_b_i,L_c_i,s%F_b,s%F_c,q,s%I)

do k=2,6001
x=k/1000.
s%moment(k)=moment(x,L,L_b_i,L_c_i,s%F_b,s%F_c,q)
s%disp(k)=find_disp(x,L,L_b_i,L_c_i,s%F_b,s%F_c,q,s%I)
total=total+1
end do
s%maximum_normal_stress=(maxval(s%moment,1)*s%d2)/s%I
s%maximum_disp=maxval(s%disp,1)
if (s%maximum_disp>best_arr(2)%maximum_disp .and. s%maximum_disp<disp_all) then
best_arr(2)=s
end if

deallocate(s%disp)
deallocate(s%moment)
end do
end do outer2

print *, 'Properties of the best hollow circle shape: '
print *, 'Moment of Inertia is: ',best_arr(2)%I, 'm^4'
print *, 'Area :',best_arr(2)%A, 'm^2'
print *,'The best deflection is: ', best_arr(2)%maximum_disp*1000.,'mm'
print *, 'The maximum normal stress is:',best_arr(2)%maximum_normal_stress/(10.**6),'MPa'
print *, 'The inner radius of the hollow circle is: ', best_arr(2)%d1*1000.,'mm'
print *, 'The outer radius of the hollow circle is:', best_arr(2)%d2*1000.,'mm'
print *, ' '

!Now to tackle the solid square profile

best_arr(3)%maximum_disp=0.

outer3: do i=20,500
do j=20,500
allocate(s%disp(6001))
allocate(s%moment(6001))

s%id=3
s%d1=real(i)/1000.
s%d2=real(j)/1000.
s%d3=0.
s%d4=0.
s%A=sq_area_s(s%d1,s%d2)
s%I=sq_I_s(s%d1,s%d2)
s%F_b=F_b_i
s%F_c=F_c_i

s%moment(1)=moment(0.,L,L_b_i,L_c_i,s%F_b,s%F_c,q)
s%disp(1)=find_disp(0.,L,L_b_i,L_c_i,s%F_b,s%F_c,q,s%I)

do k=2,6001
x=k/1000.
s%moment(k)=moment(x,L,L_b_i,L_c_i,s%F_b,s%F_c,q)
s%disp(k)=find_disp(x,L,L_b_i,L_c_i,s%F_b,s%F_c,q,s%I)
total=total+1
end do
s%maximum_normal_stress=(maxval(s%moment,1)*(s%d2/2.))/s%I
s%maximum_disp=maxval(s%disp,1)
if (s%maximum_disp>best_arr(3)%maximum_disp .and. s%maximum_disp<disp_all) then
best_arr(3)=s
end if
deallocate(s%disp)
deallocate(s%moment)
end do
end do outer3

print *, 'Properties of the best solid rectangle shape: '
print *, 'Moment of Inertia is: ',best_arr(3)%I, 'm^4'
print *, 'Area :',best_arr(3)%A, 'm^2'
print *,'The best deflection is: ', best_arr(3)%maximum_disp*1000.,'mm'
print *, 'The maximum normal stress is:',best_arr(3)%maximum_normal_stress/(10.**6),'MPa'
print *, 'The breadth of the solid rectangle is: ', best_arr(3)%d1*1000.,'mm'
print *, 'The height of the solid rectangle is:', best_arr(3)%d2*1000.,'mm'
print *, ' '

!Now to tackle the hollow rectangular profile

best_arr(4)%maximum_disp=0.

outer4: do i=10,99
do j=10,99
do k=i+1,100
do m=j+1,100
allocate(s%disp(6001))
allocate(s%moment(6001))

s%id=4

!inner dimensions
s%d1=real(i)/1000.
s%d2=real(j)/1000.

!outer dimension
s%d3=real(k)/1000.
s%d4=real(m)/1000.

s%A=sq_area_h(s%d1,s%d2,s%d3,s%d4)
s%I=sq_I_h(s%d1,S%d2,s%d3,s%d4)

s%F_b=F_b_i
s%F_c=F_c_i

s%moment(1)=moment(0.,L,L_b_i,L_c_i,s%F_b,s%F_c,q)
s%disp(1)=find_disp(0.,L,L_b_i,L_c_i,s%F_b,s%F_c,q,s%I)

do n=2,6001
x=n/1000.
s%moment(n)=moment(x,L,L_b_i,L_c_i,s%F_b,s%F_c,q)
s%disp(n)=find_disp(x,L,L_b_i,L_c_i,s%F_b,s%F_c,q,s%I)
total=total+1
end do

s%maximum_normal_stress=(maxval(s%moment,1)*(s%d4/2.))/s%I
s%maximum_disp=maxval(s%disp,1)

if (s%maximum_disp>best_arr(4)%maximum_disp .and. s%maximum_disp<disp_all) then
best_arr(4)=s
end if
deallocate(s%disp)
deallocate(s%moment)
end do
end do
end do
end do outer4

print *, 'Properties of the best hollow rectangle shape: '
print *, 'Moment of Inertia is: ',best_arr(4)%I, 'm^4'
print *, 'Area :',best_arr(4)%A, 'm^2'
print *,'The best deflection is: ', best_arr(4)%maximum_disp*1000.,'mm'
print *, 'The maximum normal stress is:',best_arr(4)%maximum_normal_stress/(10.**6),'MPa'
print *, 'The inner breadth of the solid rectangle is: ', best_arr(4)%d1*1000.,'mm'
print *, 'The inner height of the solid rectangle is:', best_arr(4)%d2*1000.,'mm'
print *, 'The outer breadth of the solid rectangle is: ', best_arr(4)%d3*1000.,'mm'
print *, 'The outer height of the solid rectangle is:', best_arr(4)%d4*1000.,'mm'
print *, ' '
print *, total,'Calculations done'

contains
elemental subroutine find_react(F_b,F_c,q,L,L_b,L_c)
real,intent(in) :: q,L,L_b,L_c
real,intent(inout) :: F_b,F_c
real :: detA
real :: A(2,2),A_v(2,2),x(2,1),B(2,1)

A(1,1)=((L_b**2)*((3*L_b)-L_b))/6.
A(1,2)=((L_b**2)*((3*L_c)-L_b))/6.
A(2,1)=((L_b**2)*((3*L_c)-L_b))/6.
A(2,2)=((L_c**2)*((3*L_c)-L_c))/6.

B(1,1)=(q*(L_b**2)*((6*(L**2))-(4*L*L_b)+(L_b**2)))/24.
B(2,1)=(q*(L_c**2)*((6*(L**2))-(4*L*L_c)+(L_c**2)))/24.

detA=1/(A(1,1)*A(2,2)-A(1,2)*A(2,1))

A_v(1,1)=+detA*A(2,2)
A_v(2,1)=-detA*A(2,1)
A_v(1,2)=-detA*A(1,2)
A_v(2,2)=+detA*A(1,1)

x=matmul(A_v,B)

F_b=x(1,1)
F_c=x(2,1)

end subroutine find_react

pure elemental real function find_disp(x,L,L_b,L_c,F_b,F_c,q,I) result(disp)
real,intent(in) :: x,L,L_b,L_c,F_b,F_c,q,I
real diq,dib,dic

if (x<=L_b) then
dib=((F_b*(x**2))*((3*L_b)-x))/(6*E*I)
else if (x>L_b) then
dib=((F_b*(L_b**2))*((3*x)-L_b))/(6*E*I)
end if

if (x<=L_c) then
dic=((F_c*(x**2))*((3*L_c)-x))/(6*E*I)
else if (x>L_c) then
dic=((F_c*(L_c**2))*((3*x)-L_c))/(6*E*I)
end if
diq=-((q*(x**2))*((x**2)+(6*(L**2))-(4*L*x)))/(24.*E*I)

disp=diq+dib+dic

end function find_disp

pure real function moment(x,L,L_b,L_c,F_b,F_c,q) result(total_moment)
real,intent(in) :: x,L,L_b,L_c,F_b,F_c,q
real :: mb,mc,mq

if (x<=L_b) then
mb=F_b*(L_b-x)
else if (x>L_b) then
mb=0
end if
if (x<=L_c) then
mc=F_c*(L_c-x)
else if (x>L_c) then
mc=0
end if
mq=-(q*((L-x)**2))/2

total_moment=mq+mb+mc
end function moment

pure real function circle_area_s(d1) result(area)
real,intent(in) :: d1

area=pi*(d1**2)
end function circle_area_s

pure real function circle_area_h(d2,d1) result(area)
real, intent(in) ::d1,d2

area=pi*((d2**2)-(d1**2))
end function circle_area_h

pure real function sq_area_s(d1,d2) result(area)
real,intent(in) :: d1,d2

area=d1*d2
end function sq_area_s

pure real function sq_area_h(d1,d2,d3,d4) result(area)
real,intent(in) :: d1,d2,d3,d4

area=(d3*d4)-(d1*d2)
end function sq_area_h

pure real function circle_I_s(d1) result(I)
real,intent(in) :: d1

I=(pi/4.)*(d1**4)
end function circle_I_s

pure real function circle_I_h(d2,d1) result(I)
real,intent(in) :: d1,d2

I=(pi/4)*((d2**4)-(d1**4))
end function circle_I_h

pure real function sq_I_s(d1,d2) result(I)
real,intent(in) :: d1,d2

I=(1/12.)*d1*(d2**3)
end function sq_I_s

pure real function sq_I_h(d1,d2,d3,d4) result(I)
real,intent(in) :: d1,d2,d3,d4

I=((1/12.)*d3*(d4**3))-((1/12.)*d1*(d2**3))
end function sq_I_h

end program bmsolve1

``````

The whole code would be helpful.

Can you use DO CONCURRENT? The 4 nested loops could become a single `do concurrent` construct, as shown here.

1 Like

The calculations are executed 16769025 times. Loops themselves, when I put a counter into the inner one (`cnt=cnt+1`) take 0.1 sec with all optimizations off (`-O0` option), so it is not loops what is slow.

1 Like

So youâ€™re saying I should only allocate and deallocate once out of the loops ?

So I switched to the ifort compiler, and did the allocate/deallocate relocation which gave me a good speed boost
I am gonna try the division bit and get back to you.

I will try this as well and get back to you

This would work, except I get an error saying

``````error #6695: The subscript or stride of a FORALL triplet specification must not reference a subscript name.   [I]
do concurrent(i=10:499,j=10:499,k=i+1:500,m=j+1:500,n=2:6001)
------------------------------------^

``````

The DO CONCURRENT construct has an optional MASK. Here is an example:

``````program do_concurrent
implicit none
integer, parameter :: n = 3
integer :: i,j,m(n,n)
m = 0
do concurrent (i=1:n,j=1:n,j<=i)
! fill the lower triangular part of m with i+j
m(i,j) = i+j
end do
do i=1,n
print "(*(i3))",m(i,:)
end do
end program do_concurrent
! output:
!   2  0  0
!   3  4  0
!   4  5  6
``````

I think you can replace

`do concurrent(i=10:499,j=10:499,k=i+1:500,m=j+1:500,n=2:6001)`

with

`do concurrent(i=10:499,j=10:499,k=11:500,m=11:500,n=2:6001,k>i .and. m>j)`

3 Likes

Thank you so much !

Do you really want all these values calculated or are you looking to minimize/maximize something?

1 Like

I basically want to find out the best cross-section under given load conditions. I have my deflection equations through super position, I am given a max deflection, beam length and location of supports. I just need to find the best cross-section, with a max deflection closest to 5 mm.

I think that you should look for an optimization problem solver that will zero in to the optimum solution.

2 Likes

oh ? wow ! I didnâ€™t know these existed !
Even so, I wanted to make something on my own by taking inspiration from the modern Fortran book by Milan Curcic

Thanks for these though.

Those can be used but are mostly Fortran 77 codes, as is much Netlib software. When looking for code for a numerical problem I look at Alan Millerâ€™s collection, John Burkhartâ€™s collection, or GitHub, where I have listed many codes by category here.

4 Likes

In a typical Non-Linear optimization Problem (NLP), you supply a Fortran function that evaluates the objective to minimize/maximize and any constraints (they may be numeric functions as well) and the solver does the work. You would still aim to make your Fortran evaluation function as efficient as possible as it gets called a lot of times (not as many as an exhaustive grid search though). Using derivative information is crucial to efficiency.

1 Like

I will try looking for something in the links posted above and will then post the updated code. Generally though I have to calculate the deflection curve for the entire beam and then find the max in that curve, then find the curve with a max closest to the allowed max.

@Aurelius_Nero, please consider restating the problem as an optimization problem with constraints, or as a CSAT problem (one where there are only constraints to satisfy, no objective function to minimize). You may find the thread titled Variable Number of Nested Loops, in the Intel Fortran Forum, worth reading.

1 Like

For a given beam with given loading conditions - the maximum deflection will be at the same place. Why are you looping over the whole length and finding moment and deflection at every cross section ?
If it is a simply supported beam with UDL - it will be the center. Looping over the whole length seems unnecessary.
For all the cases the program shows that - the maximum moment is at 1357mm and maximum deflection is at 3292 - 3296 mm. If it is a simply supported beam with UDL - both should be at the center.
I think your beam is 6 meters in length. It is having a UDL - what else is the loading and boundary condition ?
If you want to formulate this as a optimization problem - go with Section modulus approach.
Section modulus = I/y.
Once you optimize this - you can get the optimal cross section.

1 Like

You may be right. Iâ€™ll check out the link, thanks