Fortran code snippets

You can turn the style suggestions off, although not selectively yet (Style suggestions cannot be disable · Issue #2984 · lfortran/lfortran · GitHub).

However, that is why we made them: to motivate users to upgrade their codes to a consistent, modern style. It worked! Thank you, that is exciting.

2 Likes

Nice! This reminded me of a similar function aoc2023/7/aoc7_part1.f90 at 9edcf5f4387bad10d22f6153d339b0e7355f59af · ettaka/aoc2023 · GitHub that I implemented in aoc2023. The main difference is that you are using recursion (looks very pretty!).

After I posted it (and saw @Beliavsky 's comment) I immediately thought of different implementations of this functionality. I am even contemplating discussing and analysing them in a short note. While the implementation I posted is not optimal in terms of performance and may even be shortened, I do like the functional style :innocent:
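For context, here is a minimal sketch of the recursive idea being discussed (not the original snippet, just an illustration): keep the first element and recurse on the tail with all copies of that element removed.

module unique_mod
implicit none
contains
recursive function unique(x) result(u)
! return the distinct values of x, keeping first occurrences in order
integer, intent(in)  :: x(:)
integer, allocatable :: u(:)
if (size(x) == 0) then
   allocate(u(0))
else
   u = [x(1), unique(pack(x(2:), x(2:) /= x(1)))]
end if
end function unique
end module unique_mod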

3 Likes

The functional module from functional-fortran has set operations. The program

use functional
implicit none
integer, parameter :: a(*) = [1,3,3,5], b(*) = [9,6,3]
character (len=*), parameter :: fmt_i="(*(1x,i0))"
print fmt_i, complement(a,b)
print fmt_i, intersection(a,b)
print fmt_i, union(a,b)
print fmt_i, set(a)
end

gives output

 1 5
 3
 1 3 5 9 6
 1 3 5
4 Likes

That’s amazing! Thank you. I will definitely use that!

The fpm-ized Orderpack (M_orderpack) contains procedures for selecting the unique elements of an array, and a supplemental module, M_sets, has extensions that specifically emulate common set procedures via M_orderpack. It has confidence tests built in for use with “fpm test”. I was not aware of some of the listed resources; it is interesting to look at the other approaches and their supporting materials. I was wondering if anyone has other examples and/or use cases that could be incorporated into M_sets as well as being useful additional code snippets. fpm-specific code snippets would be useful to me, but perhaps they should be a feature of the WIP fpm repository.

2 Likes

I just realized that an internal write can be used to set not just a single character variable but an array of character variables. Section 12.4 of the Fortran 2023 interpretation document says

If the file is a scalar character variable, it consists of a single record whose length is the same as the length of the scalar character variable. If the file is a character array, it is treated as a sequence of character array elements. Each array element, if any, is a record of the file. The ordering of the records of the file is the same as the ordering of the array elements in the array (9.5.3.3) or the array section (9.5.3.4). Every record of the file has the same length, which is the length of an array element in the array.

Here is an illustrative program that runs with gfortran and ifort.

implicit none
integer, parameter :: n=3
integer :: i
character (len=2) :: s(n)
write (s,"('x',i0)") (i, i=1,n) ! use an implied do loop
print "(*(1x,a))",s ! x1 x2 x3
do i=1,n ! use a regular do loop
   write (s(i), "('x',i0)") i
end do
print "(*(1x,a))",s ! x1 x2 x3
end

Looking at that implementation I see that it is pretty much what I concocted :innocent:

What I thought of, in the way of writing a note, is analysing the various implementations:

  • Length of the code
  • Amount of memory used
  • Number of comparisons
  • Possibly other measures as well

Given the number of implementations out there, this could be a fun project with a high level of nerdiness.

Here’s a snippet I created for a Stack Exchange reply. The purpose is to form the product L L^T for a lower-triangular matrix stored in the packed storage format.

module matmult
implicit none
public
contains

! SLLTPM performs the update
!
!    B := A*A**T
!
! where B is an n by n symmetric matrix and A is an n by n
! lower-triangular matrix. Packed storage is used for both A and B.
subroutine slltpm(n,a,b)
    implicit none
    integer, intent(in) :: n
    real, intent(in) :: a(*)
    real, intent(out) :: b(*)

    integer :: i, j, k

    do j = 1, n
        b(id(j,j):id(n,j)) = 0
        do k = 1, j
            !$omp simd
            do i = j, n
                b(id(i,j)) = b(id(i,j)) + a(id(i,k))*a(id(j,k))
            end do
        end do
    end do
contains 
    integer function id(i,j)
        integer, intent(in) :: i, j
        id = i + (2*n-j)*(j-1)/2
    end function
end subroutine
end module

program main
use matmult, only: slltpm
implicit none

integer, parameter :: n = 4
real :: ad(n,n), bd(n,n)

integer, parameter :: np = n*(n+1)/2
real :: a(np), b(np)

! A := L
a = [real :: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

! B := AA^T
call slltpm(n,a,b)

call print_packed(n,a,"a (packed)")
call print_packed(n,b,"b (packed)")

call unpack(n,a,ad)
bd = matmul(ad,transpose(ad))

call print_dense(n,ad,"a (dense)")
call print_dense(n,bd,"b (dense)")

contains

    subroutine unpack(n,a,b)
        integer, intent(in) :: n
        real, intent(in) :: a(*)
        real, intent(out) :: b(n,n)
        integer :: id, i, j
        id(i,j) = i + (2*n-j)*(j-1)/2
        b = 0
        do j = 1, n
            do i = j, n
                b(i,j) = a(id(i,j))
            end do
        end do
    end subroutine

    subroutine print_packed(n,a,str)
        integer, intent(in) :: n
        real, intent(in) :: a(*)
        character(*), intent(in) :: str
        integer :: id, i, j
        id(i,j) = i + (2*n-j)*(j-1)/2
        print *, str
        do i = 1, n
            print *, a( [(id(i,j), j = 1, i)] )
        end do
    end subroutine

    subroutine print_dense(n,a,str)
        integer, intent(in) :: n
        real, intent(in) :: a(n,n)
        character(*), intent(in) :: str
        integer :: i, j
        print *, str
        do i = 1, n
            print *, (a(i,j), j = 1, n)
        end do
    end subroutine

end program
3 Likes

I have in fact been working on such a note. It is not online yet, but I hope to put it online some time this week. It has grown “organically” and my analyses are less than rigorous, but it was fun to do, and it includes six very different implementations.

Your code stores the packed array by columns. If you instead store the packed arrays by rows, then the code is much simpler, with sequential access to all the arrays. Here is an example:

program xxx
   implicit none
   integer, parameter :: wp = kind(1.0)
   integer, parameter :: n = 4, nn=(n*(n+1))/2
   real(wp), parameter :: small = 8.0_wp * epsilon(1.0_wp)
   real(wp) :: AP(nn), BP(nn), BPx(nn), A(n,n), B(n,n), diff
   integer :: i, j, ij, nerr
   character(*), parameter :: cfmt = '(2(a,i0),2(a,es0.4,1x))'
   
   call random_number( AP ); AP = AP - 0.5_wp

   ! compute the matrix product three ways, and compare.

   A = 0.0_wp
   ij = 0
   do i = 1, n
      do j = 1, i
         ij = ij + 1
         A(i,j) = AP(ij)
      enddo
   enddo
   B = matmul( A, transpose(A) )

   call slltpm( n, AP, BP )

   nerr = 0
   ij = 0
   do i = 1, n
      do j = 1, i
         ij = ij + 1
         diff = B(i,j) - BP(ij)
         if ( abs(diff) > small ) then
            nerr = nerr + 1
            write(*,cfmt) 'B(', i, ',', j, ')=', B(i,j), 'BP=', BP(ij)
         endif
      enddo
   enddo
   write(*,cfmt) 'BP nerr=', nerr

   call slltpmx( n, AP, BPx )
   nerr = count( abs(BP-BPx) > small )
   write(*,cfmt) 'BPx nerr=', nerr

contains

   subroutine slltpm( n, A, B )
      ! compute B = A . A^T where A is lower triangular and B is symmetric.
      ! both A and B are stored lower triangular packed by rows.
      implicit none
      integer, intent(in)   :: n ! matrix dimension.
      real(wp), intent(in)  :: A(*)
      real(wp), intent(out) :: B(*)

      integer :: i, j, ij, i0, j0

      ij = 0
      i0 = 0
      do i = 1, n
         j0 = 0
         do j = 1, i
            ij = ij + 1
            B(ij) = dot_product( A(i0+1:i0+j) , A(j0+1:j0+j) )
            j0 = j0 + j
         enddo
         i0 = i0 + i
      enddo
      return
   end subroutine slltpm

   subroutine slltpmx( n, A, B )
      ! compute B = A . A^T where A is lower triangular and B is symmetric.
      ! both A and B are stored lower triangular packed by rows.
      implicit none
      integer, intent(in)   :: n ! matrix dimension.
      real(wp), intent(in)  :: A(*)
      real(wp), intent(out) :: B(*)

      integer :: i, j, i0, j0

      do concurrent( i=1:n ) local(i0)
         i0 = (i*(i-1))/2
         do concurrent( j=1:i ) local(j0)
            j0 = (j*(j-1))/2
            B(i0+j) = dot_product( A(i0+1:i0+j) , A(j0+1:j0+j) )
         enddo
      enddo
      return
   end subroutine slltpmx

end program xxx

There are two versions of the subroutine: the first just increments the array offsets, demonstrating sequential access to all the arrays, while the second computes the offsets within the loops and uses do concurrent.

This convention of storing packed arrays by rows was more or less the standard convention in the EISPACK and LINPACK libraries.
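For reference, here is a tiny sketch (mine, not from either post) that prints both index maps for the lower triangle of an n by n matrix: the row-packed formula used in the code above, and the column-packed formula used in the earlier post (and in LAPACK's lower packed storage).

program packed_index_demo
implicit none
integer, parameter :: n = 4
integer :: i, j
print '(a)', '  i  j  row-packed  col-packed'
do i = 1, n
   do j = 1, i
      print '(2i3,2i12)', i, j, (i*(i-1))/2 + j, i + ((2*n-j)*(j-1))/2
   end do
end do
end program packed_index_demo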

What is the area of the Mandelbrot set?

Surprisingly, this is not known analytically, and it is quite difficult to estimate numerically, because the closer a point c is to the boundary of the Mandelbrot set, the more iterations are needed to determine whether the iteration z → z² + c remains bounded (equivalently, whether the corresponding Julia set is connected).

One of the best calculations so far is by Thorsten Förstemann (2012). Using 88 trillion points he found the area to be 1.5065918849(28).

Let’s try and get close with a little Fortran code:

program mandelbrot_area
use mpi_f08
implicit none
logical mp_mpi
integer np_mpi,lp_mpi,ierror,n,i,j
real(8) area
complex(8) c
! determine the number of MPI processes and the local process rank
call mpi_init(ierror)
call mpi_comm_size(mpi_comm_world,np_mpi,ierror)
call mpi_comm_rank(mpi_comm_world,lp_mpi,ierror)
! flag for master process
mp_mpi=.false.
if (lp_mpi == 0) then
  mp_mpi=.true.
  write(*,'("Number of MPI processes : ",I6)') np_mpi
! open file for writing area vs n
  open(50,file='MANDELBROT_AREA.OUT',form='FORMATTED')
end if
! loop over grid sizes
do n=2**12,2**20,2**10
  if (mp_mpi) write(*,'("Evaluating n = ",I8)') n
  area=0.d0
!$OMP PARALLEL DO DEFAULT(SHARED) &
!$OMP PRIVATE(c,j) REDUCTION(+:area) &
!$OMP SCHEDULE(DYNAMIC)
  do i=1,n
    if (mod(i-1,np_mpi) /= lp_mpi) cycle
    c%re=-2.d0+2.49d0*dble(i-1)/dble(n)
    do j=1,n/2
      c%im=1.15d0*dble(j-1)/dble(n/2)
      if (c%re**2+c%im**2 > 4.d0) cycle
      if (julia(c)) area=area+1.d0
    end do
  end do
!$OMP END PARALLEL DO
! add the areas from each MPI process and distribute
  if (np_mpi > 1) then
    call mpi_allreduce(mpi_in_place,area,1,mpi_real8,mpi_sum,mpi_comm_world, &
     ierror)
  end if
  area=4.d0*2.49d0*1.15d0*area/dble(n)**2
  if (mp_mpi) then
    write(*,'("Area of Mandelbrot set : ",G18.10)') area
    write(50,'(2G18.10)') 1.d0/dble(n),area
    flush(50)
  end if
end do
if (mp_mpi) close(50)
call mpi_finalize(ierror)

contains

pure logical function julia(c)
implicit none
complex(8), intent(in) :: c
integer i
complex(8) z
z=c
do i=1,n/8
  z=(((((((z**2+c)**2+c)**2+c)**2+c)**2+c)**2+c)**2+c)**2+c
  if (z%re**2+z%im**2 > 4.d0) then
    julia=.false.
    return
  end if
end do
julia=.true.
end function

end program

This uses MPI+OpenMP hybrid parallelism and will run on as many cores as you throw at it. I ran it on 4608 cores for about 40 minutes. I plotted the Mandelbrot area against 1/n, where n is the grid size along one axis.

Extrapolating to the continuum limit, my estimate for the area is 1.5065849, which differs from Förstemann’s result by 0.000007; not too bad.
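A minimal post-processing sketch of such an extrapolation, assuming a simple linear least-squares fit of area against 1/n over the MANDELBROT_AREA.OUT file written above:

program extrapolate_area
! fit area = a0 + a1*(1/n) to the (1/n, area) pairs in MANDELBROT_AREA.OUT
! and report the intercept a0 as the 1/n -> 0 (continuum) estimate
implicit none
integer, parameter :: nmax = 10000
real(8) :: x(nmax), y(nmax), sx, sy, sxx, sxy, a0, a1
integer :: m, ios
open(50,file='MANDELBROT_AREA.OUT',form='FORMATTED',status='OLD')
m = 0
do
  read(50,*,iostat=ios) x(m+1), y(m+1)
  if (ios /= 0) exit
  m = m+1
end do
close(50)
sx = sum(x(1:m)); sy = sum(y(1:m))
sxx = sum(x(1:m)**2); sxy = sum(x(1:m)*y(1:m))
a1 = (m*sxy - sx*sy)/(m*sxx - sx**2)
a0 = (sy - a1*sx)/m
write(*,'("Extrapolated area (1/n -> 0) : ",G18.10)') a0
end program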

Here’s a challenge for someone: convert the above code to run with Coarray Fortran instead of MPI+OpenMP.
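As a starting hint, the MPI pieces map almost one-to-one onto coarray features. A rough, untested sketch of the core calculation for a single grid size (file output and the OpenMP layer omitted), using this_image()/num_images() in place of the rank/size calls and the co_sum collective in place of mpi_allreduce:

program mandelbrot_area_caf
implicit none
integer, parameter :: n = 2**12
integer np_img,lp_img,i,j
real(8) area
complex(8) c
np_img = num_images()
lp_img = this_image()
area = 0.d0
do i = 1, n
  if (mod(i-1,np_img) /= lp_img-1) cycle
  c%re = -2.d0+2.49d0*dble(i-1)/dble(n)
  do j = 1, n/2
    c%im = 1.15d0*dble(j-1)/dble(n/2)
    if (c%re**2+c%im**2 > 4.d0) cycle
    if (julia(c)) area = area+1.d0
  end do
end do
! combine the counts from all images (replaces mpi_allreduce)
call co_sum(area)
if (lp_img == 1) then
  area = 4.d0*2.49d0*1.15d0*area/dble(n)**2
  write(*,'("Area of Mandelbrot set : ",G18.10)') area
end if

contains

pure logical function julia(c)
implicit none
complex(8), intent(in) :: c
integer i
complex(8) z
z = c
do i = 1, n/8
  z = (((((((z**2+c)**2+c)**2+c)**2+c)**2+c)**2+c)**2+c)**2+c
  if (z%re**2+z%im**2 > 4.d0) then
    julia = .false.
    return
  end if
end do
julia = .true.
end function

end program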

7 Likes

Thanks, very interesting!

I have found on that page the formula for its area:

$$A = \pi \left( 1 - \sum_{n=1}^{\infty} n \, b_n^2 \right)$$

where the $b_n$ are the coefficients of the Laurent series about infinity of the conformal map $\psi$ of the exterior of the unit disk onto the exterior of the Mandelbrot set.

These coefficients can be computed recursively, but a closed form is not known. Furthermore, the sum converges very slowly, so $10^{118}$ terms are needed to get the first two digits, and $10^{1181}$ terms are needed to get three digits.

:face_with_spiral_eyes:

1 Like

I computed the same Jacobian with the finite-difference method.
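The function itself is not shown here, so as a generic illustration only, a forward-difference Jacobian for a hypothetical function f mapping R^n to R^n (with a user-chosen step h) might look like this sketch:

module fd_jacobian_mod
implicit none
integer, parameter :: dp = kind(1.0d0)
contains
subroutine fd_jacobian(f, x, jac, h)
! forward-difference approximation: jac(:,j) ~ (f(x + h*e_j) - f(x)) / h
interface
   function f(x) result(y)
      import dp
      real(dp), intent(in) :: x(:)
      real(dp) :: y(size(x))
   end function
end interface
real(dp), intent(in)  :: x(:)
real(dp), intent(out) :: jac(size(x),size(x))
real(dp), intent(in)  :: h
real(dp) :: xp(size(x)), f0(size(x))
integer :: j
f0 = f(x)
do j = 1, size(x)
   xp = x
   xp(j) = xp(j) + h
   jac(:,j) = (f(xp) - f0)/h
end do
end subroutine fd_jacobian
end module fd_jacobian_mod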

John D. Cook:

The coupon collector problem says that for a set of N items, the expected number of draws [of one of the N items, with replacement] before you’ve seen everything once is approximately

N(log N + γ)

for large N. Here γ is Euler’s constant.
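As a side note (not from Cook’s post): the expected number of draws is exactly N times the N-th harmonic number, which is where the extra 0.5 in the code below comes from:

$$\mathbb{E}[T_N] = N\sum_{k=1}^{N}\frac{1}{k} = N H_N \approx N\ln N + \gamma N + \tfrac{1}{2}$$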

Here is a code to check this.

module coupon_mod
implicit none
private
public :: dp, avg_tries
integer, parameter :: dp = kind(1.0d0)
contains
function avg_tries(n, k) result(avg)
! simulate the average number of tries that it
! takes to collect all n coupons, using k trials
integer, intent(in) :: n, k
real(kind=dp) :: avg
real(kind=dp) :: rn
logical :: collected(n)
integer :: i, j, ncount, total_tries
total_tries = 0
do i = 1, k
   collected = .false.
   ncount = 0
   do while (ncount < n)
      call random_number(rn)
      j = int(rn*n) + 1
      if (.not. collected(j)) then
         ncount = ncount + 1
         collected(j) = .true.
      end if
      total_tries = total_tries + 1
   end do
end do
avg = total_tries / real(k, kind=dp)
end function avg_tries
end module coupon_mod
!
program coupon_collector
use coupon_mod, only: dp, avg_tries
implicit none
real(kind=dp) :: euler_masch = 0.577215664901532_dp
integer, parameter :: k = 10**5 ! number of simulations
integer :: n
real(kind=dp) :: sim_tries, theory_tries
call random_seed()
print "(*(a12))", "#coupons", "simulated", "theoretical", "sim/theory"
do n=2,1000 ! loop over # of unique coupons
   if (n > 10 .and. mod(n,100) /= 0) cycle
   sim_tries = avg_tries(n, k)
   theory_tries = n*log(real(n, kind=dp)) + euler_masch*n + 0.5_dp
   print "(i12,*(f12.4))", n, sim_tries, theory_tries, sim_tries/theory_tries
end do
end program coupon_collector

Sample output:

    #coupons   simulated theoretical  sim/theory
           2      2.9987      3.0407      0.9862
           3      5.4988      5.5275      0.9948
           4      8.3346      8.3540      0.9977
           5     11.4167     11.4333      0.9986
           6     14.7108     14.7139      0.9998
           7     18.1228     18.1619      0.9978
           8     21.7756     21.7533      1.0010
           9     25.4960     25.4700      1.0010
          10     29.3364     29.2980      1.0013
         100    518.7473    518.7386      1.0000
         200   1174.3304   1175.6066      0.9989
         300   1885.2814   1884.7994      1.0003
         400   2627.0973   2627.9721      0.9997
         500   3395.7917   3396.4119      0.9998
         600   4186.7526   4184.9872      1.0004
         700   4991.9206   4990.3072      1.0003
         800   5806.8269   5809.9619      0.9995
         900   6636.0467   6642.1494      0.9991
        1000   7481.0483   7485.4709      0.9994
1 Like

I’m actually teaching a 2-day course on MPI, Coarrays, OpenMP and do concurrent and could use another example/exercise if you don’t mind me borrowing this.

3 Likes

@everythingfunctional
I’m actually teaching a 2-day course on MPI, Coarrays, OpenMP and do concurrent and could use another example/exercise if you don’t mind me borrowing this.

By all means :slightly_smiling_face:

John D. Cook presented A very accurate logarithm approximation, based on the paper Empirical Formulae for Approximate Computation (1897) by Kellogg, which approximates the logarithm by a ratio of quadratic polynomials. I wonder if it has practical applications in programs where computing logs takes a substantial fraction of the total computation time. The code

module approx_mod
implicit none
integer, parameter :: dp = kind(1.0d0)
contains
elemental function log_approx(p, q) result(y)
real(kind=dp), intent(in) :: p, q
real(kind=dp)             :: y
real(kind=dp)             :: s, d
s = p + q
d = p - q
y = 6*d*s/(3*s**2 - d**2)
end function log_approx
end module approx_mod

program xapprox
use approx_mod, only: dp, log_approx
implicit none
integer :: i
real(kind=dp), parameter :: x(*) = 0.1_dp*[(i,i=11,1,-1)]
character (len=*), parameter :: fmt_r = "(a8, *(1x,f9.6))"
print fmt_r,"x", x
print fmt_r,"log(x)", log(x)
print fmt_r,"approx", log_approx(x,1.0_dp)
print fmt_r,"diff", log(x) - log_approx(x,1.0_dp)
end program xapprox

gives

       x  1.100000  1.000000  0.900000  0.800000  0.700000  0.600000  0.500000  0.400000  0.300000  0.200000  0.100000
  log(x)  0.095310  0.000000 -0.105361 -0.223144 -0.356675 -0.510826 -0.693147 -0.916291 -1.203973 -1.609438 -2.302585
  approx  0.095310  0.000000 -0.105360 -0.223140 -0.356643 -0.510638 -0.692308 -0.913043 -1.192140 -1.565217 -2.106383
    diff  0.000000  0.000000 -0.000000 -0.000003 -0.000032 -0.000187 -0.000839 -0.003247 -0.011833 -0.044221 -0.196202

This holds only in a small region; if you are outside that region, you have to scale the argument, and it may well be that the computational efficiency advantage is then lost.
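For what it is worth, a minimal sketch of that scaling (reusing log_approx from the module above): split x with the FRACTION and EXPONENT intrinsics, so that x = fraction(x) * 2**exponent(x) with fraction(x) in [0.5, 1), and only the fractional part needs the rational approximation.

module approx_scaled_mod
use approx_mod, only: dp, log_approx
implicit none
contains
elemental function log_approx_scaled(x) result(y)
! log(x) = exponent(x)*log(2) + log(fraction(x)), with fraction(x) in [0.5, 1)
real(kind=dp), intent(in) :: x
real(kind=dp)             :: y
real(kind=dp), parameter  :: log2 = 0.693147180559945_dp
y = exponent(x)*log2 + log_approx(fraction(x), 1.0_dp)
end function log_approx_scaled
end module approx_scaled_mod

The error is then bounded by that of the base approximation on [0.5, 1), roughly 8e-4 near 0.5 according to the table above.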

Some time ago I mentioned that I was working on a memo about different implementations of a simple-looking programming problem: determining the unique elements of an array. This memo is now available at GitHub - arjenmarkus/memos-on-programming: Memos on programming problems that caught my eye. Whether it is of any use, I do not know, but it was fun to write :slight_smile:

1 Like