Tips to make this toy program faster?

For the following loop:

 ll = 0.0
  do i=1,n
    z = (x(i) - mu)/sigma
    ll = ll - 0.5*z**2 - log(sigma*sqrt(2*pi))
  end do

the scalar temp z would almost certainly be eliminated by the compiler, so there is no harm in writing it this way. The loop-invariant scalar log(sigma(sqrt(2*pi) will most likely be computed only once before the loop starts and the resulting internal temp used in the loop. My only real objection is the variable name ll - I tend to avoid lowercase L (or an uppercase O) for variables since they look too much like numbers in some fonts. There is no ambiguity in this case - just a style issue.

I tried the original code on an Intel Broadwell processor (one core) with this main program:

program test
  implicit none
  integer,parameter :: n = 1000000
  double precision :: x(n),mu,sigma,nll
  integer(8) :: t1,t2
  real(8)    :: rate, time

  call random_number(x)
  x = 12.0d0*x - 6.0d0
  mu = sum(x)/n
  sigma = sqrt(sum(x-mu)**2/n)

  call system_clock(t1, count_rate = rate)
  call negloglik(x,n,mu,sigma,nll)
  call system_clock(t2)
  time = (t2-t1)/rate
  print *, "Time = ",time," sec."
  print *, "mu    = ",mu
  print *, "sigma = ",sigma
  print *, "nll   = ",nll
end program test

and got the output

 Time =  5.90489042675893883E-4  sec.
 mu    =  1.23783794657927361E-3
 sigma =  7.89020759839331797E-12
 nll   =  9.64039973146251418E+28

Changing the subroutine to

subroutine negloglik(x,n,mu,sigma,nll)
  implicit none
  integer,intent(in) :: n
  double precision, intent(in) :: x(n), mu, sigma
  double precision, intent(out) :: nll

  nll = sum(0.5d0*((x-mu)/sigma)**2) + n*log(sigma*sqrt(2*acos(-1.0d0)))

end subroutine negloglik

and the same main program results in a smaller time:

 Time =  2.69306805074971179E-4  sec.
 mu    =  1.23783794657927361E-3
 sigma =  7.89020759839331797E-12
 nll   =  9.6403997314624702E+28

(default compiler options, Cray/HPE compiler ftn)

[Apparently the text system here mutilates input, for example to eliminate the “*” between sigma and sqrt. (and to remove the indentation). ]

Edited: Format code blocks for readability. LK

Happily, both seem the same in gfortran:

program pi
    use iso_fortran_env, only: wp=>real128
    implicit none
    real(wp), parameter :: pi1 = 4.0_wp * atan(1.0_wp)
    real(wp), parameter :: pi2 = acos(-1.0_wp)

    print *, 1.0_wp/(pi1-pi2), 4.0_wp * atan(1.0_wp)-acos(-1.0_wp)
end program pi
$ gfortran -Wall -Wextra -std=f2018 -pedantic pi.f90 && ./a.out

   11 |     print *, 1.0_wp/(pi1-pi2), 4.0_wp * atan(1.0_wp)-acos(-1.0_wp)
      |                   1
Error: Division by zero at (1)

It brings an interesting question: why \pi is not defined in the Fortran standard? I guess there is a profound (transcendental?) numerical reason if a scientific language refuses to define \pi for 65 years… The situation is the same in C language: some standard libraries define a MATH_PI constant in math.h, but it is not in the standard.

Hi @billlong, the Discourse uses markdown formatting which is what has caused your asterisk to disappear. Of most use is the code block, denoted by triple backticks ( ` ) which will print your literal text as monospace, which aids in readability. I have updated your post to use code blocks.

1 Like

A module defining pi and some physical constants, used in a large program so hopefully correct, is here.

1 Like

About Physical constants, It is not so easy. They are not so constant! See NIST web page:
NIST constants.

:bulb: We could imagine a script which would load the latest values:
and transform automatically that .txt into a Fortran module.

1 Like

Yes you are right.
Also, we need to have the possibility to recover the previous version.

Yes, previous versions are still online, for example:

The difficulty would be to validate the result. A solution:

  1. Generate the Fortran module using a Python or Perl script, where all would be stored (names, values, uncertainties, units),
  2. Write a Fortran program which generates another allascii.txt file,
  3. Make a diff to verify they are identical.

One problem is that some exact values, computed from another exact value, are given with a limited number of digits:
Boltzmann constant in eV/K 8.617 333 262... e-5
Those ... are annoying…
And probably, managing the number of digits to print for each constant. Not so easy.

I know about the versions, I’ve got 3 in my code.
About the number of digits:
While reading the constants, one can read also the number of digits + the exponent
Then, it will be more easy to write the constants in exactly the same format.

About, the Boltzman constant, R.
molar gas constant 8.314 462 618… (exact) J mol^-1 K
One can calculate its value, instead of using the reading one. It is the reason, they give R with “…”

1 Like

So simple… :sweat_smile:

Then, the units (with metric prefix) have to be read and probably converted.

I would put the timing code around the function, otherwise the differences can get swamped by the time to create the random numbers (which you want to compute only once, anyway).

I think you need

Yes, I struggled to try to parallelize the loop with OpenMP, until I realize my parallelization was just working: on a slow PC, I saw the effect. On my fast i7, I thought I was unable to use correctly OpenMP, although I already use it in a model.
I guess the main program is eating an important part of the CPU.

I now have a short dirty Python script doing the basic part of the job, just downloading the latest values, extracting the values of the 354 constants and generating a simple module:

module NIST_physical_constants
  use, intrinsic :: iso_fortran_env, only: wp=>real64
  implicit none

  real(wp), parameter :: alpha_particle_electron_mass_ratio = 7294.29954142_wp   !
  real(wp), parameter :: alpha_particle_mass = 6.6446573357e-27_wp   !kg
  real(wp), parameter :: Wien_wavelength_displacement_law_constant = 2.897771955e-3_wp   !m K
  real(wp), parameter :: W_to_Z_mass_ratio = 0.88153_wp   !
end module NIST_physical_constants

A little test program:

$ ./build/nist.out
 Do you know this constant with so many zeros?
 Oh sorry, you're that kind of biped with 10 digits?
 Some other constants:

I will improve it before pushing on GitHub and posting a new post in the Discourse for further collaboration.


@vmagnin Have a look at QCElemental/ at master · MolSSI/QCElemental · GitHub for accurately parsing different versions of the NIST codata files.

1 Like

Much less fun if everything has already been done…
They extract data here:
using the pandas library: Getting started — pandas 1.2.3 documentation
They generate a JSON file.
They manipulate the data here: QCElemental/ at master · MolSSI/QCElemental · GitHub
and generate a C header file.


Maybe we can contribute a Fortran export to qcel?



I find it curious that the periodic table data are obtained from NIST and not directly from IUPAC data bank