Welcome to the forum and thank you for your great effort and wonderful dialogue with the special functions. Your work with Bessel functions is most impressive and it really showcases how highly talented computational mathematicians can use Julia, which is incorporating into the platform itself a tremendous amount of desired computational language facilities that will be of particularly of interest and delight to those in academia and research - to advance fundamental sciences.
Generally in the Fortran Community, you will find the folks will appreciate if the microbenchmarks are performed thoroughly and cleanly in an easily comprehensible manner across processors and platforms so that the readers can make proper apples-to-apples comparisons. Also, to get into discussions that strive to explain the algorithmic advances that yield computational benefits with special function evaluations, etc⦠This is as opposed to mere marketing materials that instead try to make blanket statements around language superiority (Julia 10x better than xxx, etc.)
In terms of Fortran evaluations, you have likely already realized one has to generally take good care to βwire inβ the needed instrumentation to attempt computing performance evaluations and would need to keep in mind several aspects:
-
There are several targeted compiler implementations toward multiple platforms including commercial ones and the performance mileage can vary significantly across implementations.
gfortran
is one among several and thus gfortran \neq Fortran. -
Fortran has an ISO IEC standard and the use of standard-conforming code across compilers and platforms can be a preferred way to level-set the code used toward benchmarks.
-
With microbenchmarks in sequential computations,
CPU_TIME
in the Fortran standard may not always provide the required resolution andSYSTEM_CLOCK
can be more revealing. -
You will know well as the data sizes used toward function evaluations increase, chances are high the system times can be dictated by data memory access and read/write. With Fortran processors, there can be large variances in this the memory data aspect across different compilers resulting in different performance behavior that overwhelm any of the computational aspects around function evaluations.
-
Simple-minded considerations of stats with a certain number of trial evaluations in the same run with the exclusion of the best and worst performance toward the determination of the βmeanβ helps with a compiler-based program execution of test cases.
With the above and a few points in mind, here is an edited version of your code you may want to review:
Click to see code
integer, parameter :: Float32 = selected_real_kind( p=6 )
integer, parameter :: Float64 = selected_real_kind( p=12 )
integer, parameter :: WP = Float64
! Program data
real(WP), parameter :: ONE = 1.0_wp
integer, parameter :: N = 1000000
real(WP), allocatable :: v(:), r(:)
integer, parameter :: NTRIALS = 12
real(Float64) :: Calc_Times(NTRIALS)
real(Float64) :: Mean_Time
real(WP) :: rsum(NTRIALS)
real(WP) :: stddev
allocate( v(N), r(N) )
call random_number( v )
v = v * 200.0_wp
! Set first element to a reference value for a numerical check
v(1) = ONE
call calc( Calc_Times, rsum )
Mean_Time = ( sum(Calc_Times) - minval(Calc_Times, dim=1) - maxval(Calc_Times, dim=1) ) / (NTRIALS-2)
print *, "BESSEL_J1 Evaluation: Mean compute time for ", N, " evaluations:"
print *, Mean_Time, " seconds."
stddev = norm2( rsum - sum(rsum)/NTRIALS )
print *, "Standard deviation in calculated results: ", stddev
contains
impure elemental subroutine calc( t, sumr )
! Argument list
real(Float64), intent(inout) :: t
real(WP), intent(inout) :: sumr
! Local variables
real(Float64) :: t0
real(Float64) :: t1
call cpu_t( t0)
r = bessel_j1( v )
call cpu_t( t1 )
t = t1 - t0
! Check calculation result
if ( abs(r(1) - 0.44_wp) > 1E-3_wp ) then
print *, "BESSEL_J1( x=1 ): ", r(1), " vs an expected value of approximately ", 0.44_wp
error stop "Unexpected result with x = 1.0: "
end if
sumr = sum( r )
end subroutine
subroutine cpu_t( time )
use, intrinsic :: iso_fortran_env, only : I8 => int64
!.. Argument list
real(Float64), intent(inout) :: time
!.. Local variables
integer(I8) :: tick
integer(I8) :: rate
time = 0.0_Float64
call system_clock (tick, rate)
time = real(tick, kind=kind(time) ) / real(rate, kind=kind(time) )
return
end subroutine
end
C:\temp>gfortran -O3 p.f90 -o p.exe
C:\temp>p.exe
BESSEL_J1 Evaluation: Mean compute time for 1000000 evaluations:
4.9179059994639826E-002 seconds.
Standard deviation in calculated results: 0.0000000000000000