I am currently learning coarrays and would like to share my first experiments:
In the repository, you will find a very simple algorithm that approximates \pi with a Monte Carlo method (a very inefficient way to compute \pi, but a very efficient way to burn CPU!), in several versions:
- A serial version of the algorithm.
- A parallel version using OpenMP.
- A parallel version using coarrays.
- Another coarray version that steadily prints intermediate results.
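
For readers who have not met the method before, here is a minimal serial sketch of the idea. It is written in Python rather than Fortran and is only an illustration, not the code from the repository. It assumes the usual formulation: draw random points in the unit square and count those that fall inside the quarter disc. The names `estimate_pi`, `n_samples` and `seed` are hypothetical, chosen for this sketch.

```python
import random


def estimate_pi(n_samples: int, seed: int = 0) -> float:
    """Estimate pi from uniform random points in the unit square."""
    rng = random.Random(seed)  # seeded generator, so runs are reproducible
    inside = 0
    for _ in range(n_samples):
        x = rng.random()
        y = rng.random()
        # The point lies inside the quarter disc of radius 1
        if x * x + y * y < 1.0:
            inside += 1
    # Area ratio quarter disc / unit square is pi / 4
    return 4.0 * inside / n_samples


if __name__ == "__main__":
    print(estimate_pi(1_000_000))
```

A parallel version would typically give each thread or image its own share of the samples and its own random stream, then sum the `inside` counts at the end.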
My first benchmark on a 2-core / 4-thread CPU yields:

| Version | gfortran | ifort |
|---|---|---|
| Serial | 19.9 s | 34.8 s |
| OpenMP | 9.9 s | 93.0 s |
| Coarrays | 16.2 s | 14.4 s |
| Coarrays steady | 33.2 s | 35.9 s |
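
To make the comparison easier to read, the timings can be turned into speedups relative to the serial run of the same compiler. This is a small sketch that assumes the first timing column is gfortran and the second is ifort, as the remarks that follow suggest. The names `timings` and `speedups` are hypothetical.

```python
# Wall-clock times in seconds, copied from the benchmark table:
# (first column, assumed gfortran; second column, assumed ifort)
timings = {
    "Serial": (19.9, 34.8),
    "OpenMP": (9.9, 93.0),
    "Coarrays": (16.2, 14.4),
    "Coarrays steady": (33.2, 35.9),
}


def speedups(table, baseline="Serial"):
    """Return serial_time / version_time per compiler, rounded to 2 digits."""
    base = table[baseline]
    return {
        name: tuple(round(b / t, 2) for b, t in zip(base, times))
        for name, times in table.items()
    }


if __name__ == "__main__":
    for name, (first_col, second_col) in speedups(timings).items():
        print(f"{name:16s} {first_col:5.2f} {second_col:5.2f}")
```

A value above 1 means the version is faster than the serial run of the same compiler, and a value below 1 means it is slower.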
First, concerning the gfortran results:
- I am surprised by the difference between OpenMP and coarrays (in both cases there was 4
- And by the cost of steadily printing intermediate results (just 20 times).
Concerning ifort, which I am not familiar with:
- I don’t understand why the results are so bad with the serial version, while they are a little better than gfortran with coarrays.
- And when I use ifort with `-qopenmp`, I see 4 `a.out` executables, but they use only 45% of the CPU, and the results are catastrophic.
Any help and comments are welcome!
And I hope this post and the repository will help other people interested in learning coarrays.