If they don’t want to use Fortran, let them, but this evaluation is not proper work and should not be used by anyone as an excuse for rejecting Fortran. The authors completely ignore the coarray runtime (comprising coarrays and collective subroutines for communication), which should be expected to be Fortran’s core component for (upcoming) heterogeneous programming in pure Fortran.
They don’t mention coarrays at all, which leaves the really bad taste that they only want to justify a decision against Fortran that was already made, no matter what. But then the only obvious reason for publishing it would be to fool others, and then even the sections with ‘Contrasting Reviewer Judgments’ look like an attempt at (self-)delusion.
To put it crystal clear: it is just way too early to evaluate what exactly heterogeneous programming will look like in the coming years. Personally, I am still learning a lot from SYCL/DPC++, and I am using it to apply programming techniques with Coarray Fortran. But for now I can only assume that this will really work on future heterogeneous systems. Approaching heterogeneous computing with any programming technique must be done gradually, in small steps, but we can’t just sit and wait anymore these days.
The main problem, as I currently see it, could be a limit on what implementers (of any parallel programming language or framework) can really do. I see this for Coarray Fortran, and I believe the authors of the DPC++ book see it very similarly for SYCL/DPC++. See, for example, chapter 19 of the book, the explanations of memory models, especially pages 496 and 504-506:
Memory Model and Atomics | SpringerLink.
This is very relevant to Coarray Fortran as well, and we can already apply some of the relevant programming techniques. The authors of ‘Modern Fortran explained’ declared the Fortran 2008 SYNC MEMORY statement, as well as ATOMIC_DEFINE and ATOMIC_REF, to be deprecated features (section A.9.1 in the book). I can understand why, but I still think this is a mistake: in my (experimental) programming practice, only a single ATOMIC_DEFINE, a single ATOMIC_REF, and a single atomic coarray variable are required. The authors make this statement (on page 452): “We see the construction of reliable and portable code in this way as very difficult – it is all too easy to introduce subtle bugs that manifest themselves only occasionally.” In my practice it is quite the contrary: atomic_define and atomic_ref are the only fully reliable technique for detecting a failure (i.e. a missing atomic operation) across coarray images. (I am using additional checks for non-atomic coarrays as well, but these are never 100% reliable.)
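To make the pattern concrete, here is a minimal sketch of signalling between two images through one atomic coarray variable. It is not the code from my repository: the names `flag` and `seen` and the value `1` are made up for illustration, and the blocking spin loop is only the simplest way to show the mechanism (a non-blocking variant would do one ATOMIC_REF per visit and return to other work). It assumes a compiler with coarray support, e.g. gfortran with `-fcoarray=lib` plus OpenCoarrays, or Intel with `-coarray`.

```fortran
program atomic_flag_sketch
  use, intrinsic :: iso_fortran_env, only: atomic_int_kind
  implicit none

  ! The single atomic coarray variable (name is hypothetical).
  integer(atomic_int_kind) :: flag[*]
  integer(atomic_int_kind) :: seen

  flag = 0
  sync all   ! every image has initialised its flag before anyone signals

  if (num_images() < 2) then
    print *, 'run with at least 2 images'
  else if (this_image() == 1) then
    ! Sender: atomically write into the flag that lives on image 2.
    sync memory
    call atomic_define(flag[2], 1_atomic_int_kind)
  else if (this_image() == 2) then
    ! Receiver: poll the local flag atomically until the signal shows up.
    do
      call atomic_ref(seen, flag)
      if (seen == 1) exit
    end do
    sync memory
    print *, 'image 2 received the signal'
  end if
end program atomic_flag_sketch
```

The SYNC MEMORY statements only matter if ordinary (non-atomic) coarray data is supposed to become visible together with the flag; for the bare signal they could be dropped.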
Two years ago, @rouson made a statement here: Learning coarrays, collective subroutines and other parallel features of Modern Fortran - #29 by rouson
From a plain Coarray Fortran language perspective this is certainly a valid statement: “Let the compiler do its job.” (This is a quote from his quote.)
So far I have used a (single) atomic coarray to implement customized, non-blocking, lightweight synchronization that allows for (single-image) asynchronous coroutines in Fortran (still experimental): GitHub - MichaelSiehl/Spatial_Fortran_1.
I have recently started experimenting with COLLECTIVE SUBROUTINES myself and was soon able to achieve results similar to those of the coarray approach described in the GitHub repository above. (It requires three different kernel types: a single-task kernel, a multi-task kernel, and an all-image kernel that allows execution of co_broadcast, which must always be called on all coarray images.) It works (on a CPU), but for this to work there is certainly an implicit synchronization on the receiving side. I have not tried it out yet, but this should rule out (single-image) asynchronous execution. If my assumption is correct, the statements made two years ago by @rouson could turn out to be no longer valid, because single-image asynchronous code execution (the customized coarray approach) could well be a key to keeping coarray images busier than collective subroutines with an implicit synchronization (on the receiving side) can.
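For comparison, a minimal sketch of the collective side. The variable `task_id` and the value 42 are made up for illustration; this is not one of my kernels. The point is only that the call to co_broadcast has to appear on every image, and that a receiving image cannot continue with the value before it has arrived, which is the implicit wait I am talking about.

```fortran
program co_broadcast_sketch
  implicit none

  ! Hypothetical payload; it does not need to be a coarray.
  integer :: task_id

  task_id = 0
  if (this_image() == 1) task_id = 42   ! only the source image sets the value

  ! Must be executed by all images, not just sender and receiver.
  call co_broadcast(task_id, source_image=1)

  ! After the call every image holds the value from image 1.
  print *, 'image', this_image(), 'got task', task_id
end program co_broadcast_sketch
```

Contrast this with the atomic approach, where only the two images involved touch the flag and the receiver decides for itself when to look.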
Of course, all of this is ongoing and preliminary work. I think it is simply too early to make any valid evaluation.