I seem to be having trouble and I’m wonding if anyone could help me…
I’m at a stand still until someone or some tutorial out there can help me.
I’m running on windows10 in the WSL. I’ve got caf compiling and cafrun works, but the coarrays just do not work.
Any ideas?
Example;
! Program Paratest
! Written by Frank Meyer
! Created Jan 9, 2022
! Version 0.1a
! Description: Testing some sync issues.
program Paratest
implicit none
! Create co-arrays for testing
integer :: i_test1[*]
! Other variables for testing purposes.
integer :: i_loop1, i_loop2, i_me, i_all
! Set variable for work
i_loop1 = 1
i_loop2 = 2
i_me = this_image()
i_all = num_images()
print *, i_me, " says; ", i_loop1 ! Does printing work? Good...
sync all
i_loop2 = i_me + 1 ! Can we access the coarray?
if (i_loop2 .gt. i_all) then
i_loop2 = 1
end if
i_test1[i_loop2] = i_me ! Put our number over one.
sync all
print *, i_me, " now says;", i_test1 ! Did it get here?
do i_loop1 = 1, i_all ! Add a dynamic value to it.
i_test1[i_loop1] = i_test1[i_loop1] + 1
end do ! Equiv to i_test1 += i_all
sync all
print *, i_me, " finally says;", i_test1 ! See the failure...
end program Paratest
I think there are two errors: One is the placement of the first sync all and the other is non-use of atomic intrinsic atomic_add for the +1 operation.
Your first sync does nothing useful. Your second is before the “now says”. As soon as images get past that sync, some other image can change some image’s i_test1 (by rushing ahead to i_loop1) so the print shows the changed value.
Try only having “sync all” around the last two prints (the final one can is not needed, of course) and call atomic_add.
This is your problem. You’re having each image update every value, with no synchronization and not atomically. Thus, some values may not be correctly updated.
Also, since there is no barrier between this loop and the previous print statement, some images may be updating the value before other images have a chance to print it.
The suggestions from others should give some clues as to how to fix these problems.
WOW! That oneAPI was impossible to setup! Is there a smaller group of installs available? Something less than 50? Or maybe an install.sh to do it all automatically?
Well… The sync all’s do have a reason, the first one keeps the print messages from the front aligned, the second after rotating everything helps those changes stay aligned, and the last should be all we need to get the last prints to be correct… And adding a sync all in the middle of the loop actually did nothing.
But I’m curious; what’s the code for atomic_add look like?
If your interest is only in Fortran, you can try proceeding with just Intel’s HPC toolkit, that shouldn’t be too difficult to setup and give you the IFORT compiler (and IFX, a beta version of Intel’s LLVM-based compiler): Download the Intel® HPC Toolkit
program Paratest
use iso_fortran_env , only: atomic_int_kind, lock_type
implicit none
integer(atomic_int_kind) :: i_test1[*]
Type(lock_type) :: test_lock[*]
integer :: i_loop1, i_loop2, i_me, i_all
! Set variable for work
i_loop1 = 1
i_loop2 = 2
i_me = this_image()
i_all = num_images()
print *, i_me, "a says; ", i_loop1 ! Does printing work? Good...
! sync all ! original, useless because of next sync
i_loop2 = i_me + 1 ! Can we access the coarray?
if (i_loop2 .gt. i_all) then
i_loop2 = 1
end if
i_test1[i_loop2] = i_me ! Put our number over one.
sync all ! original, must wait for i_test1 to be updated by other image
print *, i_me, "b says;", i_test1 ! Did it get here?
sync all ! needed, stops other images rushing ahead and changing
! i_test1 before we print it above
do i_loop1 = 1, i_all ! Add a dynamic value to it.
!!$ !Alt critical
!!$ critical
!!$ i_test1[i_loop1] = i_test1[i_loop1]+1
!!$ end critical
!!$ !Alt atomic
!!$ call atomic_add( i_test1[i_loop1] ,1)
!!$ !Alt locks
!!$ Lock(test_lock[i_loop1])
!!$ i_test1[i_loop1] = i_test1[i_loop1]+1
!!$ Unlock(test_lock[i_loop1])
!!$ !Alt race condition
!!$ i_test1[i_loop1] = i_test1[i_loop1]+1
end do ! Equiv to i_test1 += i_all
sync all ! original, needed, otherwise i_test1 will be printed
! before all images have finished updating it
print *, i_me, "c says;", i_test1 ! See the failure...
end program Paratest
The website for oneAPI does look great, but the instructions are not download/install/done… If I can’t get this to work any other way I’ll look into it, but it did seem to require a rather large amount of requirements and dependencies. As it is, I think I’ve got the WSL and cygwin compiling/running now and they both have the same behavior. Obviously it is my code that is flawed, just how to fix it will be the issue.
So, not sure why this comment was erased, but; redoing it.
My answer seems to be to add a sync all after the second round of printing to stop the early error in the second round of testing, and to use atomic_add statements to perform the changes for the last round of testing. This has worked perfectly.
Due to issues cutting and pasting to this forum, I’ll add the code as another reply.
! Program Paratest
! Written by Frank Meyer
! Created Jan 9, 2022
! Version 0.2a
! Description: Testing some sync issues.
program Paratest
implicit none
! Create co-arrays for testing
integer :: i_test1[*]
! Other variables for testing purposes.
integer :: i_loop1, i_loop2, i_me, i_all, i_locked
integer :: i_images(2)
! Set variable for work
i_loop1 = 1
i_loop2 = 2
i_me = this_image()
i_all = num_images()
i_images(1) = i_me
print *, i_me, " says; ", i_loop1 ! Does printing work?
sync all
i_loop2 = i_me + 1 ! Can we access the coarray?
if (i_loop2 .gt. i_all) then
i_loop2 = 1
end if
i_test1[i_loop2] = i_me ! Put our number over one.
sync all
print *, i_me, " now says;", i_test1 ! Did it get here?
sync all
do i_loop1 = 1, i_all ! Add a dynamic value to it.
call atomic_add(i_test1[i_loop1], 1)
end do ! Equiv to i_test1 += i_me
sync all
print *, i_me, " finally says;", i_test1 ! See the Success (values = i_me-1+16)...
end program Paratest
For another task, it looks like the third method might work, if it does then that is the last key to the puzzle! Interesting that you test the lock on a non-critical variable to prove the ability to reach the critical one… I will have to study this!
I have no experience with Windows/WSL, so I do not know how linux software is installed there. In regular Linux environment, both RHEL-like and Ubuntu/Debian-like, you can add an Intel oneAPI repository to teh system and then try to install just the fortran compiler. All needed extra packages will be automatically installed as dependencies.
Yeah… I’m running this in windows right now and WOW! to get CAF installed was VERY confusing. It looked like I needed the core components (with dependencies) then at least two other products (with different dependencies)… The website LOOKED nice, but was very unclear as to what did or did not have CAF or was compatible to what I was looking for. Your mileage my vary.
CAF comes (used to come) with opencoarrays which are used for gfortran. With Intel’s oneAPI ifort you don’t need CAF.
You are right, however, that CAF has become hard to install on Linuxes. Fedora and other RHEL-like distros abandoned opencoarrays quite a while ago. Ubuntu had open-coarrays-bin package until 18.04 but in 20.04 it is not available anymore. Only libcoarrays-* packages left but you have to know how to compile to use them.
As it turns out; I’ve have very good success with caf under WSL (a few issues with the version of Ubuntu that gets installed missing… everything! … and needing to be pointed to a better repository, updated, upgraded, updated, and upgraded AGAIN!). The install of CAF was instantaneous once I started using HomeBrew!
But the issue here is that CAF does NOT push to other images reliably. You CAN pull from another image with good reliability so long as you do so with a type that has a long integer as an iteration counter to compare with the iteration of this image. That way you can be sure that the data being pulled is good.
I have a parrallel sort program that does this and it’s proven to work reliably.
From my own experiences, ‘unreliable’ data transfer through coarray put operations are common if the SPMD coarray runtime (i.e. the runtime of the current coarray team) got ‘destroyed’ by violating the execution segment ordering in that coarray team, i.e. some programming mistake with a synchronization or atomic operation. Any following data transfer or synchronization may or may not succeed, the ‘destroyed’ runtime becomes unreliable.
The only way, that I am aware of, to recover from a ‘destroyed’ coarray runtime is to leave the current coarray team (END TEAM) and then to re-enter (CHANGE TEAM) the coarray team, if this is still possible. Of course this makes no sense if the execution segment ordering is violated again and again, it should not be violated at all.