How to launch a coarray program on one cluster node with SLURM?

I am learning to use coarrays on single-node clusters, typically with 24 cores. I can run tests on an interactive machine with 10 dual-thread cores, and it works fine with:

$ module load intel/2024/compilers
$ ifx -Ofast -coarray ppm_coarray_buddhabrot.f90 && ./a.out

But I would also like to learn how to launch coarray jobs with the SLURM workload manager. I tried the following job.slurm script, but I think it ran on only one core:

#!/bin/bash

#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=24
#SBATCH --time=02:06:00
#SBATCH --job-name=buddhabrot
#SBATCH --mem=1024M

module load intel/2024/compilers
ifx -Ofast -coarray ppm_coarray_buddhabrot.f90 && ./a.out

Is anyone familiar with SLURM with coarrays?

I am not familiar with SLURM, but an internet search brought up this web page from CUNY.

Thanks, I will read it this evening and see if it helps.

The Intel Fortran documentation says,

By default, the number of images created is equal to the number of execution units on the current system. You can override this by specifying a number using the [Q]coarray-num-images compiler option on the command line that compiles the main program. You can also specify the number of images at execution time in the environment variable FOR_COARRAY_NUM_IMAGES.

I’d try the following:

# ... existing slurm settings

module load intel/2024/compilers
ifx -Ofast -coarray -o ppm_coarray_buddhabrot ppm_coarray_buddhabrot.f90

export FOR_COARRAY_NUM_IMAGES=$SLURM_CPUS_PER_TASK

./ppm_coarray_buddhabrot

You could also fix the number of images in the executable at compile time with -coarray -coarray-num-images=24. More guidelines are given in the documentation.

For running on a single node only, you can also try setting I_MPI_FABRICS=shm, which selects the shared-memory fabric for intra-node communication. There are further settings you can play with.
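A minimal sketch of how such settings could be placed in the job script before the run. I_MPI_DEBUG is an Intel MPI environment variable that is not mentioned above; the level 5 used here is an assumption about how much startup detail is useful, and the exact output depends on the Intel MPI version.

```shell
# Select the shared-memory fabric for a single-node run.
export I_MPI_FABRICS=shm
# Assumption: a debug level of 5 makes Intel MPI print startup details,
# which can help confirm which fabric was actually selected.
export I_MPI_DEBUG=5
```

The debug line can be removed once the fabric selection has been confirmed.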

Does anyone know if shm fabric implies use of an API like POSIX shared memory (shm_open) or could it be something different? The Intel docs mention /dev/shm/ is used on Linux.

Thanks for the clues @ivanpribec
I will investigate it tomorrow.

In case you ever want to do CAF + OpenMP/GPU/…, it would probably be more logical to frame your job like this:

#!/bin/bash

#SBATCH --nodes=1
#SBATCH --ntasks-per-node=24
#SBATCH --cpus-per-task=1

#...

export FOR_COARRAY_NUM_IMAGES=$SLURM_NTASKS
./ppm_coarray_buddhabrot

Normally you use tasks for MPI ranks (coarray images), and --cpus-per-task is for parallelism within each task (OpenMP, pthreads).
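To make the two layouts interchangeable, the image count could be derived from whichever SLURM variable carries the core count. This is only a sketch: the function name pick_num_images is made up for illustration, and the rule "use the task count when it is greater than 1, otherwise fall back to the CPUs per task" is an assumption that suits a pure coarray run, not a hybrid CAF + OpenMP one.

```shell
#!/bin/bash
# Hypothetical helper: choose the number of coarray images.
# Assumption: one image per task if several tasks were requested,
# otherwise one image per CPU of the single task, otherwise the
# local core count reported by nproc (e.g. on an interactive machine).
pick_num_images() {
    if [ "${SLURM_NTASKS:-1}" -gt 1 ]; then
        echo "${SLURM_NTASKS}"
    elif [ -n "${SLURM_CPUS_PER_TASK:-}" ]; then
        echo "${SLURM_CPUS_PER_TASK}"
    else
        nproc
    fi
}

# Usage inside a job script (commented out so the sketch has no side effects):
# export FOR_COARRAY_NUM_IMAGES="$(pick_num_images)"
# ./ppm_coarray_buddhabrot
```

With --ntasks-per-node=24 --cpus-per-task=1 this yields 24, and with --ntasks-per-node=1 --cpus-per-task=24 it also yields 24.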

Thanks @ivanpribec ,

I have made good progress. The problem was the missing export I_MPI_FABRICS=shm statement. This is my new job.slurm script:

#!/bin/bash

#SBATCH --nodes=1
#SBATCH --ntasks-per-node=24
#SBATCH --error=error_%j.log
#SBATCH --time=00:10:00
#SBATCH --job-name=buddhabrot
#SBATCH --mem=1024M

module purge
module load intel/2024/compilers

export I_MPI_FABRICS=shm

ifx -Ofast -coarray -coarray-num-images=${SLURM_NTASKS} -o ppm_coarray_buddhabrot ppm_coarray_buddhabrot.f90
./ppm_coarray_buddhabrot

But this is not the end of the story. I noticed that the images kept running after all the Fortran statements had completed, until SLURM killed them at the end of the allocated time.

I added a few sync all statements to try to improve the situation. I also added a stop at the end of the if (this_image() == 1) block, a piece of advice I got from the Mistral AI agent. Schematically, my code now looks like this:

...
  computation: do i = 1, num_samples
    ! All images working on their own array p()
  end do computation

  sync all
  call co_sum(p, 1)
  sync all
  
  if (this_image() == 1) then
    write(*,'(A)') "I am image 1 saving the picture in 'buddhabrot.ppm'"
    ...
    ! Stopping image 1 stops all images
    ! It avoids problems with images sometimes continuing to run
    stop
  end if

It’s better: sometimes it stops, sometimes it still hangs.

On our HPC cluster, I can run tests on an interactive machine, and there the Fortran images never hang at the end of the computation. The hang occurs only when the program is launched through the SLURM workload manager:

  • Just now I am running a test. It is hanging: image 1 has started to write the file buddhabrot.ppm, but only 17 bytes (instead of 3 MB).

  • I cancel it and restart the job; now it stops correctly and I have a 3 MB ppm file.

That’s not reproducible.
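While the cause is unknown, the job script could at least detect a truncated picture instead of waiting for the SLURM time limit. The sketch below is an assumption-laden workaround, not a fix: the helper name output_is_complete and the 1000000-byte threshold are made up for illustration, and whether the timeout command stops every image depends on how the coarray runtime launches them.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if FILE exists and holds at least
# MIN_BYTES bytes (a 17-byte buddhabrot.ppm would fail this check).
output_is_complete() {
    local file="$1" min_bytes="$2" size
    [ -f "$file" ] || return 1
    size=$(wc -c < "$file")
    [ "$size" -ge "$min_bytes" ]
}

# Usage in the job script (commented out so the sketch has no side effects):
# timeout 5m ./ppm_coarray_buddhabrot   # GNU timeout exits with 124 on expiry
# status=$?
# if ! output_is_complete buddhabrot.ppm 1000000; then
#     echo "run incomplete (exit status ${status})" >&2
# fi
```

Logging the exit status together with the file size for each job may help to see whether the hang happens before, during, or after the write on image 1.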