Status of private allocatable arrays at the end of an OpenMP parallel region?

Hi,

Allocatable arrays are automatically deallocated when they go out of scope. Does this also apply to private copies of an allocatable array at the end of an OpenMP parallel region? Do the OpenMP specs say anything about that? For example:

real, allocatable :: a(:)

!$OMP PARALLEL FIRSTPRIVATE(a)
allocate(a(n))

... ! some code

! Should we deallocate a(:) here, 
! or is the compiler required to automatically deallocate it?
deallocate( a )
!$OMP END PARALLEL

At least gfortran apparently deallocates the private copies before exiting the parallel region…

Why is the first thing you do to the FIRSTPRIVATE(a) array an allocate(a(n))?
I would have thought that for the allocate to work in this context, the array must not be previously allocated, so it could not be pre-initialised with its contents from before the OMP region?
Why do you select pre-initialisation for an unallocated array?
I think it should be PRIVATE(a) for an unallocated array, and so removed at the end of the !$OMP region.
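A minimal sketch of the PRIVATE variant suggested here (the names a and n are reused from the original snippet; the thread-id fill is only for illustration):

```fortran
program private_alloc_demo
   use omp_lib
   implicit none
   real, allocatable :: a(:)
   integer, parameter :: n = 100

   ! 'a' is unallocated on entry, so each thread's private copy
   ! also starts out unallocated
   !$omp parallel private(a)
   allocate(a(n))
   a = real(omp_get_thread_num())
   ! ... work on the private copy ...
   deallocate(a)   ! explicit, so a serial build behaves identically
   !$omp end parallel
end program private_alloc_demo
```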

I am not aware of FIRSTPRIVATE being used to transfer the allocation status of an allocatable array. I thought this would be done automatically (no pun intended).
As a private array, it would be removed at the end of the OMP region, including its memory allocation if allocated.

Happy to learn if yours is a valid approach.

I don’t know, actually. I read a long time ago that simply private(a) could leave a in an undefined status (*), so I got into the habit of always stating firstprivate(a) in such cases: at worst it’s useless, but it cannot hurt.

(*) maybe it was ambiguous with earlier versions of the OpenMP/Fortran standards (nowadays I think that no allocatable object can have an undefined status…), or maybe it was a compiler bug at that time… I don’t remember.

This is what I also tend to think, and as a matter of fact gfortran and ifort seem to do that in practice. But it could just be because the developers of these compilers do things logically, and I’d like an explicit confirmation from the OpenMP specifications.

Possibly related? (about the use of firstprivate)

This page mentions finalization, but I cannot find auto-deallocation of allocatable arrays…

One concern is that the behavior of the code may differ between a single-thread run (N=1) and a multi-thread run (N>1) when deallocate is not written explicitly and OpenMP auto-deallocates an allocatable array at the end of the parallel region (because for N=1 the code would then never deallocate it manually).

This explains why I started using firstprivate for allocatable arrays at some point :slight_smile:

Good point! The problem is not the code being run with a single thread per se, but being compiled without OpenMP… So it looks reasonable to deallocate the privatized arrays explicitly, whatever the compiler is supposed to do with them.
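A defensive sketch of that advice (names reused from the snippet above): deallocate explicitly, guarded with allocated(), so the code is correct whether or not it is compiled with -fopenmp, and regardless of whether the compiler auto-deallocates private copies.

```fortran
program explicit_dealloc_demo
   implicit none
   real, allocatable :: a(:)
   integer, parameter :: n = 100

   !$omp parallel private(a)
   allocate(a(n))
   ! ... some code ...
   ! Explicit, guarded deallocation: safe in a serial build
   ! (where 'a' is the one and only copy) and in a parallel build
   if (allocated(a)) deallocate(a)
   !$omp end parallel
end program explicit_dealloc_demo
```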


Yeah, exactly! (I was thinking about serial build (= no -fopenmp), but “single thread” is a different thing…)

BTW, another approach might be to write a BLOCK construct within the parallel region, declare the allocatable arrays in the block, allocate them manually, and expect the compiler to auto-deallocate them at the end of the block. Though I’ve never used this style, it might be of some use when there are lots of allocatable arrays inside the parallel region (not tested, just a guess)
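The idea could be sketched like this (untested; the array name work is arbitrary). A variable declared inside a construct within a parallel region is private to each thread, and an unsaved local allocatable is deallocated when execution leaves the BLOCK (a Fortran 2008 rule):

```fortran
program block_demo
   implicit none
   integer, parameter :: n = 100

   !$omp parallel
   block
      real, allocatable :: work(:)   ! declared in the block, hence private to each thread
      allocate(work(n))
      ! ... use work ...
   end block   ! unsaved local allocatables are auto-deallocated here
   !$omp end parallel
end program block_demo
```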

I recall I may have sidestepped the deallocation-status issue by not allocating the array outside the OMP region, leaving it to be allocated and then deallocated explicitly inside the !$OMP region.
If any problem remained, I did this in a subroutine called inside the !$OMP region, where no local arrays had the SAVE attribute. I still make sure that large local arrays are ALLOCATEd, so they go on the heap, while smallish arrays go on the thread stack. I have a 500 GByte stack for all threads and make ALLOCATE arrays a multiple of the memory page size, all so that heap arrays on different threads do not share memory pages. It is a strategy that works, but difficult to identify if it is needed!
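A sketch of that subroutine pattern (the names do_work and tmp are made up for illustration): each thread executes its own call, and the unsaved local allocatable is deallocated automatically on return, a rule that holds since Fortran 95.

```fortran
subroutine do_work(n)
   implicit none
   integer, intent(in) :: n
   real, allocatable :: tmp(:)   ! local, no SAVE: each call (hence each thread) gets its own
   allocate(tmp(n))
   ! ... work on tmp ...
end subroutine do_work           ! tmp auto-deallocated on return (unsaved local allocatable)

program caller
   implicit none
   integer, parameter :: n = 1000
   !$omp parallel
   call do_work(n)
   !$omp end parallel
end program caller
```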

I also find that private copies of allocatable arrays (as seen in the routine) go on the heap, but private copies of allocatable arrays that are routine arguments go on the stack (using gfortran).

Yes, it should do. However, the ifort version I am working with (18) does not like BLOCK constructs within OpenMP regions (it doesn’t like ASSOCIATE constructs either). Looks fine with gfortran, though.

Sure, but my point was: do the standards require the automatic deallocation in such a case?

OpenMP is not part of the Fortran standard (at this time?)

Right. It’s an API, with specifications. Which doesn’t change my questioning that much…

I find it difficult to identify which version of the OpenMP specification is supported by each version of gfortran I am using.
OpenMP version 5.1 has some expanded statements for controlling memory management. I am not sure which of these additions have been implemented. They appeared to be aimed at GPU memory, and did not address my issue of managing stack and heap usage when I was trying to improve this area of my code.
I really do not know which of these new features have been taken up by Fortran compilers.
A “feature” that I identified for OpenMP heap arrays, was to start new “large” arrays at the start of a new memory page. This does not appear to be a feature others have requested. :frowning:
The problem I see is that it can be quite complex code development to utilise these features, and then the compiler might not have completely/successfully implemented the features you identify as beneficial.
I am using AMD hardware on Windows 10, so validating improvements can also be questionable!