Memory alignment within class definition

For some time I thought that when a class is defined, compilers would try to optimize the memory layout of its components, but I just found out that this assumption was wrong:

(just a dummy example:)

type :: mytype1
    character(1) :: name(8)
    integer :: a
    integer :: n
    real(8) :: b
end type

type :: mytype2
    character(1) :: name(8)
    integer :: a
    real(8) :: b !!> Here is the difference: the double in between the two integers, imposing a jump in mem size 
    integer :: n
end type

type(mytype1) :: A
type(mytype2) :: B

print *, sizeof( A ), sizeof( B ) 

end

ifort / ifx / gfortran: all print 24 and 32

I stumbled on this by mere chance, and now I’m wondering whether it is worth worrying about and whether there are compiler flags that help mitigate its effect.

I know that with COMMON blocks one had to declare variables of different kinds in a specific order to minimize memory gaps.


Data types get aligned according to the best fit: real(8) (double precision in general) will be on an 8-byte boundary. There will be gaps because of this requirement. If you think the extra memory will be problematic, you will have to apply rules similar to those for COMMON blocks.
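To see where the gaps actually land, here is a minimal sketch that prints the byte offset of each component of mytype2. It assumes the compiler accepts c_loc on a non-interoperable derived type (allowed since Fortran 2008) and that a C pointer can be reinterpreted as an integer(c_intptr_t) with transfer; the program name is made up for illustration, and the numbers printed are implementation-dependent.

```fortran
program show_offsets
    use, intrinsic :: iso_c_binding, only: c_loc, c_intptr_t
    implicit none

    type :: mytype2
        character(1) :: name(8)
        integer :: a
        real(8) :: b
        integer :: n
    end type

    type(mytype2), target :: x
    integer(c_intptr_t) :: base

    ! Address of the whole object, reinterpreted as an integer (assumption:
    ! c_ptr and integer(c_intptr_t) have the same size on this platform).
    base = transfer(c_loc(x), base)

    ! Offset of each component = its address minus the address of the object.
    print '(a,i0)', 'offset of name: ', transfer(c_loc(x%name), base) - base
    print '(a,i0)', 'offset of a:    ', transfer(c_loc(x%a), base) - base
    print '(a,i0)', 'offset of b:    ', transfer(c_loc(x%b), base) - base
    print '(a,i0)', 'offset of n:    ', transfer(c_loc(x%n), base) - base

    ! storage_size returns bits, including any padding the compiler added.
    print '(a,i0)', 'total bytes:    ', storage_size(x) / 8
end program show_offsets
```

A jump between the end of one component and the offset of the next is padding inserted for alignment.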

That’s what I concluded. I was just wondering whether there is some option that could (for simple cases) detect unnecessary gaps and rearrange the layout. When one accesses a member of the class through something%member, one generally doesn’t care whether it is the first or the last component, so internally swapping their relative addresses could be an interesting option.

This could be troublesome if one relies heavily on AoS (arrays of structures). For SoA (structures of arrays) I think it is less of a problem.
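A rough sketch of why AoS is the more exposed layout: in an array of mytype2 every element carries its own padding, while with one array per component only the payload is stored. The element count and program name below are arbitrary assumptions, and the sketch ignores array descriptors and allocator overhead.

```fortran
program aos_vs_soa
    use, intrinsic :: iso_fortran_env, only: int64
    implicit none

    type :: mytype2
        character(1) :: name(8)
        integer :: a
        real(8) :: b
        integer :: n
    end type

    ! Hypothetical number of elements, chosen only for illustration.
    integer(int64), parameter :: nelem = 1000000_int64
    type(mytype2) :: elem
    integer(int64) :: aos_bytes, soa_bytes

    ! AoS: one array of mytype2, so the per-element padding is repeated nelem times.
    aos_bytes = nelem * (storage_size(elem) / 8)

    ! SoA: one contiguous array per component, so only the payload bits count.
    soa_bytes = nelem * (size(elem%name) * storage_size(elem%name) &
                         + storage_size(elem%a) + storage_size(elem%b) &
                         + storage_size(elem%n)) / 8

    print '(a,i0)', 'AoS bytes: ', aos_bytes
    print '(a,i0)', 'SoA bytes: ', soa_bytes
end program aos_vs_soa
```

Any difference between the two figures is the total padding paid across the array.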

If you include the SEQUENCE statement in the type definition, the Intel compiler will issue a warning about misaligned fields.

type :: mytype1
    SEQUENCE 
    real(8) :: b
    character(1) :: name(4)
    integer :: a
    integer :: n
end type

type :: mytype2
    SEQUENCE 
    character(1) :: name(8)
    integer :: a
    real(8) :: b
    integer :: n
end type
Compiler stderr
/app/example.f90(1): warning #6380: The structure length is not a multiple of its largest element; could create misalignments for arrays of this type.   [MYTYPE1]
type :: mytype1
--------^
/app/example.f90(9): warning #6379: The structure contains one or more misaligned fields.   [MYTYPE2]
type :: mytype2
--------^

Wow, excellent!! this is more than good enough :smiley: thanks @themos !!

It’s unfortunate that Gfortran does not behave the same.

I don’t get your point… Your example actually shows that the compiler does take care of the alignment of the components. Are you worried by the fact it introduces memory gaps instead of trying to reorder the components?

Yes, whether it could catch that there is a gap that could potentially be closed by internally swapping the order of the components.

The second type takes 8 extra bytes (32 instead of 24): 4 bytes of padding after a so that b starts on an 8-byte boundary, and 4 more after n so that the total size is a multiple of 8. If the compiler could analyze the components and build an internal representation ordered from largest to smallest, maybe one could obtain compact structures most of the time.
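Until a compiler does this automatically, the reordering can be done by hand. The sketch below declares the same four components twice, once in the original order and once sorted from largest to smallest alignment requirement, and prints both sizes. The type and program names are made up for illustration, and the resulting sizes depend on the compiler and platform.

```fortran
program reorder_by_size
    implicit none

    ! Same layout as mytype2 in the original example.
    type :: as_declared
        character(1) :: name(8)
        integer :: a
        real(8) :: b
        integer :: n
    end type

    ! Same components, ordered from largest to smallest alignment requirement.
    type :: largest_first
        real(8) :: b
        integer :: a
        integer :: n
        character(1) :: name(8)
    end type

    type(as_declared) :: x
    type(largest_first) :: y

    ! storage_size reports bits per element, padding included.
    print '(a,i0)', 'as declared:   ', storage_size(x) / 8
    print '(a,i0)', 'largest first: ', storage_size(y) / 8
end program reorder_by_size
```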

( I’m just thinking out loud here :wink: )

GCC has the -Wpadded flag for this purpose:

Warn if padding is included in a structure, either to align an element of the structure or to align the whole structure. Sometimes when this happens it is possible to rearrange the fields of the structure to reduce the padding and so make the structure smaller.

Using the example from @hkvzjal

$ gfortran padding_example.f90 -Wpadded
padding_example.f90:11:0:

   11 |     real(8) :: b !!> Here is the difference: the double in between the two integers, imposing a jump in mem size
      | 
Warning: padding struct to align ‘b’ [-Wpadded]
f951: Warning: padding struct size to alignment boundary [-Wpadded]

By the way, an essay relevant for this discussion is The Lost Art of Structure Packing:

This page is about a technique for reducing the memory footprint of programs in compiled languages with C-like structures - manually repacking these declarations for reduced size. To read it, you will require basic knowledge of the C programming language.

You need to know this technique if you intend to write code for memory-constrained embedded systems, or operating-system kernels. It is useful if you are working with application data sets so large that your programs routinely hit memory limits. It is good to know in any application where you really, really care about optimizing your use of memory bandwidth and minimizing cache-line misses.

Generally speaking, unless you are severely memory constrained, or know your operations on an array of structs are bandwidth limited, I wouldn’t pay too much attention to this.

I remember once watching a talk about either D or C++ where the author went to great lengths to use the empty padding areas to store various meta-data. The idea is to use the padding as a bit-field. I don’t recommend doing this, as memory is cheap nowadays.


Excellent!! Thanks @ivanpribec

This sounds like a good-bad idea, at least for the applications I’m working on: something that could start with the best intentions but would quite quickly turn into devilishly hard-to-maintain code.

My main motivation was to learn about options that help detect or fix such “easy wins”. The attribute and flag you provided, @themos and @ivanpribec, are perfect in this regard for helping the developer pinpoint such issues :slight_smile:

The title of this thread is “[…] class definition”, and so far the discussion has focused on simple derived types. What happens when the programmer extends a derived type and then uses class() declarations to refer to that extended type or to allocate new entities of it? What kind of padding or other memory alignment occurs in this case? The language goes to some effort to allow rearrangements within the structures in these cases, even going so far as to disallow sequence within them. Do compilers actually exploit that feature to advantage?

If you don’t say SEQUENCE, the compiler is allowed to rearrange the components in any way it feels appropriate. Compiler implementers, not unreasonably, understand that doing so would lead to a flood of complaints from users, so don’t do that. Even if there is no rearrangement, there can be padding, even with SEQUENCE types, and it’s implementation-dependent as to whether that happens by default or not.

In the case of Intel, -align norecords will remove padding between components. -align sequence will add padding in SEQUENCE types. align (intel.com)
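For the extended-type question, one way to check what a given compiler does is simply to ask it. The sketch below uses hypothetical types base_t and child_t and prints the size of the parent, the extension, and a polymorphic entity whose dynamic type is the extension. Whether the new component reuses the parent's trailing padding is implementation-dependent, so no particular output is assumed.

```fortran
program extended_type_size
    implicit none

    ! Hypothetical parent type: an 8-byte real followed by a default integer.
    type :: base_t
        real(8) :: x
        integer :: i
    end type

    ! Hypothetical extension adding one more default integer.
    type, extends(base_t) :: child_t
        integer :: j
    end type

    type(base_t) :: b
    type(child_t) :: c
    class(base_t), allocatable :: p

    ! Give the polymorphic entity the extended dynamic type.
    allocate(child_t :: p)

    print '(a,i0)', 'base_t bytes:            ', storage_size(b) / 8
    print '(a,i0)', 'child_t bytes:           ', storage_size(c) / 8
    ! For a polymorphic argument, storage_size reflects the dynamic type.
    print '(a,i0)', 'class(base_t) as child_t: ', storage_size(p) / 8
end program extended_type_size
```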


Excellent!! Thanks so much @sblionel. Indeed, even without the sequence attribute this flag helps reduce the padding. By any chance, do you have in mind a situation in which using this flag would be counterproductive?

In the past, misaligned accesses could be slow, but that’s less of an issue in today’s processors. Word tearing for data shared across threads/processes can also be exacerbated by misaligned values. In most cases, it doesn’t noticeably hurt, but it can be an issue if the misaligned data is referenced a lot.

Perfect, got it!! I’ll keep that in mind!!

I got the point about potential reordering by the compiler, and I guess you can observe it by using c_loc on the different components of the derived type. But what about writing unformatted files? Is the writing order the same as the order in memory (and therefore potentially reordered)? Or does the writing order correspond to the one given in the definition of the derived type?

@davidpfister unformatted writes are, AFAIK, a bit-for-bit copy of what is in memory.

@PierU, I thought so. Thanks for the confirmation.
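A small sketch to check this on a given compiler: write one mytype2 value to an unformatted stream file and compare the number of bytes written with the in-memory size. It assumes stream access (so that no record markers are added) and a scratch file; the program name is made up, and the result is processor-dependent.

```fortran
program unformatted_size
    implicit none

    type :: mytype2
        character(1) :: name(8)
        integer :: a
        real(8) :: b
        integer :: n
    end type

    type(mytype2) :: x
    integer :: u, pos_after

    ! Define every component before writing the value.
    x%name = ' '
    x%a = 1
    x%b = 2.0d0
    x%n = 3

    ! Stream access avoids record markers, so the file position counts data bytes only.
    open(newunit=u, access='stream', form='unformatted', status='scratch')
    write(u) x
    ! pos is the 1-based position of the next byte, so pos - 1 bytes were written.
    inquire(unit=u, pos=pos_after)
    close(u)

    print '(a,i0)', 'bytes in memory: ', storage_size(x) / 8
    print '(a,i0)', 'bytes written:   ', pos_after - 1
end program unformatted_size
```

If the two numbers match, the padding went to the file along with the data.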