For some time I assumed that when a class is defined, the compiler would try to optimize the memory layout of the components declared inside it, but I just found out that this assumption was wrong:
(just a dummy example:)
type :: mytype1
    character(1) :: name(8)
    integer :: a
    integer :: n
    real(8) :: b
end type
type :: mytype2
    character(1) :: name(8)
    integer :: a
    real(8) :: b !!> Here is the difference: the double in between the two integers, imposing a jump in mem size
    integer :: n
end type
type(mytype1) :: A
type(mytype2) :: B
print *, sizeof( A ), sizeof( B )
end
ifort / ifx / gfortran: all print 24 and 32
I stumbled on this by mere chance, and now I'm wondering whether this is something worth worrying about and whether there are compiler flags that help mitigate its effect.
I know that with COMMON blocks one had to declare variables of different kinds in a specific order within the same block in order to minimize memory gaps.
Data types get aligned according to their natural alignment: real(8) (double precision in general) will be placed on an 8-byte boundary, so there will be gaps because of this requirement. If you think the extra memory will be problematic, then you will have to follow rules similar to those for COMMON blocks.
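To see where the gaps end up, one can print the offset of each component relative to the start of the object. The following is only a sketch: the program name and the helper `addr` are made up for illustration, and it assumes the compiler accepts `c_loc` on components of a non-interoperable derived type (Fortran 2008 behaviour). If no reordering takes place, the offsets of `a` and `b` should differ by 8 rather than 4, which is the gap discussed above.

```fortran
program show_offsets
   use, intrinsic :: iso_c_binding, only: c_loc, c_ptr, c_intptr_t
   implicit none

   type :: mytype2
      character(1) :: name(8)
      integer :: a
      real(8) :: b
      integer :: n
   end type

   type(mytype2), target :: x      ! TARGET is required by c_loc
   integer(c_intptr_t) :: base

   base = addr(c_loc(x))           ! address of the whole object

   ! Offsets are reported in bytes from the start of x.
   print '(a,i0)', 'offset of name: ', addr(c_loc(x%name)) - base
   print '(a,i0)', 'offset of a:    ', addr(c_loc(x%a)) - base
   print '(a,i0)', 'offset of b:    ', addr(c_loc(x%b)) - base
   print '(a,i0)', 'offset of n:    ', addr(c_loc(x%n)) - base
   print '(a,i0)', 'size in bytes:  ', storage_size(x) / 8

contains

   ! Hypothetical helper: reinterpret a C pointer as an integer address.
   function addr(p) result(i)
      type(c_ptr), intent(in) :: p
      integer(c_intptr_t) :: i
      i = transfer(p, i)
   end function addr

end program show_offsets
```

The size on the last line includes any padding at the end of the type, so it can be larger than the last offset plus the size of the last component.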
That's what I concluded. I was just wondering whether there is some option that could (for simple cases) detect unnecessary gaps and rearrange the components… In the sense that when one accesses a member of the class through something%member, one generally doesn't care whether it is the first or last element. Internally swapping their relative addresses could be an interesting option.
This could be troublesome if one overuses AoS (array of structures). For SoA (structure of arrays) I think it is less of a problem.
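A rough way to compare the two layouts is to count bytes for the same number of elements. This is a sketch only: `nelem`, `item` and `items_soa` are names invented here, and the exact numbers depend on the compiler and platform. In the AoS case the padding is paid once per element, while in the SoA case each component is a contiguous array, so padding can only appear between the component arrays.

```fortran
program aos_vs_soa
   implicit none
   integer, parameter :: nelem = 1000   ! hypothetical element count

   ! AoS element: same component order as mytype2 in this thread
   type :: item
      character(1) :: name(8)
      integer :: a
      real(8) :: b
      integer :: n
   end type

   ! SoA: one array per component
   type :: items_soa
      character(1) :: name(8, nelem)
      integer :: a(nelem)
      real(8) :: b(nelem)
      integer :: n(nelem)
   end type

   type(item) :: aos(nelem)
   type(items_soa) :: soa

   ! storage_size returns the size of one array element in bits,
   ! including any padding the compiler adds.
   print '(a,i0)', 'AoS bytes: ', (storage_size(aos) / 8) * size(aos)
   print '(a,i0)', 'SoA bytes: ', storage_size(soa) / 8
end program aos_vs_soa
```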
type :: mytype1
    SEQUENCE
    real(8) :: b
    character(1) :: name(4)
    integer :: a
    integer :: n
end type
type :: mytype2
    SEQUENCE
    character(1) :: name(8)
    integer :: a
    real(8) :: b
    integer :: n
end type
Compiler stderr
/app/example.f90(1): warning #6380: The structure length is not a multiple of its largest element; could create misalignments for arrays of this type. [MYTYPE1]
type :: mytype1
--------^
/app/example.f90(9): warning #6379: The structure contains one or more misaligned fields. [MYTYPE2]
type :: mytype2
--------^
Wow, excellent!! this is more than good enough thanks @themos !!
I don't get your point… Your example actually shows that the compiler does take care of the alignment of the components. Are you worried by the fact that it introduces memory gaps instead of trying to reorder the components?
Yes, whether it could catch that there is a gap that it could potentially remove by internally swapping the order of the components.
The second type has 8 extra bytes of memory (4 after a so that b starts on an 8-byte boundary, and 4 at the end of the structure) because the double sits between the two integers. If the compiler could analyze the components and create an internal representation ordered from larger to smaller, maybe one could obtain compact structures most of the time.
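Doing that reordering by hand is straightforward. As a sketch (the type names `as_declared` and `largest_first` are invented for this example), the program below declares the same four components twice, once in the order of mytype2 and once sorted from the largest alignment to the smallest, and prints both sizes. With the compilers quoted earlier in the thread one would expect the second type to come out as small as mytype1, but that is an expectation, not something the standard guarantees.

```fortran
program reorder_by_size
   implicit none

   ! Same order as mytype2: the double sits between the two integers.
   type :: as_declared
      character(1) :: name(8)
      integer :: a
      real(8) :: b
      integer :: n
   end type

   ! Components sorted from the largest alignment to the smallest.
   type :: largest_first
      real(8) :: b
      integer :: a
      integer :: n
      character(1) :: name(8)
   end type

   type(as_declared) :: x
   type(largest_first) :: y

   ! storage_size is in bits and includes padding.
   print '(a,i0)', 'as declared:   ', storage_size(x) / 8
   print '(a,i0)', 'largest first: ', storage_size(y) / 8
end program reorder_by_size
```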
Warn if padding is included in a structure, either to align an element of the structure or to align the whole structure. Sometimes when this happens it is possible to rearrange the fields of the structure to reduce the padding and so make the structure smaller.
$ gfortran padding_example.f90 -Wpadded
padding_example.f90:11:0:
11 | real(8) :: b !!> Here is the difference: the double in between the two integers, imposing a jump in mem size
|
Warning: padding struct to align ‘b’ [-Wpadded]
f951: Warning: padding struct size to alignment boundary [-Wpadded]
This page is about a technique for reducing the memory footprint of programs in compiled languages with C-like structures - manually repacking these declarations for reduced size. To read it, you will require basic knowledge of the C programming language.
You need to know this technique if you intend to write code for memory-constrained embedded systems, or operating-system kernels. It is useful if you are working with application data sets so large that your programs routinely hit memory limits. It is good to know in any application where you really, really care about optimizing your use of memory bandwidth and minimizing cache-line misses.
Generally speaking, unless you are severely memory constrained, or know your operations on an array of structs are bandwidth limited, I wouldn't pay too much attention to this.
I remember once watching a talk about either D or C++ where the author went to great lengths to use the empty padding areas to store various metadata. The idea is to use the padding as a bit-field. I don't recommend doing this, as memory is cheap nowadays.
This sounds like a good-bad idea, at least for the applications I'm working on. Something that could start with the best intentions but would quite quickly become devilishly hard to maintain.
My main motivation was to learn about options that help detect/fix such “easy wins”. The attribute and flag you provided, @themos and @ivanpribec, are perfect in this regard for helping the developer pinpoint such issues.
The title of this thread is “[…] class definition”, and so far the discussion has focused on just simple derived types. What happens when the programmer extends a derived type, and then uses class() declarations to use that extended type or to allocate new entities of that extended type? What kind of padding or other memory alignment occurs in this case? The language goes to some effort to allow rearrangements within the structures in these cases, even going so far as to not allow sequence to be used within them. Do compilers actually exploit that feature to advantage?
If you don't say SEQUENCE, the compiler is allowed to rearrange the components in any way it feels appropriate. Compiler implementers, not unreasonably, understand that doing so would lead to a flood of complaints from users, so they don't do that. Even if there is no rearrangement, there can be padding, even with SEQUENCE types, and it's implementation-dependent as to whether that happens by default or not.
In the case of Intel, -align norecords will remove padding between components. -align sequence will add padding in SEQUENCE types. align (intel.com)
Excellent!! Thanks so much @sblionel. Indeed, even without the sequence attribute this flag helps reduce the padding. By any chance, do you have in mind a situation in which it would be counterproductive to use this flag?
In the past, misaligned accesses could be slow, but that's less of an issue in today's processors. Word tearing for data shared across threads/processes can also be exacerbated by misaligned values. In most cases, it doesn't noticeably hurt, but it can be an issue if the misaligned data is referenced a lot.
I got the point about potential reordering by the compiler, and I guess you can observe it by using c_loc on the different components of the derived type. But what about writing unformatted files? Is the writing order the same as the order in memory (and therefore potentially reordered)? Or does the writing order correspond to the one given in the definition of the derived type?
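One way to probe this on a given compiler is to write the object to an unformatted stream file and compare the resulting file size with that of a file where the components are written one by one. The sketch below does that; the file names `padding_whole.bin` and `padding_parts.bin` are arbitrary choices for this example, and the result for the whole-object write should be treated as compiler-dependent. It only compares sizes, so it shows whether padding is written, not the order of the bytes.

```fortran
program unformatted_size
   implicit none

   type :: mytype2
      character(1) :: name(8)
      integer :: a
      real(8) :: b
      integer :: n
   end type

   type(mytype2) :: x
   integer :: u, whole, parts

   x%name = 'x'
   x%a = 1
   x%b = 2.0d0
   x%n = 3

   ! Write the object as a single list item.
   open(newunit=u, file='padding_whole.bin', access='stream', &
        form='unformatted', status='replace')
   write(u) x
   close(u)
   inquire(file='padding_whole.bin', size=whole)

   ! Write the components one by one, in declaration order.
   open(newunit=u, file='padding_parts.bin', access='stream', &
        form='unformatted', status='replace')
   write(u) x%name, x%a, x%b, x%n
   close(u)
   inquire(file='padding_parts.bin', size=parts)

   print '(a,i0)', 'whole object, bytes: ', whole
   print '(a,i0)', 'component list, bytes: ', parts

   ! Remove the scratch files again.
   open(newunit=u, file='padding_whole.bin', status='old')
   close(u, status='delete')
   open(newunit=u, file='padding_parts.bin', status='old')
   close(u, status='delete')
end program unformatted_size
```

If the first number is larger than the second, the compiler writes the in-memory image including padding. Writing an explicit component list, as in the second file, fixes both the order and the amount of data in the file.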