Should I be concerned with loss of performance when calling subroutines using procedure pointers?

Hello everyone,

I’ve been enjoying the discussions in this Discourse group and often learn from them!

I was wondering about something. In one of my codes, I have to efficiently solve a bunch of tridiagonal systems, and I have two different subroutines depending on the boundary conditions. The relevant part of the code looks like this (I guess it is not so important to give more detail):

if(bcz(0)//bcz(1).eq.'PP') then
  call gaussel_periodic(n(1),n(2),n(3)-q,a,b,c,lambdaxy,pz)
else
  call gaussel(         n(1),n(2),n(3)-q,a,b,c,lambdaxy,pz)
endif

I somehow find it a bit more elegant to have a procedure pointer at the beginning of the calling subroutine that, depending on the type of boundary condition, would point to the right subroutine. Something like:

procedure (), pointer :: tridsolve => null()
if(bcz(0)//bcz(1).eq.'PP') then
  tridsolve => gaussel_periodic
else
  tridsolve => gaussel
endif
(...)
call tridsolve(n(1),n(2),n(3)-q,a,b,c,lambdaxy,pz)

I remember that I did not implement things like this because I was concerned about losing performance; perhaps I read somewhere that I could lose some room for e.g. inline optimization if I implemented things this way… Do you see a significant disadvantage in terms of performance/optimization when using the second approach?
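For reference, here is a minimal sketch of the second approach with an explicit abstract interface instead of `procedure ()`, so the compiler can at least check the argument list at the call site. Everything in it is hypothetical and only for illustration: the names `solver_iface`, `solve_plain`, `solve_periodic` and `select_solver`, the two-argument signature, and the placeholder bodies are assumptions, not the real `gaussel` routines.

```fortran
module tridsolve_sketch
  implicit none
  private
  public :: solver_iface, solve_plain, solve_periodic, select_solver

  ! Hypothetical common signature shared by both solvers.
  abstract interface
    subroutine solver_iface(n, x)
      integer, intent(in)    :: n
      real,    intent(inout) :: x(n)
    end subroutine solver_iface
  end interface

contains

  ! Placeholder for the non-periodic solver: just doubles the input.
  subroutine solve_plain(n, x)
    integer, intent(in)    :: n
    real,    intent(inout) :: x(n)
    x = 2.0*x
  end subroutine solve_plain

  ! Placeholder for the periodic solver: just flips the sign.
  subroutine solve_periodic(n, x)
    integer, intent(in)    :: n
    real,    intent(inout) :: x(n)
    x = -x
  end subroutine solve_periodic

  ! Choose the target once, from the boundary-condition string.
  subroutine select_solver(bc, solver)
    character(len=*), intent(in) :: bc
    procedure(solver_iface), pointer, intent(out) :: solver
    if (bc == 'PP') then
      solver => solve_periodic
    else
      solver => solve_plain
    end if
  end subroutine select_solver

end module tridsolve_sketch
```

A caller would declare `procedure(solver_iface), pointer :: tridsolve => null()`, call `select_solver(bcz(0)//bcz(1), tridsolve)` once, and then use `call tridsolve(n, x)` wherever the solver is needed.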

Thanks!
Pedro


@pcosta thanks for the question and welcome!

Unfortunately I do not know the answer. However, as a user I would very much like compilers to tell us how things are actually optimized. That’s one of my long-term goals with LFortran.


Any time you defer a decision to run time there can be a performance penalty. My guess aligns with yours - with a pointer you lose inline expansion or cross-procedure optimization analysis. I would find it hard to believe that any difference in the cost of the actual call itself was measurable.
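If you want to check this on your own machine, a rough timing sketch along these lines may help. It is only an illustration under assumptions: the module `call_overhead_sketch`, the trivial `kernel`, and the routines `time_direct` and `time_pointer` are hypothetical names, and the kernel is far smaller than a real tridiagonal solve. Note also that because the pointer target is visible in the same module, an optimizing compiler may resolve or inline the pointer call, so the numbers say little unless the target is really chosen at run time.

```fortran
module call_overhead_sketch
  use, intrinsic :: iso_fortran_env, only: int64, real64
  implicit none
  private
  public :: time_direct, time_pointer

  abstract interface
    subroutine kernel_iface(x)
      import :: real64
      real(real64), intent(inout) :: x
    end subroutine kernel_iface
  end interface

contains

  ! Hypothetical tiny kernel; real solvers do far more work per call.
  subroutine kernel(x)
    real(real64), intent(inout) :: x
    x = x + 1.0_real64
  end subroutine kernel

  ! Time ncalls direct calls; acc ends up equal to ncalls.
  subroutine time_direct(ncalls, acc, seconds)
    integer,      intent(in)  :: ncalls
    real(real64), intent(out) :: acc, seconds
    integer(int64) :: t0, t1, rate
    integer :: i
    acc = 0.0_real64
    call system_clock(t0, rate)
    do i = 1, ncalls
      call kernel(acc)
    end do
    call system_clock(t1)
    seconds = real(t1 - t0, real64) / real(rate, real64)
  end subroutine time_direct

  ! Same loop, but every call goes through a procedure pointer.
  subroutine time_pointer(ncalls, acc, seconds)
    integer,      intent(in)  :: ncalls
    real(real64), intent(out) :: acc, seconds
    procedure(kernel_iface), pointer :: p
    integer(int64) :: t0, t1, rate
    integer :: i
    p => kernel
    acc = 0.0_real64
    call system_clock(t0, rate)
    do i = 1, ncalls
      call p(acc)
    end do
    call system_clock(t1)
    seconds = real(t1 - t0, real64) / real(rate, real64)
  end subroutine time_pointer

end module call_overhead_sketch
```

Calling both routines with the same `ncalls` and comparing `seconds` gives a first impression; for a realistic answer the loop body should be the actual solver with realistic problem sizes.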

Ondrej, many compilers do have “optimization reports”, but how helpful they can be varies. It isn’t simple, because optimizations are often not easily described and, for example, might spread the code for a statement across the procedure. They’re best suited to telling you what blocked an optimization.


@pcosta nowadays I prefer function objects to procedure pointers. There will still be some kind of dereference (the same performance penalty applies, if any), but it usually avoids having to carry global/module-level parameters for an algorithm.
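To make "function object" concrete, here is a minimal sketch of one way to write it in Fortran: an abstract type with a deferred type-bound procedure, plus a concrete type that carries its own parameters. All names (`solver_t`, `shifted_solver_t`, `shift`, `apply_solver`) and the placeholder arithmetic are assumptions for illustration, not an actual tridiagonal solver.

```fortran
module solver_object_sketch
  implicit none
  private
  public :: solver_t, shifted_solver_t, apply_solver

  ! Abstract "function object": anything that can solve in place.
  type, abstract :: solver_t
  contains
    procedure(solve_iface), deferred :: solve
  end type solver_t

  abstract interface
    subroutine solve_iface(self, x)
      import :: solver_t
      class(solver_t), intent(in)    :: self
      real,            intent(inout) :: x(:)
    end subroutine solve_iface
  end interface

  ! Hypothetical concrete solver that carries its own parameter,
  ! so no module-level variable is needed.
  type, extends(solver_t) :: shifted_solver_t
    real :: shift = 0.0
  contains
    procedure :: solve => shifted_solve
  end type shifted_solver_t

contains

  ! Placeholder "solve": adds the stored shift to every element.
  subroutine shifted_solve(self, x)
    class(shifted_solver_t), intent(in)    :: self
    real,                    intent(inout) :: x(:)
    x = x + self%shift
  end subroutine shifted_solve

  ! Generic driver: works with any extension of solver_t.
  subroutine apply_solver(solver, x)
    class(solver_t), intent(in)    :: solver
    real,            intent(inout) :: x(:)
    call solver%solve(x)   ! dynamic dispatch happens here
  end subroutine apply_solver

end module solver_object_sketch
```

A call such as `call apply_solver(shifted_solver_t(shift=0.5), x)` then passes the algorithm and its parameters together; the dispatch through `solver%solve` is the dereference mentioned above.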

One way to figure out what code the compiler is generating is to look at the assembly. I usually find https://gcc.godbolt.org/ pretty useful for that (they also have Intel compilers and Fortran support).

But if the routine being called is in a separate compilation unit I would assume no inlining is going to happen anyway unless LTO is enabled, right?
