Generic interface with subroutine and function

As mentioned in another topic, it is forbidden to have a generic interface containing both functions and subroutines:

module foo
   generic :: a => asub, afunc   ! Not allowed...
contains
   subroutine asub(x)
      integer, intent(out) :: x
      x = 5
   end subroutine
   integer function afunc()
      afunc = 5
   end function
end module

Is there a reason why this is prohibited in Fortran?

The equivalent code in C++ is permitted:

void a(int &x) { x = 5; }
int a() { return 5; }

(Technically, both are functions in C++. The return type does not participate in overload resolution. Generally speaking, overload resolution is quite complex in C++, when compared with the TKR rule in Fortran.)

3 Likes

I will go one step further: why not having a subroutine and a function both with the same arguments and under a generic name. Afterall, it will always be clear if we refer to the subroutine (because it needs to be called) or the function.

Not that I would use such a functionality often, but it is particularly useful for interoperating a C function. There are plenty of such functions that would usually be called as subroutines in the Fortran side, as the function result is not needed in many cases (e.g. it’s just a flag indicating success or failure.) At the moment, the obvious way to treat such a C function, say foo, is to define a subroutine and a function with slightly different names. My usual convention is to interop the subroutine as foo, and the function as foo_Fn:

subroutine foo(...)
type (c_ptr) :: ptr
ptr = foo_Fn(...)
end subroutine foo

interface
  function foo_Fn(...) bind(c, name="foo") result(res)
  ...
  end function foo_Fn
end interface

This works, but it would be nicer to have a generic foo pointing to both a subroutine foo_Sub and a function foo_Fn could be used instead.

I’m confused. C does not support function overloading, AFAIK. Can you share a full example?

I guess I was in a hurry. Edit: I corrected my previous post.

There is no function overloading on the C side. Maybe an example will make it clear: Consider the C function setlocale (defined in locale.h), with a prototype:

char *setlocale(int category, const char *locale)

In most cases, you don’t really care about the result of this function. So if I want to port this function in Fortran I would make it a bind(c) subroutine. But for completeness I could define the corresponding function as well. A complete example follows.

module locale
use, intrinsic :: iso_c_binding
implicit none
private

interface
  function setlocale_Fn(category, locale) bind(c, name="setlocale") result(res)
  import :: c_ptr, c_int, c_char
  type(c_ptr) :: res
  integer(kind=c_int), intent(in), value :: category
  character(kind=c_char), dimension(*), intent(in) :: locale
  end function setlocale_Fn
end interface

public :: setlocale, setlocale_Fn

contains

subroutine setlocale(category, locale)
integer(kind=c_int), intent(in) :: category
character(kind=c_char), dimension(*), intent(in) :: locale
type(c_ptr) :: ptr
ptr = setlocale_Fn(category, locale)
end subroutine setlocale
end module locale
!-------------------------------------------------------------------------------
program test_locale
use, intrinsic :: iso_c_binding, only: c_ptr, c_null_char
use :: locale
implicit none

type(c_ptr) :: cstr

call setlocale(1, "en_US.UTF-8"//c_null_char)
cstr = setlocale_Fn(1, "el_GR.utf8"//c_null_char)

end program test_locale

(to get the result of setlocale_Fn in readable form you have to convert the C pointer returned to a Fortran string, but that’s irrelevant here.) What this program does is to set the locale (for numeric output) in US English (using the subroutine) then to Greek (using the function.) They both work as expected.

The important part is you can have both the subroutine and the function above defined. But sadly, you cannot have them under a generic name, even though they are clearly distinguishable.

C does not have subroutines, as far as I know. It has functions that can return nothing (void), but the language doesn’t distinguish syntactically between calling a function returning a result and calling one that doesn’t. In C you can also call a function and ignore the return.

While I can’t think offhand of a reason Fortran couldn’t do this, it seems to me not useful enough to spend time/effort developing the feature.

That’s exactly what I meant. Technically, they are all functions. But C libraries are literally full of non-void functions where you can (and often should, for simplicity) just ignore the result and not even mention it in a program. Typically, such a result is an integer indicating success or failure, which you can ignore if success is guaranteed - or you just don’t want to bother, for whatever reason. The result can also be a C “string”, as in the example above - or anything, really. I have even seen C functions returning the exact same thing as one of their arguments, for no reason other than having a result instead of just being void.

Fortran clearly distinguishes subroutines and functions, C just have functions that can behave more or less like some kind of “subroutines”. Usually those are ported in Fortran as just functions, where you must use the result, even if you don’t need it. You have to, say, declare an integer named check and call the function as check=foo_function(...), even if you don’t care about check. This becomes tedious very quickly.
A more “elegant” way to deal with such C functions is something like my little example above - but even then, you have to change the name of the subroutine or the function.

It would be nice if we could have a generic interface for those two under a common name. For example we could have a private subroutine setlocale_Sub and a private function setlocale_Fn “wrapped” in an interface under a public generic setlocale. Those two would have the exact same arguments, yes, but they are still distinguishable in Fortran, because it’s a subroutine and a function. Whenever we call the generic, it will be crystal clear if it is the subroutine we want - call setlocale(...) - or the function - foo=setlocale(...), print *,setlocale(...) etc. So I see no apparent reason why an interface of those two is not possible. Granted, there is no real need for something like that in Fortran itself. I can think of it being useful when porting C functions though.

I’ve been pointed toward the constraint C1514 (pg. 316 in J3/23-007r1)

Within the scope of a generic name, each pair of procedures identified by that name shall both be subroutines or both be functions, and … [rules for dummy arguments omitted]

although the rationale why this constraint was introduced in the first place is omitted.

The best reason I could think of myself is because adding free functions to a generic interface (also known as an overload set) usually comes with some expected behaviour (semantics). Permitting functions and subroutines to have the same generic name, could be surprising to the developer.

Quoting from C++ Software Design (pg. 60):

The problem is that it might not always be entirely clear what the expected behaviour is, especially for an overload set that is scattered across a big codebase. You might not know all the expectations and all the details. Thus sometimes, even if you’re aware of this problem and pay attention, you might still not do the “right” thing. This is what several people in the community are worried about: the unrestricted ability to add potentially LSP-violating functionality into an overload set. And as state before, it’s easy to do. Anyone, anywhere, can add free functions.

In the C++ Core Guidelines this is expressed as,

  • C.162: Overload operations that are roughly equivalent
  • C.163: Overload only for operations that are roughly equivalent.

In other words of K. Iglberger,

Whereas C.162 expresses the advantage of having the same name for semantically equivalent functions, C.163 expresses the problem of having the same name for semantically different functions.


I think the following line of reasoning goes both ways:

Imagine you are writing a cash register system for your newly opened craft beer pub. You set the display language so that local pub staff can use it:

cstr = setlocale(1, "el_US.utf8"//c_null_char)

One of your employees is an ale aficionado, so he writes a subroutine that allows registering beers into different categories:

interface  setlocale
   module procedure setlocale_sub
end interface

integer, parameter :: CATEGORY_ALE_IPA = 1
integer, parameter :: CATEGORY_ALE_BLONDE = 2
integer, parameter :: CATEGORY_ALE_AMBER = 3

contains

! Set the local ale (as in beer) of choice 
subroutine setlocale_sub(category, ale_name)
   integer(c_int) :: category
   character(len=*) :: ale_name
end subroutine

so you register a few of them:

cstr = setlocale(1, "el_US.utf8"//c_null_char)  ! The C function
! ...
! ...
category = CATEGORY_ALE_AMBER
call setlocale(category, "Olde Fortran Malt")

A few months later you need to extend the software, and you hire a young C programmer straight out of college. Naive as they come, the programmer assumes that the two setlocale functions are the same, and starts wondering if the whole project is a practical joke… He resigns from the project the same afternoon.

2 Likes

This explains the reason the overloading is permitted in C++. Both are functions, and since the return argument can be ignored, you can also call them the same way:

foo(a1,a2);     // returns void, no error can occur
foo(b1,b2,b3);  // returns int error flag, but we discard it

thus making the functions suitable to use in the same overload set.

But in Fortran, this might make you raise an eyebrow:

call foo(a1,a2)      ! Huh, why doesn't foo return something?
stat = foo(b1,b2,b3)

Instead we adopt the Python zen “Explicit is better than implicit” and use the rename at import facility:

use baz, only: foo
use bar, only: foo_always_safe => foo

call foo_always_safe(a1, a2)
stat = foo(b1,b2,b3)
if (stat /= 0) error stop "foo failed"
1 Like

It’s just a data point but gfortran allows some procedures to be accessed either as a function or a subroutine (but not both), ETIME (The GNU Fortran Compiler) as an example. However, I think this is limited to the GNU extensions.

For the setlocale case, I personally wouldn’t bother wrapping the routine going the full way:

This way, both Fortran users not familiar with C conventions (null-terminated strings) and power-users can select the option they prefer:

integer(c_int), parameter :: LC_ALL = 1
character(len=:), allocatable :: locale_US
type(c_ptr) :: dptr  ! Dummy pointer

! --- Fortran users ---

call setlocale(LC_ALL,"en_US.UTF-8",locale_US) ! switch to US
call setlocale(LC_ALL,"el_GR.utf8")            ! switch to Greek

! Restore US locale
call setlocale(LC_ALL,locale_US)

! --- Advanced users familiar with C ---
dptr = c_setlocale(1,"en_US.UTF-8"//c_null_char)
1 Like

That might be a reason, although I’m still not convinced. I totally agree having a function and a subroutine with the same (generic) name looks… ugly. And yes, it will definitely raise an eyebrow. I totally agree explicit is better than implicit - afterall, I’m still using the F standard whenever I can, which is as explicit as it gets.
I would, however, make an exception when dealing with ported C functions, where the ambiguous behavior is already there. In such cases, and such cases only, I’m not sure which one is uglier, this:

… or this (direct copy from a program, before I “upgraded” my GLFW bindings):

integer :: check
...
check = glfwSetKeyCallback(window, windowKeyCallback)
check = glfwSetWindowSizeCallback(window, windowResizeCallback)
check = glfwSetWindowCloseCallback(window, windowCloseCallback)
check = glfwSetMouseButtonCallback(window, windowMouseBCallback)
check = glfwSetCharCallback(window, windowCharCallback)

And, trust me, this is not as ugly as it can be. It was taken from a simple example program, I was just lazy to look in my directories for a more “real-world” example, which would easily have 10 more check = ... in a row. Tedious and ugly as it is, it raises an eyebrow as well. For the story, I made the above less ugly by using the “trick” I tried to explain earlier, so there is no need for all those useless checks anymore.

In Fortran, the compiler can easily tell which one we call - the function or the subroutine. In C, however, a function with its result omitted behaves like a subroutine of shorts. It’s a function, but behaves like a “subroutine”, for all practical purposes; a “subroutine” without the call statement. A loosely-defined “subroutine”, if you will. So the C developer in your example should have been familiar with that already.

Last but definitely, absolutely not least, the funniest part of all the above was the ale part, and the “Olde Fortran malt” pun - which I assume most Fortraners in this discourse are aware of.

The setlocale was just a quick and easy example I used, built-in C, so I wouldn’t have you bother with a specific third-party C library.

The optional returning argument you mentioned was another option I was considering when I first started using iso_c_binding: Fortran subroutines porting those C functions, and if the user wants the result, they can have it by adding an optional last argument. However I thought a subroutine named as the C function (which will be used 90% of the time) and a Fortran function named accordingly (with a _Fn suffix, used less often) is more convenient, because you can literally take a C example program and “translate” it to a Fortran program “on the fly” somewhat more easily.

In my bindings, I never leave the user at the mercy of C “strings”. The bindings always have a c_f_string() function which is used to return a normal Fortran string instead of a c_ptr. Such a function should have been part of the iso_c_binding, if you ask me. The conversion function I am using is very similar to the one you posted - I just omitted that part in my example for the sake of shortness.
And the user never needs to add c_null_char to strings passed as arguments, my bindings always do that anyway. I just tried to point out the function/subroutine thing, omitting the “details”.

Edit: The subroutine and function approach would allow something like

if (foo_Fn(...)==1) then
  ...

which is often the only reason to use the function result instead of omitting it, and isn’t directly possible with the optional argument in a subroutine.

1 Like

Absolutely, one can find arguments in favour or against generic names covering the mixed case of both subroutines and functions. I know I’ve encountered this constraint in the past in other scenarios besides the C binding one you brought up.

For example a “generate” which writes result the result to existing memory or also allocates it:

subroutine generate(x,y,f)
   real, intent(in) :: x(:)
   real, intent(out) :: y(:)
   procedure(real_func) :: f
   do i = 1, size(x)
      y(i) = f(x(i))
   end do
end subroutine

function generate(x,f) result(y)
   real, intent(in) :: x(:)
   procedure(real_func) :: f
   real, allocatable :: y(:)

   allocate(y,mold=x)
   call generate(x,y,f)
end function

Granted, you could allocate outside but sometimes it’s more convenient to use allocation on assignment:

allocate(y,mold=x)
call generate(x,y,f) 

y = generate([x,x],f)

Since C1514 is a formal constraint in the standard, a standard-conforming processor is required to have the capability of diagnosing violations of the constraint.

But it would be nice to know when was this constraint introduced (presumably Fortran 90), and mainly for what reason.

I tried it with all gfortran major versions from 5 to 13,

real(4) :: v(2), time
call etime(v,time)
time = etime(v)
end

but the compiler always returns an error:

/app/example.f90:4:12:

 time = etime(v)
            1
Error: Unexpected use of subroutine name 'etime' at (1)

Addendum I: ooops, I missed the extra piece of information you quoted:

This intrinsic is provided in both subroutine and function forms; however, only one form can be used in any given program unit.

If I comment either one out, it works.

Addendum II: the symbols for etime as a subroutine or function differ in the symbol table:

                 U __gfortran_etime
                 U __gfortran_etime_sub

so the compiler appears to treat this case specially. In contrast, in C, even if the return value is discarded, the same function is called:

! f_etime.f90
function f_etime(v) bind(c)
use, intrinsic :: iso_c_binding
real(c_float), intent(out) :: v(2)
real(c_float) :: f_etime
f_etime = etime(v)
end function
extern float f_etime(float v[2]);
int main(void) {
    float values[2];
    float time = f_etime(values);
    f_etime(values);
    return 0;
}
$ gfortran -c -Wall f_etime.f90 
$ gcc-13 -o call_etime call_etime.c f_etime.o -lgfortran
$ nm call_etime
0000000100008008 d __dyld_private
                 U __gfortran_etime
0000000100000000 T __mh_execute_header
0000000100003ec8 T _f_etime
0000000100003e9a T _main
                 U dyld_stub_binder

The text first appears in Fortran 2003. I can find it in drafts as early as 1994, but at that time “papers” were physical, not electronic, so any record of “why” isn’t readily available.

1 Like

In such situations I usually go for the function version, returning an allocatable array. Having the option to use a subroutine would be nice, however.

Presumably F90/95, indeed. To me at least, the “real need” for such functionality comes with iso_c_binding which is a F2003 thing.

My old codes used several compiler-specific extensions which had similar constraints. They barely compile today - and that after some modifications, because extensions tend to change over time. But using extensions was more or less mandatory back then; everybody used them. Those constraints, together with the obvious lack of portability, is the reason I learned to avoid extensions since then.

I’m sure Dr. Fortran could write a “Ask Doctor Fortran” post about the topic in the usual style of writing, something like “to function or to subroutine, that is the question”.

1 Like

Pure functions must have all their arguments intent(in) unless they are pointers, procedures or have the value attribute, but pure subroutines may have intent(out) and intent(inout) arguments.

I can see a use case, for functions that return an array (or any large object):

function ffoo(...)
real, allocatable :: ffoo(:)
...
allocate( ffoo(n) )
...
end function

program bar
real, allocatable :: a(:), b(:)
a = ffoo(...)
allocate( b(m) )
...
b(:) = ffoo(...)

In the first case you can reasonably expect the compiler to perform a hidden move_alloc(ffoo,a) to avoid an aditional allocate + copy. But in the second case the compiler will likely allocate ffoo, fill it, then copy the result to b(:). In the latter case having a subroutine is better:

function ffoo(...)
real, allocatable :: ffoo(:)
...
allocate( ffoo(n) )
...
end function

subroutine sfoo(...,x)
real, intent(out) :: x(:)
...
x(1:n) = ...
...
end subroutine 

program bar
real, allocatable :: a(:), b(:)
a = ffoo(...)
allocate( b(m) )
...
call sfoo(...,b)

If sfoo() is called repeatedly with the same b that is allocated once, the subroutine will likely be more efficient. Then, one would like to get ffoo() and sfoo() under a single generic name:

interface foo
   module procedure ffoo, sfoo
end interface

I didn’t look hard enough - the restriction is indeed in Fortran 90, 14.1.2.3 Unambiguous generic procedure references:

Within a scoping unit, two procedures that have the same generic name must both be subroutines or both be functions, …

I asked Malcolm about it and he replied, “I might guess that the reason could be that it was considered to be a bad idea on the grounds that it makes mistakes more likely, and programs harder to read. But I was not involved in any technical discussion on the restriction.”

1 Like