Is there another function in Fortran like `merge` that does not evaluate its arguments?

A major use-case for the short-circuiting behavior that has been heavily discussed including on this Forum is to process optional parameters; often also brought up when discussing allowing default values for optional parameters.

function maybe(opt)
integer,intent(in),optional :: opt
integer :: optlocal
optlocal=merge(opt,100,present(opt))     ! some thought this was acceptable
optlocal=merge(opt/2,100,present(opt))  ! consensus was ultimately this was not OK

if you have multiple optional parameters using if/else/endif can quickly become quite cluttered; so there was a strong desire to have a short-circuiting MERGE() procedure.

2 Likes

In early versions of the Cray CFT compiler, one of the requirements for automatic vectorization of a loop was that it fit in a single basic block. IF statements in the interior of a loop mess this up as they split the loop into multiple basic blocks. The CVMGx intrinsics were introduced as a work-around to allowing some conditional code without breaking vectorization.

In the case of CVMGT, all three expressions in the actual arguments were always evaluated. Results of the first two arguments were kept in two vector registers. The logical expression was used to create a vector mask in the mask register. Then there was a conditional assignment instruction which would select an element from one or the other vector register based on the corresponding bit in the VM register, and place the result in a third vector register.

F90 introduced the MERGE intrinsic which should have the same semantics as CVMGT. I like using MERGE to shorten my code. However it doesn’t work when using it for the optional arguments use case. When one of the arguments in the MERGE is the optional argument, it tries to dereference the optional argument, and if not present, abends.

The history of this actually goes way back before Cray. Check out John McCarthy’s paper on the history of LISP. He initially attempted to use Fortran and found that a three argument function he called XIF useful… Page 5 of the following:

http://jmc.stanford.edu/articles/lisp/lisp.pdf

2 Likes

From a user’s perspective, I would expect the compiler to choose the fastest option considering that the semantics of merge allow to evaluate both expressions.

As a side note: np.where from NumPy also evaluates both expressions:

np.where([True,False],1/np.array([1,0]),np.array([1,1]))

has a valid result ((1.0,1.0)) but give a divide by zero encountered in divide warning.

1 Like

F23 also has conditional arguments, with .NIL. indicating an omitted argument. As for default values, I would very much like to see this, but haven’t been able to get traction on it. Figuring out what the syntax would be seems to be a sticking point, as would whether this is done on the caller or callee side.

What I do in my codes with many optional arguments is have local variables that are assigned either the passed value, if present, or a default. Then I use the local variable. It’s not a perfect solution.

So regarding using MERGE(), it should not have optional arguments passed to it, but you recommend
METHOD I, but METHOD III will be available in F2023? I too would like to see some succinct way of specifying defaults for optional values; I have seen something like shown below proposed several times but cannot remember the arguments against them. Is there a current proposal for that?

METHOD I for large arrays would seem less appropriate because it would require a copy, so I might want to do a conditional if/else/endif but gets so long and ugly otherwise that it is one of the few places where I could use semi-colons to make them into one-line constructs as shown. And I dislike having to make the extra declarations of the local names but seems like the most portable standard method for some time to come.

program testit
! lots of optional arguments
   call work()
   call work(b=11.0)
   call work(111.0,222.0,e=555.0)

contains

   subroutine work(a, b, c, d, e)
      real, optional :: a, b, c, d, e
      real          :: a_local, b_local, c_local, d_local, e_local

! METHOD I: Standard
! described method is standard
      if (present(a)) then; a_local = a; else; a_local = 0.0; end if
      if (present(b)) then; b_local = b; else; b_local = 0.0; end if
      if (present(c)) then; c_local = c; else; c_local = 0.0; end if
      if (present(d)) then; d_local = d; else; d_local = 0.0; end if
      if (present(e)) then; e_local = e; else; e_local = 0.0; end if

! which, at least for standard types you could make a generic for.
! using a contained procedure for a real argument to illustrate
      a_local = short_circuit(a)
      b_local = short_circuit(b)
      c_local = short_circuit(c, 10.0)
      d_local = short_circuit(d, 20.0)
      e_local = short_circuit(e, 30.0)

! METHOD II: WILL FAIL WITH SOME COMPILERS, PERHAPS ONLY WITH CERTAIN COMMAND SWITCHES
! issues with Fortran not short-circuiting and questions on
! whether undefined variable can be used in an executed statement.
! general concensus seems to be if just the variable is passed and
! not an expression with an operator or a function call it is OK,
! but good arguments against because MERGE() description does not
! indicate that the first argument is optional.

      a_local = merge(a, 0.0, present(a))
      b_local = merge(b, 0.0, present(b))
      c_local = merge(c, 0.0, present(c))
      d_local = merge(d, 0.0, present(d))
      e_local = merge(e, 0.0, present(e))

! METHOD III: ! F 2023 PROBABLY NOT AVAILABLE YET

   a_local = (present(a) ?a:0.0)
   b_local = (present(b) ?b:0.0)
   c_local = (present(c) ?c:0.0)
   d_local = (present(d) ?d:0.0)
   e_local = (present(e) ?e:0.0)

   write(*,*) a_local, b_local, c_local, d_local, e_local
   end subroutine work

   function short_circuit(variable, default)
      real, optional, intent(in) :: variable
      real, optional, intent(in) :: default
      real                       :: short_circuit
      real                       :: default_local
      if (present(default)) then
         default_local = default
      else
         default_local = 0.0
      end if
      if (present(variable)) then
         short_circuit = variable
      else
         short_circuit = default_local
      end if
   end function short_circuit

end program testit

The Future?

! one possible syntax for defining defaults
subroutine work(a=0.0,b=0.0,c=0.0,d=0.0,e=0.0)
   real,intent(in),optional :: a,b,c,d,e
   write(*,*) a, b, c, d, e
end subroutine work
! double-equal means to assign on each call instead of first instantiation, 
! but used with parameters only if optional. Although having == and = have
! different meanings has historically been a problem with C, for example
subroutine work(a,b,c,d,e)
   real,intent(in),optional :: a==0.0,b==0.0,c==0.0,d==0.0,e==0.0
   a=a+1
   write(*,*) a, b, c, d, e
end subroutine work
1 Like

Here is another test program where the compiler may choose to evaluate elements of an array that will not be needed when SUM is used with the MASK= argument. Invalid subscripts or undefined values may be used. Would it not be useful to let the user specify “evaluate only those elements of a vector expression that match mask = .true.”?

program tmsum

   ! test sum(array expr, mask=mask); compiler must support "allocate on assign"
   ! does expr have to be possible to evaluate even when the corresponding mask element = .false.?
   integer :: idx(6) = [ 1, 0, 3, 0, 5, 0 ]
   real    :: x(6)
   integer, allocatable :: sidx(:)

   x([1,3,5]) = [11.0, 13.0, 15.0]  ! x([2,4,6]) are undefined
   sidx = pack(idx, idx > 0)
   print *,sum(x(sidx)*x(sidx))     ! should always work

   print *,sum(x(idx)*x(idx), MASK = idx > 0) ! may not work, may reference x(0)
                                              ! or x(2), x(4), x(6)
end program
3 Likes