Conditional compilation based on environment variables

I noticed a bit of strange FORTRAN code logic here, as described below.

The following code snippet is used in the Makefile:

$(MOD1):%.o:%.f90
	$(F90) -c $(FLAGS) -fpp -DIRVSPDATA $*.f90

And the following code snippet is used in the nonsymm.f90

#ifdef IRVSPDATA
      call get_environment_variable('IRVSPDATA',spgpath)
      !spgpath = '/storagehome/jcgao/soft/irvsp/src_irvsp_v2_zj'
#else 
      write(6,*) "Environment variable 'IRVSPDATA' must be provided."
      write(6,*) "Please run the following commands to make the library:"
      write(6,*) "./configure.sh"
      write(6,*) "source ~/.bashrc"
      write(6,*) "make"
      stop
#endif 

It seems that the above code snippets have implemented the conditional compilation based on the environment variable IRVSPDATA provided by the user. But based on my testing, even this variable, even if this variable is not provided, the program can still be compiled smoothly, and this variable only needs to be set at run time.

So, I’m very confused above the code logic used here. Any hints/comments will be highly appreciated.

Regards,
HZ

Since the definition is forced in the make compilation command, the first branch is always compiled. I agree the logic is a bit faulty. It looks to me as if written by a PhD student who never came back to complete it. If you look into configure.sh, you’ll find

#!/bin/bash
echo "export IRVSPDATA="`pwd` >> ~/.bashrc

I guess what you could do instead is use the status variable of get_environment_variable:

call get_environment_variable('IRVSPDATA',spgpath,status=stat)
if (stat == 1) then
   write(6,*) "Environment variable 'IRVSPDATA' must be provided."
   write(6,*) "Please run the following commands to make the library:"
   write(6,*) "./configure.sh"
   write(6,*) "source ~/.bashrc"
   write(6,*) "make"
   stop
end if

Ideally, you’d wrap get_environment_variable in a fail-safe function, which also checked the other conditions (sufficient string length, processor support) and gave you the option to provide a default value. An example would be get_env used in the Fortran package manager.

Using get_env you might write:

character(len=:), allocatable :: spgpath

spgpath = get_env('IRVSPDATA')
if (spgpath == '') then
   write(6,*) "Environment variable 'IRVSPDATA' must be provided."
   ! ...
end if

This is not the way I expected. I don’t want to use such script to automatically modify the user profile. On the contrary, I want to use a method similar to the following before make command is executed:

$ export IRVSPDATA=$(pwd)

Now, I tried to use the code snippet above, but failed when compiling with Intel oneAPI as follows:

$ module load compiler mkl mpi/2021.4.0
$ export IRVSPDATA=$(pwd)
$ make
ifort   -c -g -traceback  -fpp -DIRVSPDATA nonsymm.f90
nonsymm.f90(196): error #6404: This name does not have a type, and must have an explicit type.   [GET_ENV]
spgpath = get_env('IRVSPDATA')
----------^
nonsymm.f90(196): error #6054: A CHARACTER data type is required in this context.   [GET_ENV]
spgpath = get_env('IRVSPDATA')
----------^
compilation aborted for nonsymm.f90 (code 1)
make: *** [Makefile:88: nonsymm.o] Error 1

Any suggestions for solving this problem?

Regards,
HZ

Have you added the get_env function to your module/project?

Click to see source code
! Copyright (c) 2020 fpm contributors
!
! This work is licensed under the terms of the MIT license.  
! For a copy, see <https://opensource.org/licenses/MIT>.

    !> get named environment variable value. It it is blank or
    !! not set return the optional default value
    function get_env(NAME,DEFAULT) result(VALUE)
    implicit none
    !> name of environment variable to get the value of
    character(len=*),intent(in)          :: NAME
    !> default value to return if the requested value is undefined or blank
    character(len=*),intent(in),optional :: DEFAULT
    !> the returned value
    character(len=:),allocatable         :: VALUE
    integer                              :: howbig
    integer                              :: stat
    integer                              :: length
        ! get length required to hold value
        length=0
        if(NAME.ne.'')then
           call get_environment_variable(NAME, length=howbig,status=stat,trim_name=.true.)
           select case (stat)
           case (1)
               !*!print *, NAME, " is not defined in the environment. Strange..."
               VALUE=''
           case (2)
               !*!print *, "This processor doesn't support environment variables. Boooh!"
               VALUE=''
           case default
               ! make string to hold value of sufficient size
               allocate(character(len=max(howbig,1)) :: VALUE)
               ! get value
               call get_environment_variable(NAME,VALUE,status=stat,trim_name=.true.)
               if(stat.ne.0)VALUE=''
           end select
        else
           VALUE=''
        endif
        if(VALUE.eq.''.and.present(DEFAULT))VALUE=DEFAULT
     end function get_env

Concerning the workflow, I agree that using configure to append an environment variable to the .bashrc config is murky. It’s best to confront the user about it in the usage instructions and fail elegantly at runtime if the variable is not set (with a nice message how to fix this). After-all, there is no need to recompile, only to set the environment variable. Hence you could get rid of the conditional compilation logic entirely, and also remove the -DIRVSPDATA flag in the Makefile.

As an extra precaution you could add the following to the Makefile:

ifndef IRVSPDATA
$(warning Warning - Environment variable 'IRVSPDATA' has not been set.)
endif

You could also promote the warning into an error, which would cause make to fail. But I don’t think it’s of that much help. (It’s also not fail-proof, as a user can silence the warning by issuing the command make IRVSPDATA=0, leaving the actual environment variable unchanged.)

IMO, the conditional compilation would make sense if you attempt to insert the directory path directly into your Fortran code. This would however require a rebuild if you’d like to change the folder. To pass the path you could do make IRVSPDATA=$(pwd) or use the exported environment variable. In case you’d want to insert the path directly, you’d also need to quote the preprocessor define flag -D'IRVSPDATA="$(IRVSPDATA)"', and the string it contains. Personally, I think the runtime version is adequate.

No, I forgot that this is a custom function that needs to be inserted into the module/project where it’s needed.

Do you mean to set the variable to a fixed value directly in the FORTRAN code and then compile it conditionally?

This is obviously not a good practice and should be avoided.

As described below, is my understanding of this method correct?

  1. This method requires make IRVSPDATA=$(pwd) or the exported environment variable.
  2. However, as you mentioned earlier, it seems that using this preprocessor define flag in the question discussing here is a little clumsy and a bit awkward.

Do you mean the following method:

  1. Get rid of the conditional compilation logic entirely, and also remove the -DIRVSPDATA flag in the Makefile. In short, just leave the following line:
call get_environment_variable('IRVSPDATA',spgpath)
  1. Add the following extra precaution to the Makefile:
ifndef IRVSPDATA
$(warning Warning - Environment variable 'IRVSPDATA' has not been set.)
endif

BTW, where should the above code fragment be placed in the Makefile?

  1. Run make as follows:

$ make IRVSPDATA=$(pwd)
or
$ make

Regards,
HZ

Yes and yes.

  1. Yes, however you should keep the runtime check if ˙IRVSPDATA˙ is set and exit gracefully if not.
  2. Yes, but not so critical if you issue a runtime error message from your program. (You just don’t want the error to be issued once part of your calculations have already been done; you should try to exit as early as possible.)
  3. Since the IRVSPDATA variable is technically not used in the Makefile, you could place it anywhere (apart from inside a Make rule). I’d place it between lines 13-15 and also add a comment to what’s the purpose of it.
  1. Based on your earlier suggestion here, I adopted the following code snippet:
call get_environment_variable('IRVSPDATA',spgpath,status=stat)
if (stat == 1) then
   write(6,*) "Environment variable 'IRVSPDATA' must be provided."
   stop
end if

But the above code snippet will trigger the following compiling error:

$ make clean
$ module load compiler mkl mpi/2021.4.0
$ make
ifort   -c -g -traceback  -fpp nonsymm.f90
nonsymm.f90(180): error #6404: This name does not have a type, and must have an explicit type.   [STAT]
call get_environment_variable('IRVSPDATA',spgpath,status=stat)
---------------------------------------------------------^
nonsymm.f90(180): error #6362: The data types of the argument(s) are invalid.   [GET_ENVIRONMENT_VARIABLE]
call get_environment_variable('IRVSPDATA',spgpath,status=stat)
---------------------------------------------------------^
compilation aborted for nonsymm.f90 (code 1)
make: *** [Makefile:98: nonsymm.o] Error 1
  1. I’ve disabled the following Warning message triggered from Makefile:
ifndef IRVSPDATA
$(warning Warning - Environment variable 'IRVSPDATA' has not been set.)
endif

Regards,
HZ

It looks like you didn’t add the variable declaration, at line 172 add:

      CHARACTER(len=180) :: spgpath, spgfile   ! line 171
      INTEGER            :: stat               ! line 172

Apologies for not mentioning this earlier. (You could also name the integer spgstat for consistency.)

Great. The following code snippet does the trick:

INTEGER            :: spgstat
[...]
call get_environment_variable('IRVSPDATA',spgpath,status=spgstat)
if (spgstat == 1) then
   write(6,*) "Environment variable 'IRVSPDATA' must be provided."
   stop
end if

See here for the complete code snippet adopted now.

Based on the current test, the above method gracefully solves this problem:

$ module load irvsp/master
$ unset IRVSPDATA 
$ irvsp_v2 -sg 158
 Current command : irvsp_v2 -sg 158                                  
 Argument count :            2
You can choose another version by inputing a version number (nv):
###$: irvsp -v $vn 
### 2022-03-18  08:43:57

 unknown   
   P  lattice
 Non-symmorphic crystal without inversion symmetry
 Complex eigenfunctions
 No spin-orbit eigenfunctions
 No spin-polarization


 Transformations:
 Direct lattice vectors in Cartesian coord. system (BR2)
        t1 :            t2 :            t3 : 
      6.12000000     -3.06000000      0.00000000
      0.00000000      5.30007547      0.00000000
      0.00000000      0.00000000      5.65800000

 Reciprocal lattice vectors in Cartesian coord. system (BR4)
      0.16339869      0.09433828      0.00000000 : g1/2pi
      0.00000000      0.18867656      0.00000000 : g2/2pi
      0.00000000      0.00000000      0.17674090 : g3/2pi
 Environment variable 'IRVSPDATA' must be provided.

Best,
Hongyi

A possible explanation of the odd logic is that the developer only wanted the program compiled by someone who had run the configure script and had the environment variable therefore set (making a lot of assumptions, like the user is always using the bash shell, that the data directory is the same location the user built the program in, and so on … definitely not a robust way to ensure it). So the logic is probably just an unusual way (but I have seen it before) to warn someone compiling the program to build it in a specific way. The suggestions above put you on a better path, or simply reading the variable and if it is not set providing a warning and stopping would probably provide the originally desired effect; but I think that is very likely the purpose of the original code; basically — “only build this using the provided build method, which will try to change your environment so the code runs as designed”.

We should change the target user environment in a controlled way, not based on the black box operation of modifying multiple user profiles, which may cause many unexpected problems in the future.

Regards,
HZ

Totally agree; and the advice given above provides several more robust methods. I have seen similar constructs, sometimes even used by a single user-developer of a code, as a reminder to build a code a particular way. Making the code itself more general and robust as you are doing is indeed better as a long-term approach.