Newly registered and wanted to share with you my experience in building stdlib:
OS: Ubuntu Gnome desktop 22.04.5 LTS
CPU: 2x AMD Epyc 7343. 512GB ram.
nvfortran: 24.11-0 (supports my 1660Super gfx card - Turing arch)
Nvidia env setup via modules.
Python3.10 env for fypp setup with venv.
CMAKE_INSTALL_PREFIX=/home/bob/local/Fortran_stdlib/stdlib
After cloning the stdlib Git repo into stdlib_src I ran from that directory: cmake --fresh -B build -DCMAKE_INSTALL_PREFIX=/home/bob/local/Fortran_stdlib/stdlib
and experienced no problems.
Followed by: cmake --build build
with no incident.
…and: cmake --build build --target test
which ran through all tests without an error, in just over 32s.
I’m surprised, to say the least. stdlib uses several Fortran 2008 features that nvfortran does not yet support, AFAIK. I have nvfortran 24.11 on one machine and 25.5 on another, and with neither have I been able to build stdlib: several modules fail to compile because of those features.
Sorry to play the skeptic, but are you sure the default GNU compilers did not take over when you compiled? Even if you load the Nvidia hpc_sdk module, you most probably have GNU 11 by default, and if you did not specify the compilers with -DCMAKE_Fortran_COMPILER=nvfortran -DCMAKE_C_COMPILER=nvcc -DCMAKE_CXX_COMPILER=nvcc, it is likely that you actually compiled with gfortran/gcc instead of nvfortran/nvcc.
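One compiler-independent way to check which compiler actually built a given source file is to ask the compiler itself. A minimal sketch, assuming the compiler implements the Fortran 2008 inquiry functions compiler_version and compiler_options from iso_fortran_env (the program name is made up for illustration):

```fortran
program which_compiler
  ! compiler_version/compiler_options are standard F2008 inquiry functions;
  ! the strings they return are compiler-specific.
  use, intrinsic :: iso_fortran_env, only: compiler_version, compiler_options
  implicit none

  print '(a)', 'Compiled by: ' // compiler_version()
  print '(a)', 'Options:     ' // compiler_options()
end program which_compiler
```

Built with whatever CMAKE_Fortran_COMPILER resolved to, this should print an identification string for that compiler, so a GNU string there would confirm that gfortran took over.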
cmake -B build -DCMAKE_Fortran_COMPILER=nvfortran -DCMAKE_C_COMPILER=nvcc -DCMAKE_CXX_COMPILER=nvcc -DCMAKE_INSTALL_PREFIX=/home/bob/local/Fortran_stdlib/stdlib --fresh
-- The Fortran compiler identification is NVHPC 24.11.0
-- The C compiler identification is GNU 11.4.0
-- Detecting Fortran compiler ABI info
-- Detecting Fortran compiler ABI info - done
-- Check for working Fortran compiler: /opt/nvidia/hpc_sdk/Linux_x86_64/24.11/compilers/bin/nvfortran - skipped
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /opt/nvidia/hpc_sdk/Linux_x86_64/24.11/compilers/bin/nvcc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Performing Test WITH_CBOOL
-- Performing Test WITH_CBOOL - Success
-- Performing Test WITH_QP
-- Performing Test WITH_QP - Failed
-- Performing Test WITH_XDP
-- Performing Test WITH_XDP - Failed
-- Performing Test f18errorstop
-- Performing Test f18errorstop - Success
-- Performing Test f03rank
-- Performing Test f03rank - Failed
-- Performing Test f03real128
-- Performing Test f03real128 - Failed
-- Searching for external BLAS/LAPACK
-- Looking for Fortran sgemm
-- Looking for Fortran sgemm - not found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
-- Looking for Fortran sgemm
-- Looking for Fortran sgemm - found
-- Found BLAS: /usr/lib/x86_64-linux-gnu/libopenblas.so
-- Looking for Fortran cheev
-- Looking for Fortran cheev - found
-- Found LAPACK: /usr/lib/x86_64-linux-gnu/libopenblas.so;-lm;-ldl
-- Found external BLAS: /usr/lib/x86_64-linux-gnu/libopenblas.so
-- Found external LAPACK: /usr/lib/x86_64-linux-gnu/libopenblas.so;-lm;-ldl
-- Using standard 32-bit integer interface
-- test-drive: Find installed package
-- The CXX compiler identification is GNU 11.4.0
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /opt/nvidia/hpc_sdk/Linux_x86_64/24.11/compilers/bin/nvcc - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Configuring done
-- Generating done
-- Build files have been written to: /home/bob/local/Fortran_stdlib/stdlib_src/build
2nd run:
[ 15%] Building Fortran object src/CMakeFiles/fortran_stdlib.dir/stdlib_error.f90.o
NVFORTRAN-S-0034-Syntax error at or near identifier rank (/home/bob/local/Fortran_stdlib/stdlib_src/build/src/stdlib_error.f90: 382)
NVFORTRAN-S-0034-Syntax error at or near end of line (/home/bob/local/Fortran_stdlib/stdlib_src/build/src/stdlib_error.f90: 383)
NVFORTRAN-S-0034-Syntax error at or near end of line (/home/bob/local/Fortran_stdlib/stdlib_src/build/src/stdlib_error.f90: 385)
NVFORTRAN-S-0034-Syntax error at or near identifier default (/home/bob/local/Fortran_stdlib/stdlib_src/build/src/stdlib_error.f90: 387)
0 inform, 0 warnings, 4 severes, 0 fatal for appendr
gmake[2]: *** [src/CMakeFiles/fortran_stdlib.dir/build.make:996: src/CMakeFiles/fortran_stdlib.dir/stdlib_error.f90.o] Error 2
gmake[1]: *** [CMakeFiles/Makefile2:2455: src/CMakeFiles/fortran_stdlib.dir/all] Error 2
gmake: *** [Makefile:146: all] Error 2
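The syntax errors near "rank" and "default" are consistent with the failed f03rank configure check above. A minimal sketch of the kind of construct involved, assuming the failing lines use the Fortran 2018 select rank construct on an assumed-rank dummy (I have not inspected line 382 of the generated file; the program and routine names are hypothetical):

```fortran
program rank_check
  implicit none
  real :: s, v(3), m(2, 2)

  s = 1.0
  v = 1.0
  m = 1.0
  call show(s)
  call show(v)
  call show(m)

contains

  subroutine show(x)
    real, intent(in) :: x(..)      ! assumed-rank dummy argument (F2018)

    select rank (x)
    rank (0)
      print *, 'scalar'
    rank (1)
      print *, 'rank-1 array of size', size(x)
    rank default
      print *, 'array of rank', rank(x)
    end select
  end subroutine show

end program rank_check
```

A compiler without select rank support would be expected to reject this at the select rank and rank default lines, which makes it a quick standalone probe before attempting the full stdlib build.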
My understanding is that nvfortran is under minimal maintenance, as Nvidia is investing heavily in Flang (see “Welcome to Flang’s documentation” on the Flang Compiler site), which I imagine will replace it once it reaches production readiness.
If you want to do GPU programming and also use (and maybe help improve?) stdlib, you could consider installing the GNU offloading extension instead. You might not get the top-notch performance of the vendors’ compilers, but you would at least get the gist of the craft.
Thanks for the link, that was an interesting read.
So my options are, basically:
CUDA Fortran compiled with nvfortran, which lets me write CUDA kernels but does not support the features stdlib needs. (There was something about NVHPC v24.11 being the last release to support my hardware, so I haven’t even tried any later releases.)
gfortran with the GPU offloading extension, which can offload loops to the GPU and supports stdlib as mentioned above, but gives no ability to code CUDA kernels.
For now, I believe both options are fine. Coding isn’t my day job; PCB layout and database management are.
I guess you can mix gfortran (to have the recent additions to the standard) and nvfortran (to write GPU kernels and wrappers to them). The catch is that you are then restricted to pure F77-style interfaces between them.
What is missing from the standard, in my opinion, is a standardized ABI that would allow modern Fortran interfaces between different compilers. In practice, that would mean standardized .mod files. It could be optional, enabled with a bind(Fortran) specifier.
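To illustrate what a pure F77-style boundary looks like, here is a minimal sketch with a hypothetical routine scale_f77. It uses only scalars and an explicit-shape array, so no module file and no array descriptor crosses the boundary. Both parts sit in one file here so the example is self-contained; in a mixed build they would be separate files, and symbol naming and runtime libraries would still have to be compatible between the two compilers, which I am assuming rather than claiming.

```fortran
! Callee: imagine this part compiled with nvfortran (hypothetical wrapper).
subroutine scale_f77(n, a, x)
  implicit none
  integer, intent(in)    :: n
  real,    intent(in)    :: a
  real,    intent(inout) :: x(n)   ! explicit shape: passed as a plain address
  integer :: i

  do i = 1, n
    x(i) = a * x(i)
  end do
end subroutine scale_f77

! Caller: imagine this part compiled with gfortran.
program caller_f77
  implicit none
  external :: scale_f77            ! implicit interface, no .mod file involved
  real :: x(4)

  x = [1.0, 2.0, 3.0, 4.0]
  call scale_f77(size(x), 2.0, x)
  print *, x
end program caller_f77
```

The price is that assumed-shape arrays, optional arguments, derived types with allocatable components, and anything else that needs an explicit interface cannot appear in the argument list.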
The bind(c) approach works for using multiple Fortran compilers in the same project, except for array descriptors, which have an API but not an ABI. One could write adaptors per compiler, as suggested here: blog/Dealing_with_imperfect_Fortran_compilers_2.md at main · jeffhammond/blog · GitHub, but it would be better to settle on an ABI for this, as for the rest of the bind(c) features.
One can imagine adding more Fortran features into this bind(c) subset.
There is some tiny overhead at the API boundary, as the Fortran compiler must convert its internal representation to the ABI-stable C interface, but this is made explicit by the programmer using bind(c) in the function declaration. The data is not copied, but the descriptor might be. That seems acceptable.
The bind(Fortran) approach might work in a similar way. Possibly it might not even be needed if bind(c) is powerful enough.
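As a rough sketch of the bind(c) route, the following uses a hypothetical routine scale_c whose interface contains only C-interoperable scalars and an explicit-shape array, so nothing compiler-specific crosses the boundary. Callee and caller share one file here only to keep the example self-contained; the assumption is that in practice each side would be built by a different compiler.

```fortran
! Callee side: imagine this compiled with compiler A.
subroutine scale_c(n, a, x) bind(c, name='scale_c')
  use, intrinsic :: iso_c_binding, only: c_int, c_float
  implicit none
  integer(c_int), value         :: n
  real(c_float),  value         :: a
  real(c_float),  intent(inout) :: x(n)   ! explicit shape: plain C pointer

  x = a * x
end subroutine scale_c

! Caller side: imagine this compiled with compiler B.
program caller_c
  use, intrinsic :: iso_c_binding, only: c_int, c_float
  implicit none

  interface
    subroutine scale_c(n, a, x) bind(c, name='scale_c')
      import :: c_int, c_float
      integer(c_int), value         :: n
      real(c_float),  value         :: a
      real(c_float),  intent(inout) :: x(n)
    end subroutine scale_c
  end interface

  real(c_float) :: x(3)

  x = [1.0_c_float, 2.0_c_float, 3.0_c_float]
  call scale_c(size(x, kind=c_int), 2.0_c_float, x)
  print *, x
end program caller_c
```

Declaring the dummy as assumed-shape, x(:), in a bind(c) interface would instead pass a C descriptor, and that is exactly the piece whose layout differs between vendors.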
Why there has been no effort to come up with a standardized module format is still a puzzle to me. Personally, I would place a higher priority on adding a standardized module format to the language than on about 99% of the things that were added in F23 or are proposed for F2Y. The time wasted building compiler-specific versions of Fortran libraries just because the module formats are incompatible alone justifies some effort to standardize formats. I think a standard format could easily coexist with native module formats and act as a “bridge” between compilers, in the same way that coordinate-based sparse matrix storage formats act as a bridge between other storage formats in some (most) sparse matrix packages.
While nice, I lack confidence that the (commercial) compiler developers would view it as a priority.
As we’ve already discussed in this thread, the Nvidia compiler is not usable for proper modern Fortran. (It also does not support coarray syntax on a single image, as gfortran does.)
Flang-new will get there (it at least gives not-implemented error messages rather than mysteriously failing to compile), but it is a long way from being fully ready in my view and experience (I spent three days with the AMD flang-new last month).
Intel does best in my experience, but is of course not actually useful for Nvidia/AMD GPUs.
So if they don’t even implement existing (and long-existing) features, why would they implement something that makes it easier not to use their compiler?
The ABI includes aspects like the processor instruction set and calling conventions. With the multitude of machines on the market, I don’t think this is something the Fortran committee can (or is supposed to) tackle.
and ways to pass them suggestions. The main obstacle I see is that currently all vendors use their own array descriptor. There are also other obstacles, such as allocatable arrays:
subroutine foo(a)
real, allocatable :: a(:)
! ...
end subroutine
To make allocation and deallocation work, the compilers also need to share the runtime library. So there are many things which have to be discussed, and the reality is the majority of Fortran programmers don’t care how these things are implemented.
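To make the runtime-sharing point concrete, here is a small self-contained sketch (module and program names are hypothetical). Memory allocated in the caller is freed in the callee, and memory allocated in the callee is freed in the caller, so if the two units came from different compilers, both runtimes would have to agree on the allocator and on the descriptor that records the allocation status.

```fortran
module alloc_demo
  implicit none
contains
  subroutine foo(a)
    real, allocatable, intent(inout) :: a(:)

    ! Frees memory that was obtained by the caller ...
    if (allocated(a)) deallocate(a)
    ! ... and hands back memory obtained by the callee.
    allocate(a(5))
    a = 0.0
  end subroutine foo
end module alloc_demo

program alloc_main
  use alloc_demo, only: foo
  implicit none
  real, allocatable :: a(:)

  allocate(a(2))        ! allocated on the caller's side
  call foo(a)           ! reallocated inside foo
  print *, 'size after call:', size(a)
  deallocate(a)         ! caller frees what the callee allocated
end program alloc_main
```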
Flang made an interesting decision: its module file is a stripped-down version of the original module source. Taking the stdlib_array module as an example, after compilation with flang I get:
$ flang-new -c stdlib_array.f90
$ cat stdlib_array.mod
!mod$ v1 sum:ee48f2a609b77219
module stdlib_array
private::logicalloc
contains
pure function trueloc(array,lbound) result(loc)
logical(4),intent(in)::array(:)
integer(4),intent(in),optional::lbound
integer(4)::loc(1_8:int(count(array),kind=8))
end
pure function falseloc(array,lbound) result(loc)
logical(4),intent(in)::array(:)
integer(4),intent(in),optional::lbound
integer(4)::loc(1_8:int(count(.NOT.array),kind=8))
end
pure subroutine logicalloc(loc,array,truth,lbound)
integer(4),intent(out)::loc(:)
logical(4),intent(in)::array(:)
logical(4),intent(in)::truth
integer(4),intent(in),optional::lbound
end
end
The hermetic module file option looks interesting. Sorta like what I sometimes do in having a “master” mod file that just contains USE statements for all the modules in the program. Of course you have to be very careful to avoid name conflicts between the various modules, but I do that as a matter of personal coding standards. Once something has a name in one module, that name is not repeated in other parts of the code.
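For anyone who has not seen the “master” module pattern, a minimal sketch follows. All module and entity names are made up for illustration; the only requirement is that public names are unique across the collected modules.

```fortran
module geometry
  implicit none
  private
  public :: circle_area
contains
  pure function circle_area(r) result(area)
    real, intent(in) :: r
    real :: area
    real, parameter :: pi = 3.14159265

    area = pi * r**2
  end function circle_area
end module geometry

module units
  implicit none
  private
  public :: mm_per_inch
  real, parameter :: mm_per_inch = 25.4
end module units

! The "master" module only re-exports what the other modules make public.
module master
  use geometry
  use units
  implicit none
  public
end module master

program demo
  use master            ! one USE statement pulls in everything
  implicit none

  print *, circle_area(1.0), mm_per_inch
end program demo
```

If two of the collected modules exported the same name, any reference to that name through master would be ambiguous, which is why the naming discipline matters.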
Agree, and this would go against the “hardware agnostic” principle that drives the standard. There could be recommendations, though, which would not be part of the standard. And the committee still looks the most appropriate place to issue such recommendations.
Probably because the module approach can potentially enable hardware dependent optimizations, for instance by specifying that a given argument must be passed in a register rather than on the stack.
This is something I’ve wondered about for decades. Historically, Fortran programs have required the same compiler for the whole program. Certainly now, because of modules, optional arguments, assumed-shape arguments, and so on, there are many low-level code conventions and low-level data structures that must be shared throughout the code, and if those aren’t standardized somewhere, either by the language standard, or an OS standard, or by a hardware-specific standard, then interoperability among compilers is impossible. There are also I/O library conventions which make it difficult, or impossible, to write routines that can be compiled by one compiler and used by another.
Yet, somehow, C compilers seem to do this much better than Fortran compilers. Anyone know why? It may not always work, but it is not unusual at all for objects compiled with one C compiler to work seamlessly with other compilers. What is it that makes this difference in the languages? To what extent is this feature shared by C++ compilers, especially with code that uses the higher-level features of the language?