Hi everyone. I’m a real beginner with Fortran and have found this site a fantastic resource. My question is actually one my son asked me and I couldn’t answer, so I wondered if anyone might be able to educate me a bit!
One thing I have found useful in learning the language is writing some simple programs to help my son with his school maths homework - really simple routines to help him check his answers, that kind of thing. I showed him a simple program I wrote, which I ran on both my Linux and Windows laptops. I had originally compiled the code on the Linux (Ubuntu) machine, where the executable only takes up 18 KB. However, when I compile the very same code on the Windows machine, the file is over 2.5 MB!
There is absolutely no difference in the code and they were both compiled with the respective Windows and Linux versions of gfortran. So he asked me why one file was so much bigger than the other, and I can’t answer him. Any ideas?
@FortranFan thanks for your reply! It will definitely be interesting to hear what people think.
On Linux, I get that end.f90 executable to come out at 16.0 KB.
I’m sure there is a perfectly good explanation (I did think probably something to do with Windows not being very economical, but that’s a pure guess so I told my son I’d get the real reason from someone who knew a bit about it).
I’m using GNU Fortran (Ubuntu 11.2.0-7ubuntu2) 11.2.0 on Ubuntu 21.10 on the Linux PC, and the Windows machine runs Windows 10, but I’ll have to check the gfortran version there tomorrow as it’s my work computer. I’m thinking maybe it’s an older version of gfortran, but that would hardly account for such a massive difference.
I haven’t had a chance to investigate this hunch (and I’m not entirely sure how to go about doing so), but I suspect that some run-time/system library is being statically linked by default on Windows, but not on Linux (i.e. the compiled code for interacting with the system and running any Fortran program is included in the executable file). Just a hunch, and I could be totally wrong.
There are several different distributions of gfortran for Windows (Cygwin, Mingw64, etc.), and the following applies only to the Cygwin version 9.3 that I have on Windows 10 X64:
S:\lang>cat > end.f90
end
S:\lang>gfortran -O2 end.f90 -s
S:\lang>dir a.exe
Volume in drive S is RAMDISK
Volume Serial Number is xxxx-yyyy
Directory of S:\lang
11/29/2021 05:03 PM 8,704 a.exe
1 File(s) 8,704 bytes
0 Dir(s) 64,946,176 bytes free
The -Os option also enables -finline-functions, causes the compiler to tune for code size rather than execution speed, and performs further optimizations designed to reduce code size.
Concerning the Windows executable being larger, I think the right answer is the one from @everythingfunctional. The libgfortran runtime library is probably linked statically on Windows (see Link Options (The GNU Fortran Compiler)). I’ve done some searching but I couldn’t find the right commands needed to see all the steps taken when you call gfortran.
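For what it is worth, one way to see the individual steps is the driver's -v option, which makes gfortran print the commands it runs for each stage on standard error. Below is a minimal sketch, assuming a Unix-like shell with gfortran on the PATH. The file names end.f90, end_v and build.log are made up for illustration, and what the link line looks like will depend on the installation.

```shell
# Write the smallest possible Fortran program (just "end").
printf 'end\n' > end.f90

# -v prints every sub-command the driver runs, including the final
# link step. That output goes to stderr, so capture it in a log file.
gfortran -v end.f90 -o end_v 2> build.log

# Show the lines that mention the Fortran runtime library. How it is
# referenced (shared or static) depends on the distribution.
grep -- '-lgfortran' build.log
```

Comparing that link line between the Linux and Windows installations should show whether libgfortran is being pulled in as a shared or a static library.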
On many systems you can choose to build a program as a dynamically loaded executable or as a static or “self-contained” program.
On GNU/Linux, the command “ldd” will tell you whether your program is statically linked; if it is not, it lists all the additional libraries required to execute it, which will be loaded when you invoke it.
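As a small sketch of that check, assuming glibc's ldd (which exits with a failure status for a static executable): link_kind is a hypothetical helper name, and /bin/sh is used only because it exists on most systems.

```shell
# Classify an executable using ldd's exit status. glibc's ldd lists the
# shared libraries and succeeds for a dynamic executable; for a static
# one it reports "not a dynamic executable" and fails.
link_kind() {
    if ldd "$1" > /dev/null 2>&1; then
        echo "dynamic"
    else
        echo "static (or not something ldd understands)"
    fi
}

# /bin/sh is only a convenient example; try it on your own a.out.
link_kind /bin/sh
ldd /bin/sh
```

A missing file or a non-executable also makes ldd fail, which is why the second message is worded so cautiously.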
As mentioned, compiler options can change the size of the executable considerably. The most useful commands are “size”, “strip”, and “ldd”. I have no idea what MS Windows does, but on ULS (Unix-Like Systems) using gfortran:
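A minimal sketch of that kind of comparison, assuming gfortran and binutils are installed and that static versions of the system libraries are available (on some distributions they come in separate packages). The names hello.f90, hello_dynamic and hello_static are made up for illustration.

```shell
# A tiny program that does some I/O, so the runtime library is needed.
printf 'print *, "hello"\nend\n' > hello.f90

# Default build: on most Linux systems the runtime libraries are shared.
gfortran hello.f90 -o hello_dynamic

# -static asks the linker to copy the library code into the file itself.
# This fails if the static (.a) libraries are not installed.
gfortran -static hello.f90 -o hello_static

# Compare file sizes, section sizes, and library dependencies.
ls -l hello_dynamic hello_static
size hello_dynamic hello_static
ldd hello_dynamic
ldd hello_static

# strip removes symbol tables, which shrinks either file further.
strip hello_dynamic hello_static
ls -l hello_dynamic hello_static
```

The static file should come out much larger, since it carries its own copy of everything the dynamic one borrows at run time.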
On MS Windows, where you might want to build a program that you can then execute in any of the different MS Windows environments, it would make sense for static to be the default.
I am a strong proponent of static loading under many circumstances, but if a program uses something like MPI or PC or X11 Windows graphics, dynamic loading is almost mandatory, so the discussion of the pros and cons can get quite technical. Note, though, that there is a big difference in size between the two executables above, because actually executing a program requires a lot of system libraries for everything from loading your program to doing I/O. When static, all of that gets packed right into your executable, so it is in a lot of ways a “stand-alone” executable. In the dynamic case, the file contains a minimal amount of information, mostly about your code and what additional parts it needs to execute. When you actually execute it, in the static case your file is pretty much just loaded into the system, while in the dynamic case your program is merged with a lot of other files at the last moment (creating pretty much the equivalent of the static file, well, sort of) and then executed.
As others have already said, this is because of static vs. dynamic linking. Moreover, I think your observation is specific to the gcc installation that comes from equation.com, which does not ship with any dynamic (DLL) versions of the non-Windows libraries. As @urbanjost explains, this allows for essentially standalone executables that you can redistribute.
I’m using ldd from MSYS2.
The winlibs.com distribution allows both static and dynamic linking of the non-Windows libraries.
PS. It’s great to hear that you are learning Fortran alongside your son with maths examples. I learnt programming in a similar way by writing small programs for my school maths problems. Maths and programming complement each other really well for learning; maths provides great example problems for programming, and programming teaches a logical and algorithmic mindset for approaching mathematics.
Thanks for the replies everyone, really interesting and it looks like I have an answer for my son, as well as a nice bit of learning for myself. I was actually wondering this morning would the use of system calls be the issue. The program I mentioned in my original post uses a simple system call to clear the screen and that’s the only difference in the code between them (because of the use of ‘clear’ in the Ubuntu version, ‘cls’ in Windows), and I noticed that some of the other executables I’ve done from Fortran code in Windows are a little smaller in file size when they don’t use any system calls. But of course, correlation is not causation so that could be pure coincidence.
But now that I have read the helpful posts above by @urbanjost, @lkedward and others, the static vs. dynamic linking explanation and the role of the gcc installation make sense to me.
No prob. In the past even Unix loaded everything statically, but partly to accommodate machines with smaller memory and file systems, dynamic loading became very common. Dynamic loading allows the machine to load the same text (executable instructions) once and let several programs share it, for example.
Now that file space and memory are very much cheaper, that feature is less important. One of the big differences, though, is that if you load something statically, it takes a snapshot of the external routines as they are at that moment and places them in the file, while if something is dynamically loaded, it uses the ones available at the time you execute the program. So if the libraries have changed, you are actually running a different program than the last time you ran it.
That might be good. Maybe someone fixed a bug in that library, for example. That might be bad. Maybe someone added a bug to that library, or that library is not on a system where you want to run.
The reason I mentioned the “size” command is because the size of the file can also change depending on how you declared arrays. If they are allocatable they will rarely change the file size significantly; but there are times where what arrays your program declares can change the file size dramatically. Nowadays the compilers are generally quite clever in avoiding creating giant files just because you use giant arrays though.
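A rough sketch of the array effect, assuming gfortran and the binutils “size” command are available. The program and file names (big_init, big_alloc) are made up for illustration, and the exact numbers will vary with compiler version and options. The idea is that an array with a non-zero initial value typically has to be stored in the file, while an allocatable one only gets memory at run time.

```shell
# A statically declared array with a non-zero initial value: the
# initial contents typically end up in the executable's data section.
cat > big_init.f90 <<'EOF'
program big_init
  implicit none
  integer, save :: a(1000000) = 1
  print *, a(size(a))
end program big_init
EOF

# An allocatable array: memory is only requested when the program runs.
cat > big_alloc.f90 <<'EOF'
program big_alloc
  implicit none
  integer, allocatable :: a(:)
  allocate (a(1000000))
  a = 1
  print *, a(size(a))
end program big_alloc
EOF

gfortran big_init.f90 -o big_init
gfortran big_alloc.f90 -o big_alloc

# "size" reports the text, data and bss sections; compare the data column.
size big_init big_alloc
ls -l big_init big_alloc
```

Both programs do the same thing when run; only where the array's storage comes from differs.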
There are fancier programs like “objdump” for looking at what got put in the file. Those are things you rarely need to look at, but running something like “objdump -x a.out” can be interesting just to see all the things that actually have to go into a program, even one as simple as the “end.f90” example, just so it can execute.
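As a small sketch of what to look for, assuming GNU binutils on an ELF system such as Linux: /bin/sh is used here only as a convenient example file, and you would substitute your own a.out.

```shell
# -p prints the file's private headers. For an ELF file this includes
# the dynamic section, whose NEEDED entries name the shared libraries
# that must be loaded at run time (none appear for a static file).
objdump -p /bin/sh | grep NEEDED

# -x shows all headers at once; it is long, so look at the start only.
objdump -x /bin/sh | head -n 40
```

The NEEDED lines are essentially the same information that “ldd” works from, before the libraries are actually located on disk.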