Building a Fortran compiler in pure Fortran (Mega thread)

This thread is to continue the discussion from the following thread :

Other Links :

Please feel free to participate.

@RonShepard
How would this affect interoperability with libraries and packages written in C, for example?
How would one get around this?

Presumably one would simply use the ABI/calling convention of an existing C compiler. GCC would probably be the conventional choice, but any would work so long as an implementation of the ISO_Fortran_binding.h header (which declares the CFI_ descriptor types and functions) was made available.
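
To make the interoperability point concrete, here is a minimal sketch of calling a C library routine through the standard iso_c_binding module, which relies on the companion C compiler's calling convention. The module name c_interop_demo and the wrapper fortran_strlen are hypothetical names chosen for illustration; only strlen itself comes from the C library.

```fortran
! Minimal sketch: call the C library's strlen from Fortran.
! c_interop_demo and fortran_strlen are hypothetical names.
module c_interop_demo
   use, intrinsic :: iso_c_binding, only: c_char, c_size_t, c_null_char
   implicit none
   private
   public :: fortran_strlen

   interface
      ! C prototype: size_t strlen(const char *s);
      function c_strlen(s) bind(C, name="strlen") result(n)
         import :: c_char, c_size_t
         character(kind=c_char), intent(in) :: s(*)
         integer(c_size_t) :: n
      end function c_strlen
   end interface

contains

   ! Return the length of text as seen by C, trailing blanks included.
   function fortran_strlen(text) result(n)
      character(len=*), intent(in) :: text
      integer :: n
      ! C expects a NUL-terminated string, so append c_null_char.
      n = int(c_strlen(text // c_null_char))
   end function fortran_strlen

end module c_interop_demo
```

A compiler written in Fortran would have to emit calls that match this convention for bind(C) procedures, which is why reusing an existing C ABI is the simplest route.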

@Aurelius_Nero which part of the compiler do you want to write in Fortran? Here are the parts:

  • parser,
  • semantic analysis,
  • optimizations on the intermediate representation,
  • code generation to LLVM or even directly to machine code.

Some of it, all of it?

If you want to have a pure Fortran frontend, you can for example generate LFortran’s intermediate representation (ASR) and reuse everything else.

If you or anybody else is interested in discussing this, I am happy to have a phone call. I recommend we join forces.

Given that I am still new to this, I would like to read up more before jumping in. I am happy to join forces with you.

  1. I would start with the parser and semantic analysis.
  2. I would like to see if it’s also possible to write SymEngine in pure Fortran as well.
  3. Let me study up more on compilers and then we can arrange a call.

Thanks

For that, you need to rewrite the following files to pure Fortran:

The current status is that the free-form parser, tokenizer and prescanner are probably “beta” quality (or close to it), the fixed-form tokenizer/prescanner is “alpha”, and the C preprocessor is “alpha”. The semantics are almost “beta” for some subsets (such as “Minpack”), “alpha” for others (such as “stdlib”) and a prototype for some other stuff.

The idea of writing a parser or tokenizer in Fortran isn’t new:

I believe that there are many projects around Fortran that would benefit from a pure Fortran compiler, or at least parts of it.
Perhaps it would be most appropriate to begin with these parts. If they turn out well, they can be used to build a whole compiler later.
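
As a rough illustration of what such a standalone part might look like, here is a minimal tokenizer sketch in modern Fortran. It is not taken from any existing project: the module name tiny_tokenizer, the routine next_token and the tok_* constants are all hypothetical, and it only recognizes names, unsigned integers and single-character symbols. Real Fortran lexing (string literals, real literals, continuation lines, fixed-form blanks) is much harder.

```fortran
! Minimal sketch of a tokenizer; all names here are hypothetical.
module tiny_tokenizer
   implicit none
   private
   public :: next_token, tok_eof, tok_name, tok_number, tok_symbol

   integer, parameter :: tok_eof = 0, tok_name = 1, tok_number = 2, tok_symbol = 3
   character(len=*), parameter :: letters = &
      'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
   character(len=*), parameter :: digits = '0123456789'

contains

   ! Scan line from position pos; return the token kind and text,
   ! and advance pos to just past the token.
   subroutine next_token(line, pos, tkind, text)
      character(len=*), intent(in) :: line
      integer, intent(inout) :: pos
      integer, intent(out) :: tkind
      character(len=:), allocatable, intent(out) :: text
      integer :: start

      ! Skip blanks between tokens.
      do while (pos <= len(line))
         if (line(pos:pos) /= ' ') exit
         pos = pos + 1
      end do

      if (pos > len(line)) then
         tkind = tok_eof
         text = ''
         return
      end if

      start = pos
      if (index(letters, line(pos:pos)) > 0) then
         ! Name: a letter followed by letters, digits or underscores.
         tkind = tok_name
         do while (pos <= len(line))
            if (index(letters // digits // '_', line(pos:pos)) == 0) exit
            pos = pos + 1
         end do
      else if (index(digits, line(pos:pos)) > 0) then
         ! Unsigned integer literal.
         tkind = tok_number
         do while (pos <= len(line))
            if (index(digits, line(pos:pos)) == 0) exit
            pos = pos + 1
         end do
      else
         ! Anything else is treated as a one-character symbol.
         tkind = tok_symbol
         pos = pos + 1
      end if
      text = line(start:pos - 1)
   end subroutine next_token

end module tiny_tokenizer
```

Calling next_token repeatedly on a line such as "x1 = 42", starting with pos = 1, yields a name, a symbol and a number before reporting tok_eof. Deferred-length allocatable strings make this kind of code far less awkward than it was in Fortran 77.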

So I am heading down the right path, then, by picking the parser and semantic analysis first?

Perhaps a project to spin off from this is an MIT-licensed Fortran version of the intrinsics. Any new compiler such as LFortran might benefit; users who find the intrinsics unsuitable would have a resource for starting a custom Fortran version; and if they are developed sufficiently, existing compilers (which often call C functions) could make them available, just for starters.

Unlike adding new procedures as in stdlib, the interfaces are already designed, and many old procedures already exist (although the copyright status of many might be questionable or an issue).

Recent discussions in this forum on the Bessel functions and sum() have shown the potential value. Back before C several vendors supplied in their manuals the intrinsics and implementations in Fortran in great detail. CDC, Cray, and IBM did this for starters. Not just the code but the descriptions were great examples for programmers, particularly before on-line resources were readily available.

I still have a DIY collection of most of the intrinsics with range checking and other features added, but the pedigree is such that those would not be suitable for release. There might be a number of them in netlib or other sources to start as seeds, though. If the descriptions of the development of the procedures as well as the code were available, I think it would be a great resource. It would be a terrific test bed for trying emerging features such as templating and parallel algorithms using coarrays, it would be something Fortran-only programmers could contribute to, and it would further several ongoing projects and potentially even existing compilers as well as the one described here.

For a new programmer, being able to see a trivial implementation of something like SUM(), where you just loop through and add elements of the type you are interested in; then some versions conditioning the data or correcting for accumulated error; then how that can be made generic; and then a parallel version would have immense value in my opinion, and would contribute to this project as well.
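
The progression described above might start something like the sketch below: a trivial loop, made generic over two real kinds, followed by a version that corrects for accumulated rounding error using Kahan (compensated) summation. The module and procedure names (sum_demo, naive_sum, kahan_sum) are hypothetical, and this is not how any particular compiler implements SUM().

```fortran
! Minimal sketch of SUM-like procedures; all names are hypothetical.
module sum_demo
   use, intrinsic :: iso_fortran_env, only: sp => real32, dp => real64
   implicit none
   private
   public :: naive_sum, kahan_sum

   ! One generic name that resolves on the kind of the argument.
   interface naive_sum
      module procedure naive_sum_sp, naive_sum_dp
   end interface naive_sum

contains

   ! Trivial version: loop through and add the elements.
   pure function naive_sum_sp(x) result(s)
      real(sp), intent(in) :: x(:)
      real(sp) :: s
      integer :: i
      s = 0.0_sp
      do i = 1, size(x)
         s = s + x(i)
      end do
   end function naive_sum_sp

   pure function naive_sum_dp(x) result(s)
      real(dp), intent(in) :: x(:)
      real(dp) :: s
      integer :: i
      s = 0.0_dp
      do i = 1, size(x)
         s = s + x(i)
      end do
   end function naive_sum_dp

   ! Kahan summation: c carries the low-order bits lost in each addition.
   pure function kahan_sum(x) result(s)
      real(dp), intent(in) :: x(:)
      real(dp) :: s, c, y, t
      integer :: i
      s = 0.0_dp
      c = 0.0_dp
      do i = 1, size(x)
         y = x(i) - c        ! apply the correction from the previous step
         t = s + y           ! low-order bits of y may be lost here
         c = (t - s) - y     ! recover what was lost
         s = t
      end do
   end function kahan_sum

end module sum_demo
```

One caveat worth documenting in such a library: aggressive floating-point optimization flags that allow reassociation can simplify the compensation term away, so the compensated version should be built without them.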

It seems to be the low-hanging fruit that has immediate value of its own.

Personally, I think some of those old manuals were a terrific resource for learning about numeric methods and the perils of floating point operations that I sorely miss.

Things like this were available from multiple sources …

::::::::::::::
src/dasin.f
::::::::::::::
      double precision function dasin (x)
c may 1980 edition.   w. fullerton, c3, los alamos scientific lab.
      double precision x, asincs(39), pi2, sqeps, y, z, dcsevl,
     1  d1mach, dsqrt
      external d1mach, dcsevl, dsqrt, initds
c
c series for asin       on the interval  0.          to  5.00000e-01
c                                        with weighted error   1.62e-32
c                                         log weighted error  31.79
c                               significant figures required  30.67
c                                    decimal places required  32.59
c
      data asincs(  1) / +.1024639175 3227159336 5731483057 85 d+0     /
      data asincs(  2) / +.5494648722 1245833306 0111959029 24 d-1     /
      data asincs(  3) / +.4080630392 5449692851 3070561492 46 d-2     /
      data asincs(  4) / +.4078900685 4604435455 5988239056 12 d-3     /
      data asincs(  5) / +.4698536743 2203691616 0485301362 18 d-4     /
      data asincs(  6) / +.5880975813 9708058986 4543855520 74 d-5     /

I think some of the old manuals might still be available in scanned form and might act as a model for documentation, and so on. How many people know of Fortran sources for the intrinsics that are available under an open-source license?

The testing harnesses that would have to emerge by themselves would be an additional invaluable resource and example as well.

I could try looking for them. What do they look like? Or did they have particular names?

Cannot remember the names. Since many of the intrinsics were at least originally provided to allow operations otherwise not possible in pure Fortran, the mathematical functions are the easiest to attack. Recent additions to Fortran like optional arguments, class(*), and generics make another set possible now that a user would have been hard pressed to create otherwise in the past.

Well, interfacing to C is also a relatively recently standardized feature for which good examples would be useful.

For the foreseeable future, generating machine code directly is probably not where things want to go, at least initially.

I hope to check an internal library soon, as “we keep everything” is a local motto; but will probably not get a chance till next week to get some manual names that way.

Yes, the runtime library should be maintained in Fortran, as we do in LFortran. There has to be a small C layer to interface with the system, and then a larger Fortran layer over that, implementing all of the ~200 intrinsic functions. You can use intrinsics that you already defined, but obviously you have to be careful to avoid cyclic dependencies. Some intrinsics are truly intrinsic, like size or len, which have to be implemented by the compiler itself, but many other intrinsics can be implemented as a library.

If anyone wants to help with that, I am happy to make it a standalone project.

See also Fortran runtime math library, where I proposed to maintain Fortran versions of the intrinsic functions as a community. Back then it wasn’t received very enthusiastically, but based on this thread I think that now is a better time.

I think that is a good goal that would be supported by the Fortran programmer community. I could contribute to this kind of project much more easily than to a more complicated compiler project.

We have some experience of writing compilers in Fortran.

i. The entire MPS10 simulation language compiler was written in Fortran in 1982-3. This was a domain-specific language for real-time simulation on a small, highly specialised parallel computer.

ii. We wrote the front-end of the ADSIM compiler in Fortran. Again, ADSIM is a domain-specific simulation language but with most of the syntax derived from Fortran. This was in 1988.

iii. fpt is written almost entirely in Fortran. (There is a third-party C component for command-line editing, and the Linux version calls realpath. All else is Fortran.) fpt is under active development (Fortran keeps changing ;-). fpt contains all of the lexical and static semantic analyses of a full compiler, but the instruction selector is replaced by code to re-engineer and re-write the code.

So it can be done, and mostly has been. We have found almost nothing that we can’t do in modern Fortran. Yet.

Amazing :)

The runtime library is here: lfortran/src/runtime at main · lfortran/lfortran · GitHub. For now, anyone is welcome to contribute there. We can make it a separate project, but we would still need to ensure it remains usable with LFortran, so that we can use it. The exact boundary between what is implemented directly in the compiler and what is a library function might change, so you will have to work with us. See also How to contribute to LFortran's runtime library, where we determined that the sqrt function should probably be moved from the library into the compiler.

@Aurelius_Nero - I have tried to contact you through the forum to reply but my messages have failed. My e-mail address is john.collins@simconglobal.com. Please mail me to discuss fpt.

John

Reviving this topic with a related find from Lawrence Radiation Laboratory (LRL) from 1964:

LRL FORTRAN-FORTRAN (Technical Report) | OSTI.GOV

The abstract reads:

Investigation showed that the FORTRAN language is sufficiently machine independent to allow the FORTRAN compiler to be written in FORTRAN. The output of the resultant compiler is inherently machine independent and may be efficiently directed to other machines by a suitable translation or assembly process. The operation of the compiler is described. (D.C.W.)

LRL is now known as Berkeley Lab. I found a related report in the Berkeley Library catalogue, but only a tiny preview is visible without an account:

… FORTRAN written in FORTRAN. The impetus behind this study was a local need to move rapidly and efficiently from one machine to another…

I have a paper copy of the user documentation for the CIVIC compiler - which was the LRLTRAN compiler for the Cray machines. It has quite a few extensions to Fortran 77 - including a preprocessor facility, ways of defining/accessing data structures, pointers for based arrays/structures (which the Cray Pointer extension was taken from), a bit data type, and so on.

The LRLTRAN compiler was written in itself. And much of the DOE LTSS and CTSS time-sharing systems was written in LRLTRAN.

One of the many things I need to see if Al Kossow can scan for bitsavers - before my wife throws it all in a dumpster…
