Question about double precision on ARM processors

Gfortran’s precision-18 kind (real*10) is a real mess: a 10-byte format held in 16 bytes of storage.

I think a number of other compilers support 10-byte storage.
Salford/Silverfrost FTN95 supports REAL*10, but only in the 32-bit environment. It has KIND values similar to Nagfor’s, i.e. [1, 2, 3], which makes the REAL*10 syntax an easy and portable approach.

When I was writing my kinds.f90 program, which is available via https://homepages.ecs.vuw.ac.nz/~harper/fortranstuff.shtml,
I found that Silverfrost would not allow selected_int_kind(0), so I made the program’s lowest integer kind selected_int_kind(1). That prevented it from detecting a 4-bit integer kind even if one was available. Does Silverfrost still have that feature?
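The probing loop at the heart of such a program can be sketched like this (a minimal sketch under my own assumptions, not the actual kinds.f90; the range bounds are illustrative):

```fortran
program probe_kinds
  implicit none
  integer :: r, k, prev
  prev = -2
  ! Probe decimal ranges 0..18; selected_int_kind returns -1 when no
  ! integer kind supports that range.  Starting the loop at 0 (rather
  ! than 1) is exactly what Silverfrost reportedly rejected.
  do r = 0, 18
     k = selected_int_kind(r)
     if (k /= prev) then
        print '(a,i0,a,i0)', 'range >= ', r, ' -> kind ', k
        prev = k
     end if
  end do
end program probe_kinds
```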

Does anyone know why this was done? Was it because of some alignment constraints? Do those constraints still apply to current CPUs?

Interesting: Silverfrost supports five integer KIND values; 1 = int8, 2 = int16, 3 = int32, 4 = int64.

There is also kind=7, the memory-address size: 4 bytes for the 32-bit compiler, or 8 bytes for the 64-bit compiler. This gives a uniform kind for memory-address values when interfacing with Windows API routines.
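In standard Fortran the same address-sized integer is available portably through iso_c_binding; a minimal sketch (names assumed, not FTN95-specific):

```fortran
program addr_kind
  use iso_c_binding, only: c_intptr_t
  implicit none
  ! c_intptr_t plays the role of FTN95's kind=7: an integer wide
  ! enough to hold a memory address (4 bytes on 32-bit targets,
  ! 8 bytes on 64-bit ones).
  integer(c_intptr_t) :: handle
  print *, 'address-sized integer occupies ', &
           storage_size(handle) / 8, ' bytes'
end program addr_kind
```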

Use of kind = 4 is confusing when moving between FTN95 and Gfortran, especially if using “2_4” as a 64-bit integer constant. 64-bit integer constants are always a problem.
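A portable way around this is to take the kind from iso_fortran_env rather than hard-coding a vendor’s kind number; a minimal sketch:

```fortran
program int64_constants
  use iso_fortran_env, only: int64
  implicit none
  integer(int64) :: big
  ! Non-portable: 2_4 means a 64-bit integer under FTN95's kind
  ! numbering but a 4-byte integer under gfortran.
  ! Portable: name the kind from iso_fortran_env instead.
  big = 10000000000_int64     ! too large for a 32-bit integer
  print *, big
end program int64_constants
```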

I have not noticed any demand for 4-bit integer types.
There is no support for an n-byte unsigned type, which would have been useful for 32-bit 3-GByte addressing.

I have no idea who chose this approach, but they had little appreciation of the use of real*10.
I have little understanding of hardware support for real*10 in modern processors.
Silverfrost FTN95 supports it only in the 32-bit environment, not the 64-bit one.
When Intel introduced MMX 64-bit instructions, it effectively ended the advantage of 80-bit registers.
I do not know whether Gfortran’s real*10 is hardware-supported or software-emulated, but it is no longer viable when targeting automatic AVX efficiency in Fortran code.

I just wonder whether AVX hardware will ever support 10-, 16- or even 12-byte formats.
I am surprised and disappointed that 16-byte is not being considered.

The best guess is indeed that it optimizes alignment. I don’t think it’s a hard constraint, but rather a choice by the compiler writers.

80-bit reals stored in 128 bits would have little interest (not to say no interest at all) compared to 128-bit reals if they were software-emulated. The only reason they are available is to take full advantage of the 80-bit registers/instructions of the x86 FPU. But as you said, they are not compatible with vector instructions.

10-byte floating point was not just a feature on Intel machines. The Apple SANE library (from the late 1970s) also supported it on, if my memory is correct, Motorola, PowerPC, and Intel hardware (Standard Apple Numerics Environment - Wikipedia). I used several Fortran compilers on those machines, and I think they all supported this with REAL*10 variables that were stored compactly in 10 bytes. Those compilers included MS Fortran, Language Systems Fortran, Absoft Fortran, IBM XLF Fortran, g77, and gfortran. I’m unsure about g77, and also about whether gfortran padded the variables to 16 bytes: it might have been 10 bytes on PowerPC and 16 bytes on Intel, or maybe it was not supported on PowerPC at all. Of those machines, only Intel ever had full hardware support; on the others it was some mixture of hardware and software, so it was slow compared to 8-byte arithmetic, both scalar and vector.

I have never just declared all variables as REAL*10 in a program. I have always used REAL*10 selectively, for example, as accumulators in dot products and things like that.
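That selective use can be sketched as follows (a minimal sketch, assuming the compiler offers a kind with at least 18 decimal digits; under gfortran on x86 selected_real_kind(18) maps to the 80-bit real*10, elsewhere it may yield real128 or a negative "no such kind" value):

```fortran
program dot_ext
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  ! Extended kind for the accumulator only; the data stay double.
  integer, parameter :: ep = selected_real_kind(18)
  real(dp) :: x(3) = [1.0_dp, 2.0_dp, 3.0_dp]
  real(dp) :: y(3) = [4.0_dp, 5.0_dp, 6.0_dp]
  print *, ddot(x, y)
contains
  function ddot(x, y) result(s)
    real(dp), intent(in) :: x(:), y(:)
    real(dp) :: s
    real(ep) :: acc        ! extended-precision accumulator
    integer :: i
    acc = 0.0_ep
    do i = 1, size(x)
       acc = acc + real(x(i), ep) * real(y(i), ep)
    end do
    s = real(acc, dp)      ! round back to double only at the end
  end function ddot
end program dot_ext
```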

As for whether 10-byte variables are useful: of course there are performance tradeoffs compared with vector instructions, but extended-precision accumulation is still useful even in those cases. The choice then might be between 10-byte and 16-byte variables. If done in software, and if both REAL*10 and REAL*16 are stored in 16 bytes, then there is little incentive to use 10-byte variables. Further, in modern Fortran REAL128 is part of the iso_fortran_env intrinsic module, while REAL80 is not.

@PierU said “80-bit reals stored in 128 bits would have little interest (not to say no interest at all) compared to 128-bit reals if they were software emulated.” A while ago I found 80-bit reals very useful with gfortran, because I had an ill-conditioned problem in which 64 bits were not enough and 80-bit reals ran much faster than 128-bit. Using both 80 and 128 bits revealed that 80 was accurate enough. More recently, in another problem, the higher speed of 80 bits was useful when debugging and for exploring the effect of changing input data, but 128 bits were needed for useful results.

Generally you can get better results using double-double for ill-conditioned problems, since it gives you 106 bits of precision, vectorizes (SIMD) fairly well, and is therefore usually a good bit faster than software-emulated Float128.
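Full double-double arithmetic is built from two-sum/two-product primitives; compensated (Kahan) summation, its simplest relative, shows the idea in a few lines (a sketch, not a tuned implementation, and note that aggressive compiler optimization can defeat it):

```fortran
program kahan_demo
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  ! The small 1.0 terms are below the spacing of doubles near 1.0e16,
  ! so a plain left-to-right sum discards them; the compensation term
  ! c recovers those lost low-order bits.
  real(dp) :: x(5) = [1.0e16_dp, 1.0_dp, 1.0_dp, 1.0_dp, 1.0_dp]
  print *, 'kahan sum = ', kahan_sum(x)
contains
  function kahan_sum(x) result(s)
    real(dp), intent(in) :: x(:)
    real(dp) :: s, c, y, t
    integer :: i
    s = 0.0_dp
    c = 0.0_dp
    do i = 1, size(x)
       y = x(i) - c          ! reintroduce the error from the last step
       t = s + y
       c = (t - s) - y       ! capture the low-order bits lost in t
       s = t
    end do
  end function kahan_sum
end program kahan_demo
```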

@Harper OK, but I definitely didn’t say that 80-bit reals were not useful.

Sometimes this is the best way to determine the needed precision. Unless you are an expert in numerical analysis, it is often difficult to know ahead of time what the tradeoffs are among grid spacings, polynomial orders, matrix condition numbers, etc. So even if you never expect to use, say, real128 in production mode, it is good to have it available anyway.

@Johncampbell suggested

    double precision dp /.711D25/

On trying a program including that with the four compilers I can use: AMD flang and ifort happily compiled and ran it; gfortran also compiled and ran it, but said

Warning: GNU Extension: Old-style initialization at (1)

and g95 gave what I think is a standard-conforming result:

Error: Syntax error in data declaration at (1)

Sorry to take so long to address this problem, but I had to find my copies of McCracken’s books on FORTRAN (1961) and FORTRAN IV (1972). Neither they, nor the FORTRAN 66 standard, nor any later one mentions what looks like a cross between a declaration and a DATA statement. Hence my question: which old Fortran or FORTRAN compiler introduced the old-style initialization that gfortran mentioned?

Well, I did suggest trying “double precision dp /.711D25/” (implying D25, not E25),
but I probably should have said to try

    double precision :: dp = .711D25

or to try

    double precision dp
    data dp /.711D25/
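For reference, a minimal program combining both standard-conforming forms (just a sketch; the variable names are arbitrary):

```fortran
program init_forms
  implicit none
  ! Standard-conforming alternatives to the old-style extension
  ! "double precision dp /.711D25/":
  double precision :: a = .711D25   ! initialization in the declaration
  double precision :: b
  data b /.711D25/                  ! separate DATA statement
  print *, a, b
end program init_forms
```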

I must admit I have not used “double precision” for a long time, as I always use:

    real*8 :: dp = .711D25

Would standard-conforming compilers complain about the following?

    real :: dp = .711D25

I did a lot of early development with Pr1me FTN and Lahey LF95, which were more concerned with alerting the programmer to what .711 probably ought to mean.

Also, I was interested to read your comment
“I had an ill-conditioned problem in which 64 bits were not enough and 80-bit reals ran much faster than 128. Using both 80 and 128 revealed that 80 was accurate enough. More recently in another problem the higher speed of 80 bits was useful when debugging and for exploring the effect of changing input data, but 128 bits were needed for useful results.”

I have rarely had practical problems where such a clear outcome was possible. Most are demonstration cases, such as summing 10^9 random real numbers.

I did a test of my FE analysis in which I changed the matrix assembly and solution from real64 to real80, for a problem that had some identified precision problems (significant mixed-material stiffness).
The results were not compelling; the true answer was to improve the FE modelling approach to mitigate the round-off errors. Round-off issues are usually a combination of factors and short cuts.