Can floating point literals be adapted by the compiler to double precision variable?

certik · October 9, 2021, 12:07am

This is another common gotcha:

and I wonder if the compiler could not simply take a code like this:

integer, parameter :: dp = kind(0.d0)
real :: x
real(dp) :: y
x = 5.35
y = 5.35

and it would internally treat it as:

integer, parameter :: dp = kind(0.d0)
real :: x
real(dp) :: y
x = 5.35
y = 5.35_dp

In other words, it would look where the floating point number is assigned to and adapt it accordingly.

One way to describe it could be that a floating point literal like 5.35 would be “infinite” precision, and then when used in operations, it would be “downcasted” to whatever precision is needed for the operation.

For example, 5.35 / y would implicitly become real(5.35, dp) / y because y is double precision. If y was quadruple precision qp then it would become real(5.35, qp) / y.

Two questions:

1. Could this break existing codes in any way?
2. Would this completely fix the current problem that one can assign a single precision number 5.35 to a double precision variable and lose accuracy?

Beliavsky · October 9, 2021, 12:44am

I believe that a general rule of Fortran is that what is on the LHS of an assignment never affects how the RHS is evaluated, and someone new to Fortran should learn that. I suggest that LFortran not make exceptions to that rule. You don’t want people to write bad code that LFortran silently fixes, only for them to encounter problems when using other compilers. For your first code LFortran could emit a warning about

y = 5.35

such as “double precision variable assigned to a real value”, and the user could turn such a warning into an error. This could be made the default, although in strict mode LFortran must compile the legal code. For the code

integer, parameter :: dp = kind(0.0d0)
real :: x
real(dp) :: y,z
x = 5.35
y = 5.35
z = 5.35_dp
print*,x,y,z
end

gfortran -Wconversion-extra says

prec.f90:5:4:

    5 | y = 5.35
      |    1
Warning: Conversion from 'REAL(4)' to 'REAL(8)' at (1) [-Wconversion-extra]

and gives output

5.34999990 5.3499999046325684 5.3499999999999996

I see that I suggested your idea in 2014 in comp.lang.fortran. Richard Maine pointed out some difficulties. Now I’m against it.

certik · October 9, 2021, 1:06am

Good points.

Let’s assume the rule:

The RHS evaluation must not depend on the LHS.

Currently a floating point literal 5.35 is assumed to be single precision. To be more precise, I think it is assumed to have the same precision as the default real type.

So one way forward is to make floating point literals like 5.35 either the widest precision supported by a given compiler (say double or quadruple) or even “infinite” precision.

I don’t know if there can be issues with double / quadruple precision, especially with regards to the default real type.

However, the infinite precision would work as follows: 5.35 would simply mean 5.35 exactly. It is not equal to any other real type. Then when you assign it, the = operator casts it appropriately. The same with binary operators like + or *. They already insert implicit casts such as in 1 + 1.5, the integer gets cast to a float first. Would this not work?

Beliavsky · October 9, 2021, 1:28am

I don’t think so. We still want it to be legal to write f(5.35), where f is a function with a single precision argument. I don’t think the meaning of 5.35 should depend on the context. The previously cited comp.lang.fortran discussion has a similar comment:

Richard Maine

May 30, 2014, 8:52:57 PM

Quadibloc <jsa…@ecn.ab.ca> wrote:

> Could a simple and predictable rule be established that would allow a
> language like FORTRAN, where different real precisions have (near-)equal
> stature, to still be very simple?
>
> I would say that when you have 1.1 instead of 1.1E0, 1.1D0, or 1.1Q0 (for
> EXTENDED PRECISION), the thing to do is to say that it’s initially
> converted to binary at the maximum available precision (i.e. 1.1Q0,
> 128-bit real or 80-bit real) and then possibly demoted the first time it
> comes into contact with something of a specified real type.

And if it never “comes in contact” with anything else, then it stays at
the maximum precision? So

call sub(1.1)

doesn’t work if the dummy argument of sub is default real? And assume
that sub has an implicit interface, so in the calling scope you don’t
know anything about the dummy. Or perhaps that sub is generic with
multiple possible kinds.

I guess I find the existing rule a lot simpler than any alternative
proposal I’ve seen. The existing rule is that 1.1 is a real of default
kind - always. Sure, one can make mistakes ignoring that rule just like
one can make mistakes by ignoring other rules. But I don’t think you can
beat it for simplicity.

P.S. If anyone is in touch with Richard Maine, could they invite him to join this forum?
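
To make the generic case raised in the quoted post concrete, here is a minimal sketch, assuming a compiler whose default real is 32-bit. The module `kinds_demo`, the generic name `show`, and its two specific procedures are hypothetical names chosen only for illustration. Under the existing rule, an unsuffixed literal always resolves to the default-real specific:

```fortran
module kinds_demo
   use iso_fortran_env, only: sp => real32, dp => real64
   implicit none
   ! Hypothetical generic interface with one specific per real kind.
   interface show
      module procedure show_sp, show_dp
   end interface show
contains
   subroutine show_sp(x)
      real(sp), intent(in) :: x
      print *, 'single specific called with ', x
   end subroutine show_sp

   subroutine show_dp(x)
      real(dp), intent(in) :: x
      print *, 'double specific called with ', x
   end subroutine show_dp
end module kinds_demo

program generic_literal
   use kinds_demo
   implicit none
   ! 1.1 is a default real, so (assuming default real is real32)
   ! generic resolution picks show_sp.
   call show(1.1)
   ! An explicit kind suffix selects the double precision specific.
   call show(1.1_dp)
end program generic_literal
```

If the literal's kind depended on context, the first call would have no obvious specific to resolve to, which is the difficulty the quoted post points at.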

fortran4r · October 9, 2021, 2:02am

I find the suffix d in Fortran super annoying. Most scientific codes nowadays use double precision real numbers. That means people need to write a lot of suffix d in Fortran, which greatly hurts readability.

oscardssmith · October 9, 2021, 2:19am

This is a bad idea if Fortran ever wants to get the ability to write generic code. In such cases it will be far less clear what type to make things.

FortranFan · October 9, 2021, 2:33am

@certik and anyone interested in the default kinds of constants (also literals), please see this proposal:

github.com/j3-fortran/fortran_proposals

Default KINDs for constants and intrinsics (opened 04:12PM - 11 Nov 19 UTC by FortranFan; Clause 7)

A proposal from UK national body from year 2013: https://wg5-fortran.org/N1951-N…2000/N1975.txt

```
--------------------------------------------------------------------------
Number: UK-01
Title: Default KINDs for constants and intrinsics
Status: For Consideration
Basic Functionality: Program-specified default KIND for constants etc.

Rationale:
(a) In the 64-bit world default integer generally being 32-bit is
    increasingly leading to incorrect programs.
(b) In the floating-point world default real generally being 32-bit not
    infrequently leads to incorrect programs.
(c) It is tedious and error-prone to have to attach kind_param's to
    individual literal constants.
(d) It is tedious and error-prone to have to specify a KIND= argument as
    required for each individual reference to an intrinsic function.
(e) Program specification of the default type parameters is possible for
    derived types but not intrinsic types.

Specification (requirements):
1. Provide a mechanism for specifying the default kind parameter for the
   intrinsic types REAL and INTEGER.
2. Decouple the concepts of "default kind" from those of "single precision"
   and "basic integer"; "default kind" to be used for literal constants,
   implicit typing, etc., while "single precision" et al to be used for the
   old storage association contexts.

Discussion:

Specifications (detailed):
--------------------------
a. There will be a new statement, that can appear only between the USE
   statements and IMPLICIT statements, to specify the default kind for a
   particular intrinsic type.

b. The effect of this statement is to change the default kind for the
   remainder of the scoping unit. To avoid circular dependencies, it does
   not affect the default kind of literals appearing in or before the
   statement itself, nor does it affect the type implied by a PARAMETER
   statement that appears before the statement.
   Rationale: circular dependencies bad, named constants good.

c. The default kind setting in a scoping unit is initially that of its host
   scoping unit; note that as nested scoping units appear necessarily after
   any IMPLICIT statements, this will inherit the user setting.

d. In the case of REAL, the default kind for COMPLEX is also affected. It
   does not affect double precision kind, which remains as twice single
   precision in storage.

e. Terminology:
   "default kind" = user-specifiable default kind
   "single precision kind" = old default real kind
   "basic integer kind" = old default integer kind

f. OPTIONAL: There is a reasonable argument to be made that permitting the
   user to specify the kind for "double precision" constants and variables
   would also be valuable.

g. Table of places where "default kind" is used, and what that should
   correspond to in the new scheme, follows.

   Context                                               Should be
   ---------------------------------------------------------
   (1) integer literal with no <kind-param>              default kind
   (2) "kindly" intrinsics with absent KIND=             default kind * Note T0
   (3) intrinsics with no KIND argument                  default kind * Note T1
   (4) <type-spec> with no KIND parameter, e.g. REAL     default kind * Note T2
   (5) arguments to generic intrinsics                   accept both  * Note T3
   (6) args/result for specific intrinsics, e.g. AMIN1   basic kind   * Note T4
   (7) constants in ISO_FORTRAN_ENV                      basic kind   * Note T5
   (8) having the size of one numeric storage unit       basic kind
   (9) EQUIVALENCE/COMMON real/integer matching          basic kind

   Note T0. This includes LBOUND, LCOBOUND, SIZE, SHAPE ... i.e. all the
            ones that have a KIND= argument and return a
            REAL/INTEGER/COMPLEX result of the specified kind when present,
            and default kind when not.
   Note T1. Actually these don't really matter much, as the value always
            fits into basic kind. However, it would be more convenient when
            passing as an actual argument, and more consistent, for these
            to be the new user-specifiable default kind too. Here is a
            representative list of the intrinsics concerned: DIGITS,
            PRECISION, RANGE, EXPONENT, MAX_EXPONENT, and THIS_IMAGE.
   Note T2. Wherever the <type-spec> is, viz a type declaration statement,
            component definition statement, array constructor, etc.
   Note T3. The affected intrinsics:arguments are DATE_AND_TIME: VALUE,
            EXECUTE_COMMAND_LINE:EXITSTAT,CMDSTAT,
            GET_COMMAND:LENGTH,STATUS, and similar. Not a big deal (values
            always representable in the basic kind) but why not relax the
            requirement to permit larger kinds always anyway?
   Note T4. Not actually important. These are wholly redundant anyway.
   Note T5. As if it were a user-defined module with those constants.

Syntax (illustrative):
----------------------
   DEFAULT <intrinsic-type-name> ( [ KIND = ] <int-constant-expr> )

where <intrinsic-type-name> is INTEGER or REAL.

Cnnn (Ryyy) The kind number specified in a DEFAULT INTEGER statement shall
     specify a kind whose storage size is at least as great as that of
     basic integer kind.
{Reason: To stop the user shooting his foot off with a short integer kind.}

OPTIONAL: If we permit specification of double precision kind, add this
additional syntax:

   DEFAULT DOUBLE PRECISION ( [ KIND = ] <int-constant-expr> )

Cnnn The kind number specified in a DEFAULT DOUBLE PRECISION statement
     shall not specify a kind whose storage size is less than that of
     default real kind.
{Reason: There might not be a bigger real kind than the user-specified
 default, but it would be counter-intuitive to permit specification of a
 double precision kind that is actually smaller than default real.}

Example:
--------
This is just to show how it works, it is too trivial to show much
advantage.

Program sum_tan_prefix
  Default Integer (Kind=Selected_Int_Kind(18))
  Default Real (Kind=Selected_Real_Kind(15))
  Real,Allocatable :: x(:)
  Print *,'Input vector length N'
  Read *, n
  If (n<2) Stop 'Don''t be silly.'
  Allocate(x(n))
  Print *,'Input vector with',Size(x),'values in degrees'
  Read *,x
  tmp = 0
  Print *,'SUM_TAN_PREFIX results'
  Do i=1,Size(x)
    tmp = tmp + Tan(x(i)*3.1415926535897932384/180)
    Print *,i,x
  End Do
End Program
```

I personally think that is the Fortrannesque way to introduce the use cases into the language, as discussed thus far in this thread, without affecting existing semantics.

I truly wish the above proposal (or some further refinement thereof) had made it into the language in Fortran 2018 itself.

certik · October 9, 2021, 3:36am

I see two options. For a single subroutine sub, it could:

- cast into single if sub accepts single, or double if it accepts double precision
- or not work, so that you have to specify the kind explicitly

And if sub is a generic procedure, it might:

- refuse to work until a kind is specified
- or select the widest precision available

For an implicit interface:

- an implicit interface should not be used anyway, but it can default to single

I am not sure if I like this or not.

Another option might be to simply make the default real a double precision. In Julia:

julia> 4.5 
4.5

julia> typeof(4.5)
Float64

mecej4 · October 9, 2021, 3:45am

I’m afraid that this “infinite precision” will require an infinite number of bits. Your number, 5.35, is not representable exactly in IEEE 32-bit format, which uses a number base of 2, not 10. This number has a bit representation of Z’40AB3333’, and there follow an infinite number of Z’33333…’ that got chopped off.

Similarly, the “simple” single-digit decimal number 0.1 does not have a finite representation in the internal binary representation. The IEEE 32-bit representation is Z’3DCCCCCD’, and the last nybble is D rather than C because of rounding. If you want infinite precision, you would need an infinite number of C-s at the right end.
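
The bit patterns quoted above can be checked with a short sketch, assuming the compiler's `real32` kind is IEEE binary32. The program name `literal_bits` is arbitrary; `transfer` reinterprets the stored bits as a 32-bit integer so that the `Z` edit descriptor can print them in hexadecimal:

```fortran
program literal_bits
   use iso_fortran_env, only: int32, real32
   implicit none
   ! Reinterpret the bits of each real as a 32-bit integer and
   ! print them as 8 hexadecimal digits.
   print '(a, z8.8)', '5.35 -> ', transfer(5.35_real32, 0_int32)
   print '(a, z8.8)', '0.1  -> ', transfer(0.1_real32, 0_int32)
end program literal_bits
```

On an IEEE system this should print 40AB3333 and 3DCCCCCD, matching the values given in the post.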

certik · October 9, 2021, 3:57am

Yes, it promotes the single precision 5.35 to double, but it will only be accurate to about 1e-7:

integer, parameter :: dp = kind(0.d0)
real :: x
real(dp) :: y
x = 5.35
y = 1
y = 5.35 / y
print *, x
print *, y
end

This prints with GFortran 9.3.0:

   5.34999990    
   5.3499999046325684

The way I would like it to promote it is like in y = 5.35_dp / y and then it prints:

   5.34999990    
   5.3499999999999996

What I have in mind is to make 5.35 be something like Decimal("5.35"), so an exact value. And then in any operation it just gets “downcasted” to whatever accuracy is needed. So real(5.35, dp) becomes 5.3499999999999996 and not 5.3499999046325684 as it currently happens.

Indeed, we do not want to have any kind of slowdowns and we definitely do not want to introduce quadruple precision (which is indeed slow) unless needed.

At the very least I want to just warn by default, i.e. the -Wconversion-extra in gfortran:

$ gfortran -Wconversion-extra  a.f90
a.f90:6:8:

    6 | y = 5.35 / y
      |        1
Warning: Conversion from ‘REAL(4)’ to ‘REAL(8)’ at (1) [-Wconversion-extra]

Although I would prefer it if the warning was on by default and something along the lines of:

$ gfortran a.f90
a.f90:6:8:

    6 | y = 5.35 / y
      |        1
Warning: Conversion from ‘REAL(4)’ to ‘REAL(8)’ at (1) will make 'y' only accurate to single precision
Suggestion: change 5.35 to 5.35_dp to achieve full double precision accuracy
Note: you can turn off this warning with [-Wno-conversion-extra]

oscardssmith · October 9, 2021, 5:07am

Are there implementations of Fortran (or packages) that provide a good DoubleDouble type? DoubleDoubles (using the sum of 2 Float64) tend to be a very high performance solution when you need more accuracy than a Float64. You get 106 bits of precision and the same exponent range as Float64, but the basic math operations only take 10 or so cycles.
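
For readers unfamiliar with the idea, double-double arithmetic is built on error-free transformations. The following is only a sketch of the classic two-sum step (Knuth's branch-free version), not a full double-double type and not taken from any library mentioned in this thread; the names `two_sum_demo` and `two_sum` are hypothetical. It assumes the compiler does not reassociate floating point operations (for example, no fast-math style flags):

```fortran
program two_sum_demo
   use iso_fortran_env, only: dp => real64
   implicit none
   real(dp) :: hi, lo

   ! 1.0e-20 is far below the rounding granularity of 1.0 in real64,
   ! so a plain sum would drop it; two_sum keeps it in lo.
   call two_sum(1.0_dp, 1.0e-20_dp, hi, lo)
   print *, 'hi =', hi
   print *, 'lo =', lo
contains
   ! Returns s = fl(a + b) and the rounding error e, so that
   ! a + b == s + e holds exactly (barring overflow).
   pure subroutine two_sum(a, b, s, e)
      real(dp), intent(in)  :: a, b
      real(dp), intent(out) :: s, e
      real(dp) :: bb
      s  = a + b
      bb = s - a
      e  = (a - (s - bb)) + (b - bb)
   end subroutine two_sum
end program two_sum_demo
```

A double-double value is then the unevaluated pair (hi, lo), and the other operations are assembled from steps like this one.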

ivanpribec · October 9, 2021, 10:53am

I think the doubledouble library QD from Bailey is actually implemented in C++, with a Fortran wrapper.

Looking through his bibliography I think there was an older package which was apparently in Fortran, but I recall not being able to find the source code.

themos · October 9, 2021, 5:03pm

The NAG Fortran Compiler implements REAL128 with a “double double”.

urbanjost · October 12, 2021, 4:28am

I agree the current state of affairs would be difficult to improve upon without breaking backward compatibility, but I was thinking how long this has been a point of confusion. To help explain it in the past I used to use an example somewhat like the following, especially to highlight how E and D suffixes do not impact NAMELIST and most input, but are effective on output, and to warn about truncation of long constants without an explicit type. It went something like this, except I know “g0” was not available:

program pidigits
character(len=80) :: cpi='3.141592653589793238462643383279502884197169399375105820974944592307'
character(len=80) :: cpi1='3.141592653589793238462643383279502884197169399375105820974944592307e0'
character(len=80) :: cpi2='3.141592653589793238462643383279502884197169399375105820974944592307d0'
doubleprecision :: pi,pi1,pi2,pi3,pi4,pi5     
namelist /args/ pi
character(len=256) :: input(3)=[character(len=256) :: &
&'&args', &
& ' pi=3.141592653589793238462643383279502884197169399375105820974944592307,', &
& '/']

data pi2/3.141592653589793238462643383279502884197169399375105820974944592307e0/
data pi3/3.141592653589793238462643383279502884197169399375105820974944592307d0/

   pi4=3.141592653589793238462643383279502884197169399375105820974944592307e0
   pi5=3.141592653589793238462643383279502884197169399375105820974944592307d0

   write(*,'(g0)')cpi

   write(*,'(g0)')pi2,pi3,pi4,pi5
   
   read(input,nml=args)
   write(*,'(g0)')pi

   read(cpi,*)pi1
   write(*,'(g0)')pi1
   read(cpi1,*)pi1
   write(*,'(g0)')pi1
   read(cpi2,*)pi1
   write(*,'(g0)')pi1

   write(*,'(g0)') 3.141592653589793238462643383279502884197169399375105820974944592307
   write(*,'(g0)') 3.141592653589793238462643383279502884197169399375105820974944592307e0
   write(*,'(g0)') 3.141592653589793238462643383279502884197169399375105820974944592307d0

end program pidigits

and how the value of a constant may change with compiler options such as

ifort pi.f90 -r8
nvfortran pi.f90 -r8
gfortran -fdefault-real-8 pi.f90 -fdefault-double-8

was always an issue too.

A common comment was that most thought that a constant should be promoted to a type large enough for all the explicitly given digits or a warning be required to be given.

I think all modern compilers have a switch that will produce a warning about constants being truncated and about precision possibly being lost when the LHS and RHS types do not match. People should be encouraged to use it, especially as more users now come from languages with features such as arbitrary precision and might therefore be even more surprised than in the past when values they explicitly entered are significantly truncated.

Even if automatic promotion to a kind sufficient to hold the explicit values was the rule, people would probably still be surprised that “A=1.0” might not produce the same as “A=1.00000000000”, so I am not certain that change would be better, and there is still the issue that compilers can have different default types; but I do think the most inadvertent error is caused by “double_value = single_precision_constant”, which that would not address anyway.

Maybe a new syntax like “value <== constant” where the constant would be treated the same as if it were read with list-directed input would address constants, but that would not address where the constants are used in functions.

And it looks like the rule about the LHS being ignored when determining the value on the RHS is too sacrosanct to be broken, although breaking it seems like the only logical path to improving the current state. Sigh. I really have had a number of complaints about this though. It is very non-intuitive.

So overall, it looks like the status quo is a reasonable compromise, but quite a mess.

billlong · October 13, 2021, 8:36pm

The basic issue here is that the type and kind of an expression is determined by the expression (3.51 is an expression with only one primary), and not by how the expression is used later. So changing to “auto-promote” constants would result in a pretty basic (and backwards incompatible) change in the standard.

I agree that a lot of scientific code is written using 64-bit precision floats. This is why Cray, before migrating to x86_64 processors, made 64-bit floats the default REAL kind. Thus 3.51 defaulted to be 64 bits as an expression and “problem solved”. Other compilers, wanting to be compatible, added options named, often, -r8 to convert to the Cray default mode.

certik · October 13, 2021, 9:12pm

Is 64-bit the default in Cray even today? I didn’t know that. I think that would fix most problems.

If you want to select a single precision in Cray, I assume you can’t do just integer, parameter :: sp = kind(0.0) because that will be double precision. So you have to use either selected_real_kind or iso_fortran_env?

billlong · October 13, 2021, 9:25pm

The classic vector machines were 64-bit default. The default on recent Cray systems is 32-bit for both REAL and INTEGER. You can use the compiler option -sdefault64 to change both defaults to 64-bits. The default is driven by the hardware instruction set, which today is usually either x86_64 (Intel, AMD) or ARM. Both use 32-bit as default. The most portable scheme is to use selected_real_kind.
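
A minimal sketch of the selected_real_kind approach, independent of what the compiler's default real happens to be. The kind names `sp` and `dp` and the requested digits and exponent ranges are illustrative choices, not something prescribed in this thread:

```fortran
program portable_kinds
   implicit none
   ! Request kinds by required decimal precision and exponent range
   ! rather than by relying on the default real kind.
   integer, parameter :: sp = selected_real_kind(6, 37)
   integer, parameter :: dp = selected_real_kind(15, 307)
   real(sp) :: a
   real(dp) :: b

   a = 5.35_sp
   b = 5.35_dp
   ! Decimal precision actually provided by each kind.
   print *, precision(a), precision(b)
   ! Whether the default real kind coincides with sp depends on the
   ! compiler and on options such as -r8 or -sdefault64.
   print *, kind(1.0) == sp
end program portable_kinds
```

The last line is the part that can change between compilers and option sets; the declarations of a and b do not.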

certik · October 14, 2021, 1:45am

Thanks Bill. Last question – I thought that x86_64 as well as 64-bit ARM would be considered to use 64-bit by default, wouldn’t they?

JohnCampbell · October 14, 2021, 5:22am

The interpretation of 5.35 in Fortran code is now as a single precision constant 5.35E0 or 5.35_real32.
However, “5.35” read from a text file or character string is converted at the precision of the variable it is read into, e.g. it is interpreted as 5.35_real128 if read into a quad precision real variable.
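
The difference between a literal in source code and the same text read at run time can be sketched as follows, assuming a 32-bit default real. The program and variable names are hypothetical:

```fortran
program read_vs_literal
   use iso_fortran_env, only: dp => real64
   implicit none
   character(len=8) :: text = '5.35'
   real(dp) :: from_literal, from_read

   ! The literal is a default real first, then converted to double.
   from_literal = 5.35
   ! List-directed input converts the text at the precision of the
   ! variable being read into.
   read (text, *) from_read

   print *, from_literal, from_read
   ! With a 32-bit default real the first comparison is false;
   ! the second is expected to be true on common compilers.
   print *, from_literal == 5.35_dp, from_read == 5.35_dp
end program read_vs_literal
```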

Prior to Fortran 90 most 32 bit compilers interpreted y=5.35 as 5.35_real64 or 5.35_real80, but Fortran 90 defined that 5.35 used in Fortran code is a default real constant.

For earlier compilers, the outcome was not defined; however, most 32-bit compilers tried to support the precision required when converting 60-bit Fortran code to 32-bit Fortran, where 64-bit reals were the typical real precision in use.
Most programmers at the time learnt this from experience, when converting to an F90+ compiler produced reduced-precision results in conversion testing. This was further exacerbated on Intel-like processors when converting to 64-bit Fortran or SIMD, where the 80-bit 80387 registers were not used, again removing default higher precision values.

The interpretation of 5.35 as a default real now has a strong historical understanding and so returning to higher precision interpretation, as with a read, is very unlikely.

The other confusing aspect that can hide this from a new user of Fortran is that many common real constants are the same at any precision, e.g. 1.0, 2.0, 10.0 or 0.5, while 0.1 or 1./3. are not and can be the source of unexpected rounding error when selecting real64 or real128 precision.
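
A short sketch of this point, assuming the default real is IEEE binary32 and real64 is IEEE binary64. Each line compares a default-real constant promoted to double with the same constant written in double precision:

```fortran
program exact_constants
   use iso_fortran_env, only: dp => real64
   implicit none
   ! Exactly representable in binary: promotion loses nothing.
   print *, real(0.5, dp)  == 0.5_dp
   print *, real(10.0, dp) == 10.0_dp
   ! Not exactly representable: the single precision rounding
   ! error survives the promotion to double.
   print *, real(0.1, dp)   == 0.1_dp
   print *, real(1./3., dp) == 1.0_dp/3.0_dp
end program exact_constants
```

Under those assumptions the first two comparisons print T and the last two print F.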

This problem is not limited to reals: in a 64-bit environment where 64-bit integers are also required, using default integer constants can overflow when 64-bit addresses or array sizes are involved. The default is not always what you want, e.g. size(array).

This is a real differentiator between the Fortran standard and Fortran users.

JohnCampbell · October 14, 2021, 6:02am

I can not identify where this could occur in Fortran, as any real precision value of 1.0 should be implicitly converted to the same value in any precision. “A = 0.1” would be a very different case.
