ES0.4E0 Considered Harmful

What would it take to get the legacy output format sn.nnnnsnnn (s is sign, n is a digit) fast-tracked for removal from the language?

Overflowing an ‘E’ output edit descriptor can lead to malformed and in cases dangerously misinterpreted output, specifically when the exponent has more digits than the implied default of 2. The ‘E’ separator between significand and exponent is dropped and is replaced by the sign of the exponent. This behavior is allowed by the standard and documented but it violates user expectations of what constitutes scientific notation. There is likely a clear historical explanation for this behavior similar to that of the legacy ASA carriage control codes.

The difficulty goes beyond the human factors problem of using an archaic malformed numeric display. Few if any other languages interpret this format correctly which breaks data portability via text file, the principal means of data interchange.

The following demonstrator compiles cleanly on Intel Fortran (the specific compiler shouldn’t matter) and generates CSV output that reliably breaks both Python’s float() type conversion and has a particularly insidious, quiet, and dangerous failure mode when read as CSV by Excel.

program trivial_loss
  use, intrinsic :: iso_fortran_env, only: REAL64, OUTPUT_UNIT
  implicit none
  real(kind=REAL64), dimension(6), parameter :: baddata =               &
    [  1.0E+100_REAL64,  10.0_REAL64,  1.0E-100_REAL64,                 &
      -1.0E-100_REAL64, -10.0_REAL64, -1.0E+100_REAL64 ]
  integer i
  integer j
1 format('"Full","Naive","Nominal","Better","Malformed"')
2 format(G0, ',', E11.4, ',', 1PE11.4, ',', ES13.4E4, ',', ES0.4E0)
  continue
  write(unit=OUTPUT_UNIT, fmt=1)
  do i = 1, size(baddata, 1)
    write(unit=OUTPUT_UNIT, fmt=2) (baddata(i), j=1,5)
  end do
  stop
end program trivial_loss

Example output:

"Full","Naive","Nominal","Better","Malformed"
.1000000000000000E+101, 0.1000+101, 1.0000+100, 1.0000E+0100,1.0000+100
10.00000000000000, 0.1000E+02, 1.0000E+01, 1.0000E+0001,1.0000+1
.1000000000000000E-99, 0.1000E-99, 1.0000-100, 1.0000E-0100,1.0000-100
-.1000000000000000E-99,-0.1000E-99,-1.0000-100,-1.0000E-0100,-1.0000-100
-10.00000000000000,-0.1000E+02,-1.0000E+01,-1.0000E+0001,-1.0000+1
-.1000000000000000E+101,-0.1000+101,-1.0000+100,-1.0000E+0100,-1.0000+100

Python is doing the right thing by throwing an exception when asked to convert malformed numerical data. Excel is half-working, treating positive malformed data as a string but evaluating negative malformed data as a formula when imported as values.

Importing CSV-formatted numerical data is a common use case and while it’s not Fortran’s job to work around every possible misinterpratation of its output, likewise it should not generate malformed output. The user expectation is that values will be formatted in scientific notation or values will be clearly marked as unrepresentable in the specified format (all *s).

I’ve seen this behavior before but it most recently appeared when trying to parse radiation dose results generated by actively developed code. Investigating the issue, I found that using a field width of 0actually made matters worse despite guidance that using a zero-width field avoids clearly invalid output.

This is a confluence of small seemingly-inconsequential problems across several systems:

  • the default behavior of the ‘E’ edit descriptor does not take into account the extended width of the exponent field in 64-bit or larger floating point types.
  • the historical sn.nnnnsnnn format is incompatible with most systems where Fortran codes are currently used
  • an E format combined with zero field width can reliably generate malformed output for all values, not just those with exponents of more than two digits
  • developers are given no warning that this behavior is possible; contrast with the w < d + 7 warning that most compilers produce when a risky format such as E10.5 is detected
  • scientific software developers are often unaware of Fortran’s standard behavior or potential mitigations
  • evaluating whether very large or very small magnitude values are reasonable is a verification task which is often omitted even in mission- or safety-significant applications; regardless, the language and common tooling do not make this verification task any easier
  • Excel exhibits unexpected and dangerous behavior when converting text values to numerical quantities. This is a well known flaw in the application which the vendor is aware of and has no intention to address
  • Culturally, workers in safety- and mission-critical roles routinely use Excel in roles where it is not appropriate and no amount of training, education, policy, or disciplinary measures will change that.

I don’t see user culture or Excel as being fixable. Both tools and developer practice can change with enough time and incentive but I also don’t see that happening in a reasonable or predictable timeframe. Knowing that Fortran has been and is currently used in roles where it can affect safety, this issue seems more effectively addressed at a language standard level.

1 Like

This is a feature, not a bug. It allows the value to be printed within its field width with no loss of information. I think other languages that aspire to do scientific, engineering, or numerical computations should be fixed, fortran should not be limited, especially in a backwards-incompatible manner.

The alternatives are to fill the field with asterisks, losing information, or to exceed the designated field width by printing both the E and the wide exponent field.

Why can’t this be fixed in Excel? Is there some fundamental reason or inconsistency that prevents it?

Along this floating point format tangent, there are two types of missing floating point formats in fortran. One useful format would be to print the maximum number of significant digits given a specified field width. The other would be to print to the minimum field width given the specified number of significant digits. In both approaches, any optional plus signs would be deleted from the output, leading zeros would be eliminated, the whole exponent field might be eliminated by shifting the decimal point, and maybe even the decimal point would be eliminated when printing integer values. Within these constraints, there are sometimes multiple ways to write a number, and some convention should be adopted so that all compilers print the same output in all cases.

This is a feature, not a bug.

It was a feature. It has outlived its usefulness given the changes to hardware, precision, and how code code results are used in the 60+ years since its introduction (see section 7.2.3.6.2 of the F66 spec, though the format is likely a carryover from much older punched card tabulators).

I understand the rationale for this behavior but it has been considered a bug in every environment I’ve worked in since the early 90s. If we saw this output we expanded the field to fit the data so the output was not malformed, we did not rely on this inconsistent fallback behavior.

The alternatives are to fill the field with asterisks, losing information […]

This is the correct behavior. It shows that the value cannot be represented in its intended form and that the output format descriptor needs to be expanded.

I think other languages that aspire to do scientific, engineering, or numerical computations should be fixed, fortran should not be limited, especially in a backwards-incompatible manner.

I wish you well in that endeavor. My scope is more limited in migrating Fortran toward the de facto standard accepted by the vast majority of human and machine consumers of Fortran output.

Why can’t this be fixed in Excel? Is there some fundamental reason or inconsistency that prevents it?

I’m going to be charitable and assume you are joking.

But in case this is a serious question, review your arguments against changing Fortran behavior and apply them in the context of an Excel developer.

Or apply them to the current argument: “Why can’t this be fixed in the Fortran standard? Is there some fundamental reason or inconsistency that prevents it?”

Regardless, Excel is used as an example of ubiquitous software which cannot read Fortran’s malformed numerical output. Fixing Excel is out of scope for this conversation.

The point is that this is unexpected behavior which causes problems in modern usage and which in some circumstances may lead to unsafe interpretation of results. Let this F66 behavior go the way of the Hollerith constant; compilers can provide features to support legacy behavior if it’s intended or necessary.

1 Like

The standards committee is very reluctant to delete things from the language, but you could suggest
that the ES0.xE0 format be designated obsolescent, so that compiling with the standard-conforming option would produce a warning when such a format is used. Your example should be included in a list of Fortran “gotchas” that has been discussed.

ES0.4E0 is not allowed to remove the exponent letter. In the original F2018, ES0.4 (no E0) would allow the letter to be removed for exponents between 100 and 999, but interp F18/033 noted that this is not the intended behavior. This passed and was incorporated into F2023.

ifort has a bug here - I reported a similar one a while back, and will report this one too.

The result of the interp is that if w=0, whether you specify E0 or not, you get all the exponent digits plus the E.

3 Likes

I think once the programmer specifies a 0 width, he has already said that the field width is unimportant, so the processor should add exponent digits as necessary. The exponent field width seems redundant in this situation.

The idea of removing the E is to avoid filling the fixed field width with asterisks, thereby losing information. I think it would be unwise in a numerical language like fortran to throw away information when there is still a possibility to fill the field with the full requested output information. I do not agree with the OP that the asterisks field should be the “correct” choice here. Consider a long calculation where that is the critical output value. Filling the field with asterisks means that the whole calculation would need to be repeated, or in a time critical situation that the value would not be available to meet the deadline. I think in many other languages, such as C, the output field width is expanded to include always the full information, so even there, the choice is not to fill with asterisks.

If you read the interp paper I linked to, you’ll see that the intent is that if you specify w=0, then the processor will supply an adequate number of digits so that you never get asterisks. I believe the new wording meets that goal.

Thank you @sblionel. I tried your test program with its statements

A = 0.12345E123_DP 
B = 0.12345E1234_QP

but on changing the format to '(SS,E0.7)', gfortran gave the output

0.1234500E+123
0.1234500E+1234

If the intention is to use the minimum amount of space why not turn E+ into E? Does gfortran have a bug here, or does that + not count as an optional plus removed by the SS edit descriptor? In the latter case I think F202Y should have another edit descriptor such as SE to turn E+ into E in output.

It does not - SS refers only to the leading sign character.

Thank you @sblionel. That is why I suggested that F02Y allow a new format descriptor, possibly called SE, that would turn E+ into E in floating-point output. It would help those wanting minimal-length E, ES, EN or G output.