Convert double to quad precision without junk in high digits

What has this example achieved ?
You are left with a result where u /= z

You could have equally just had
u = z
write (*,fmt=‘(es22.15)’) u

It you wanted an answer to 15 figures, don’t bother with u at all !
write (*,fmt=‘(es22.15)’) z

More unnecessary bloat !

1 Like

@rdryne , in this instance the simple assignment is the same as u = 1.0_qp * z
and they are equivalent to u = real(z, kind(u) ).

So you can try out the following:

   use, intrinsic :: iso_fortran_env, only : dp => real64, qp => real128
   character(len=*), parameter :: fmtg = "(*(g0))"
   real(dp) :: z
   real(qp) :: u
   z = 0.123456789_dp
   u = z
   print fmtg, "1) u = z: ", u
   u = real( z, kind(u) )
   print fmtg, "2) u = real( z, .. : ", u
   u = 1.0_qp * z
   print fmtg, "3) u = 1.0_qp*z : ", u
   call dp_to_qp( u, z )
   print fmtg, "4) dp_to_qp( u, z ): ", u
   elemental subroutine dp_to_qp( lhs, rhs )
      real(kind=qp), intent(out) :: lhs
      real(kind=dp), intent(in)  :: rhs
      character(len=*), parameter :: fmtdp = "(1pg22.15)"
      character(len=22) :: s 
      write( s, fmtdp ) rhs
      read( s, fmt=*) lhs
   end subroutine 

which can lead to the following output:

C:\temp>ifort /standard-semantics p.f90
Intel(R) Fortran Intel(R) 64 Compiler Classic for applications running on Intel(R) 64, Version 2021.6.0 Build 20220226_000000
Copyright (C) 1985-2022 Intel Corporation.  All rights reserved.

Microsoft (R) Incremental Linker Version 14.30.30706.0
Copyright (C) Microsoft Corporation.  All rights reserved.


1) u = z: .123456788999999997336054491370305
2) u = real( z, .. : .123456788999999997336054491370305
3) u = 1.0_qp*z : .123456788999999997336054491370305
4) dp_to_qp( u, z ): .123456789000000000000000000000000

Based on your original post, it appears you want the behavior with 4) and that’s why I suggested you to consider such a helper subprogram to help you achieve your goal.

@FortranFan Thanks for this solution. It had not occurred to me that the problem could be solved with internal I/O. Very clever!

I didn’t provided any context for my question, so I’ll do that briefly: Some colleagues noticed that two computations of a complicated function (call it f(z), where z can take on 1000’s of values) produced results that were not identical, in some places the relative error was around 1.e-5. We suspected a loss of precision due to how the calculation was performed. So I wrote a code to calculate the function in double precision and in quad precision. But it occurred to me that the values of z (computed internally by the code) should be numerically the same in both versions, and I was having trouble with that, hence my question. Based on this discussion, I now realize that a better approach is to arrange things so that the values of z are exactly representable in binary.

@rdyne, as I mentioned earlier, the option with the internal IO was just something I meant for you to consider given what you indicated in the original post. Which is that you somehow arrive at a floating-point value programmatically - 0.123456789 as you put it - and you want to represent it as-is in a different (higher) precision for whatever reason.

I did not mean it to be “clever” and please know compared to approaches following via libraries in other languages (e.g., C strtod) the use of internal IO with Fortran can be slower - see this thread.

And I was not intending with my suggestion involving internal IO to get into any messaging to you as to whether it is prudent or not from a numerical analysis involving floating-point calculations, rather leave that up to you, particularly considering you will have best knowledge about your program.

@FortranFan I simply meant that what you proposed was an easy way (just a few lines of code, no external libraries,…) to accomplish what I asked about. But I don’t mean anything more than that. I’m not advocating that other people should do this. People should adopt whatever approaches they think are appropriate to the problem at hand.

The truth is that I went off on a tangent when I started this thread. But it has been informative. Now I can return to my original problem of dealing with a loss of precision in a certain calculation.

1 Like

@kargl My problem does involve large cancellation errors. So I am using Mathematica to rearrange the underlying formulas to avoid small denominators that need not be present.

@kargl I have used David Bailey’s programs in the past. But I don’t need to in this case. I succeeded in using Mathematica to produce Fortran code for the function that I am evaluating, and to do it in a way that the small denominators in separate terms (that led to the cancellation of large terms) are not longer present. The plot shows the relative difference between double and quad precision for the function computed at 1025 points in the range (0,1). The original formula has huge errors. But it’s better than 1.e-14 with the rearranged formula.