Bounds of random_number

The standard says that

call random_number(x)

gives x such that 0 <= x < 1. What are the smallest and largest values for double precision x based on current implementations of random_number? They will vary by compiler. I tested this with program

program main
implicit none
integer, parameter :: wp = kind(1.0d0), ikind = selected_int_kind(18), &
                      nran = 10_ikind**9
integer(kind=ikind) :: i,iter
real(kind=wp) :: xmin,xmax,xran
print*,"#ran, wp, tiny(0.0_wp)=",nran,wp,tiny(0.0_wp)
call random_seed()
do iter=1,10
   xmin = 1.0_wp
   xmax = 0.0_wp
   do i=1,nran
      call random_number(xran)
      xmin = min(xmin,xran)
      xmax = max(xmax,xran)
      if (xran == 0.0_wp) then
         print*,"xran = 0.0_wp for i=",i
         exit
      end if
   end do
   print*,"xmin, 1-xmax =",xmin,1.0_wp-xmax
end do
end program main

giving sample results

gfortran

 #ran, wp, tiny(0.0_wp)=  1000000000           8   2.2250738585072014E-308
 xmin, 1-xmax =   1.2764767021167245E-010   8.5712281716610050E-010
 xmin, 1-xmax =   3.5704161849281491E-010   5.8078597486854733E-010
 xmin, 1-xmax =   3.5418714627866166E-009   1.9406080076223020E-009
 xmin, 1-xmax =   3.8126641843660991E-009   8.9600260544386856E-010
 xmin, 1-xmax =   7.0739225588312138E-010   1.5527540364601577E-009
 xmin, 1-xmax =   1.0432037456098442E-009   2.4081048266566540E-010
 xmin, 1-xmax =   6.7113703483556719E-010   8.0433837368332206E-010
 xmin, 1-xmax =   2.2587054449019206E-010   5.0732962275645832E-010
 xmin, 1-xmax =   8.7153861905164831E-010   1.9965620268180828E-009
 xmin, 1-xmax =   1.8497274822948384E-009   6.2101734865649405E-010

ifort

 #ran, wp, tiny(0.0_wp)=  1000000000           8  2.225073858507201E-308
 xmin, 1-xmax =  9.313226114783520E-010  4.656632857091836E-010
 xmin, 1-xmax =  9.313226114783520E-010  4.656632857091836E-010
 xmin, 1-xmax =  1.396983917217528E-009  4.656632857091836E-010
 xmin, 1-xmax =  1.396983917217528E-009  1.862647258654704E-009
 xmin, 1-xmax =  5.122274363130936E-009  4.656632857091836E-010
 xmin, 1-xmax =  4.656613057391760E-010  4.656632857091836E-010
 xmin, 1-xmax =  9.313226114783520E-010  4.656632857091836E-010
 xmin, 1-xmax =  4.656613057391760E-010  4.656632857091836E-010
 xmin, 1-xmax =  2.793967834435056E-009  4.656632857091836E-010
 xmin, 1-xmax =  2.328306528695880E-009  2.328308545962443E-009

For ifort it appears that the smallest x or (1-x) is 4.6566e-10. For gfortran it appears that values below 1e-10 are rare, although I did have a run where 4.6566e-11 appeared. It looks like you don’t need to check for the case that x == 0.0_dp in a simulation using random_number. If you sample from a non-uniform distribution using the inverse cumulative distribution, you can estimate the bounds of those variates given bounds for the uniform variates.

1 Like

Setting wp = kind(1.0) to get the single precision version of the program, with gfortran you can get x = 0.0, and the smallest value of 1-x appears to be 5.96046448E-08. For ifort the smallest value of x appears to be 4.6566129E-10, and the smallest value of 1-x is consistently 5.9604645E-08.

single precision results
gfortran
 #ran, wp, tiny(0.0_wp)=  1000000000           4   1.17549435E-38
 xran = 0.0_wp for i=              3803816
 xmin, 1-xmax =   0.00000000       5.96046448E-07
 xran = 0.0_wp for i=             11175661
 xmin, 1-xmax =   0.00000000       5.96046448E-08
 xran = 0.0_wp for i=             10164700
 xmin, 1-xmax =   0.00000000       5.96046448E-08
 xran = 0.0_wp for i=             12367588
 xmin, 1-xmax =   0.00000000       5.96046448E-08
 xran = 0.0_wp for i=             20387183
 xmin, 1-xmax =   0.00000000       5.96046448E-08
 xran = 0.0_wp for i=             49395465
 xmin, 1-xmax =   0.00000000       5.96046448E-08
 xran = 0.0_wp for i=              7602516
 xmin, 1-xmax =   0.00000000       1.78813934E-07
 xran = 0.0_wp for i=              6211945
 xmin, 1-xmax =   0.00000000       5.96046448E-08
 xran = 0.0_wp for i=              5937715
 xmin, 1-xmax =   0.00000000       2.38418579E-07
 xran = 0.0_wp for i=             53826598
 xmin, 1-xmax =   0.00000000       5.96046448E-08

ifort
 #ran, wp, tiny(0.0_wp)=  1000000000           4  1.1754944E-38
 xmin, 1-xmax =  4.6566129E-10  5.9604645E-08
 xmin, 1-xmax =  4.1909516E-09  5.9604645E-08
 xmin, 1-xmax =  4.6566129E-10  5.9604645E-08
 xmin, 1-xmax =  1.8626451E-09  5.9604645E-08
 xmin, 1-xmax =  1.3969839E-09  5.9604645E-08
 xmin, 1-xmax =  4.6566129E-10  5.9604645E-08
 xmin, 1-xmax =  1.8626451E-09  5.9604645E-08
 xmin, 1-xmax =  4.6566129E-10  5.9604645E-08
 xmin, 1-xmax =  9.3132257E-10  5.9604645E-08
 xmin, 1-xmax =  4.6566129E-10  5.9604645E-08

Yeah, I just found a bug in a program for someone that basically assumed that using the random_number() intrinsic and multiplying by huge(0.0) was resulting in any representable non-negative value which is a bad assumption for a lot of reasons but one of the reasons the values were really skewed was because of the lack of small values. Was thinking after that that stdlib should perhaps have more functions for simple-looking things like a random number over a wide range instead of {0,1) that are not always all that simple after all.

I’m not certain about this, but I think most PRNG algorithms do not sample the full set of possible real numbers in the [0,1) range. For example, you will never get numbers with the same exponent as tiny(xtype). This has always seemed like a bug to me, but there are several PRNG quality tests that are used, and the PRNG algorithms pass these tests without cycling through the full set of real numbers in that range.

As that program tried nran=1E9 I’m not surprised its kind(1d0) xmin was of order 4.7E-10. One would expect the probability of getting a lower random number between 0 and 1 was about 4.7E-10, which is near 1/2.1e9. Try larger or smaller nran values if you want to test that.

What a typical pseudorandom number generator does is generate a “random” unsigned integer between 0 and 2**nbits_used-1, and scales down by dividing by 2**nbits_used. To avoid rounding up to 1, nbits_used is typically 1/epsilon of the output real kind.

2 Likes

Too speed this code up, don’t generate the random numbers one by one, but define xran as an array and generate many random numbers at once. I wrote a tiny program, which does exactly this:

program main
implicit none

    integer, parameter :: N = 128

    real*8    :: rand_min, rand(N), tmp
    integer*8 :: i

    rand_min = 1.
    i        = 0

    do
        i = i+N
        call random_number(rand)
        tmp = minval(rand)
        if(tmp<rand_min) then
            rand_min = tmp
            print*, i, rand_min
        end if
    end do
end program main

Please ignore real*8, I’m lazy. Time to write a snippet for this. :sweat_smile:

Result after 80 minutes, compiled with gfortran:

  # of random numbers   smallest number
                  128   1.0738239283927253E-003
                 1408   9.2422457855334539E-004
                 2944   3.8353413446978735E-004
                 3072   7.8078975527917649E-005
                15616   6.5587736192140866E-006
                62976   4.9325438717939818E-006
                69888   1.8853540449947914E-006
               429568   1.1228334619861613E-006
              1567872   2.8967965903792248E-007
              8091136   2.7340916974871732E-008
             69558272   1.5002061815039269E-008
             84105728   6.1537964723967775E-009
            172165632   1.1525747023455324E-009
           1899706624   4.4249925945649693E-010
           2840530048   7.1406436319421118E-011
          15115545088   4.3850367781317345E-011
          21987621504   9.2775787052801206E-012
         119356990208   3.9074299351682384E-012
         420486078592   3.3796299092614390E-012
         494812285312   4.5397019476922651E-013

(and it goes on…)

One should not expect (many) random numbers significantly less than epsilon. If the sample is to be uniformly distributed between 0 and 1, it should adopt the number density close to 1, not 0. Mathematically, the range 0-1 is equivalent to 1-2 but numerically, there is a huge difference. The distance between 0.0 and nearest(0.0,1.0) is more than 30 orders of magnitude less than the distance between 1.0 and nearest(1.0, -1.0). For double precision, it is 300 orders of magnitude!

The constraints of the floating point representation make the range [0,1) quite unique and, for random numbers generation, rather unfortunate. There is 8388608 different 4-byte reals in the range [1,2) but more than a billion in [0,1). Of those, 8388608 fill the [0.5,1) range, with remaining (still more than) billion in [0,0.5).

If the PRNG is indeed realized as an integer generator with period of 2^n, one should expect the smallest (but positive) real random number of an order of 2^{-n}. For n=32 it would be 2e-10.

1 Like

The period is a different issue from the number of distinct numbers that are returned. The distinct numbers depend, as you say, on the number of bits. But in practice, it is not the number of bits in the whole floating point value, it is just the number in the mantissa. That’s something like 23 or 24 bits for a 32-bit floating point value (I’m not sure whether the hidden bit should be counted). Your value of 8388608 is for 23 bits (i.e. without the hidden bit), and I think that is probably right. However, within the period of the PRNG, each of those values can appear multiple times. I have seen reports of periods longer than 10^9000. My previous comments were about why all of the other floating point numbers in the [0,1) range are not sampled. It seems like they should be, as long as the correct distribution is maintained, and as you point out, there are a whole bunch of them.

Another question I have about the fortran PRNG is why can’t it return integer values too? As is, you can get integer values from the floating point ones, but one must be careful when generating 2^32 integer values from the much smaller number of 2^23 floating point values. If random_number() would just return integers directly, then that is one less thing to worry about.

RonShepard said “However, within the period of the PRNG, each of those values can appear multiple times.” My favourite example of that sort of thing is the Gregorian calendar’s algorithm for finding the date of Easter. There are only 35 possible dates but the algorithm’s period is 5,700,000 years.

A typical random number generator keeps its state in a small array of integer values. This allows it to have a period much larger than 2**nbits, where nbits is the bit size of the integer. It uniformly samples 1/epsilon bins separated by epsilon uniformly distributed over the interval [0,1). (Technically it is epsilon/2 off from the centers of the bins.)

I don’t know why the the Fortran standard does not provide a random integer generator. It is easy to return an integer value from the integer array used in the floating point random number generator. You can then use a bit mask or the modulo function to down sample the integers to the range required. However if the number of integers is not a power of two simply applying the modulo function will not generate a uniform distribution over the values. This is typically handled by checking if the returned integer is greater than our equal to the the larges value that is a multiple of the number of integer values to be sampled and less than 2**nbits.

In order to sample the real number values uniformly over [0,1) using the non-uniform spacing of single precision floating point numbers you would have to do something like this:

  1. Generate 24 random bits
  2. If the highest order bit is 1 divide the upper half of the interval of interest into 2**23 bins and use the remaining 23 bits to randomly sample the bins.
  3. If the given bit of interest is 0 the interval of interest is the lower half of the current interval. Recurse to 1 unless lower half is the interval of denormals.
  4. If the lower half is denormals generate 23 random bits, divide the lower half into 2**23 bins, and use the 23 bits to ample those bins.
  5. If you aren’t using denormals things become a bit trickier.

You can do this better by generating a uniform random 23 bits for the mantissa and generating the exponent in 1 step from an exponential distribution.