Computations with units (meters, seconds, ...)

I may not understand fully what you mean. Perhaps the generic programming is something else. Based on the discussions above, I am sure I am missing something :sweat_smile:

But usually I will try to make the numerical calculations unitless first, which I believe is the best choice.
Recover the corresponding unit at the end.
So I do not need to carry unit during computations.

A toy example (sorry for grammar error, just donā€™t bother to fix),

It seems you have clear mathematical descriptions of the problem you want to solve. Such problems lend themselves well for that approach. But it becomes impractical if you are dealing with semi-empirical formulae. Not impossible, but impractical. An example we describe in the article is a so-called Q-H relation, which links the flow rate of a river to a water level. Something like: Q = a * h**2.6 (Q in m3/s, h in m above some reference level, a an emprical constant with an unruly unit). Even if you convert the quantities into dimensionless constants, you still have to be aware that the water level is measured to some reference level.
The impracticality in general comes from it being simpler to think in terms of actual sizes than in terms of dimensionless quantities. So, quite often, people do not even think of a proper scaling to avoid the intracies of units.

1 Like

There is a serious bug in these two lines of your with-units version of function Laplace. The dependent variable u has units (of temperature, electric potential, etc.), therefore, the constants 1 and 0 must also have the same units (and, perhaps, a reference base such as 0 kelvin or 0 deg. F.). I know that this is not working code, but it provides a useful test case that someone trying to advance support for units should consider (at compile time or at run time, at this point it matters little to me, a skeptic). Would the units-support system that you champion have caught this bug? How would you correct the bug and modify the code to indicate that the 0 and 1 have the units of u? What if u was in deg C and the constant 1 was in deg. F?

Let us say that I have two variables dTa and dTb, both representing temperature changes. One of them is in Celsius and the other in Kelvin. What should I do with the expression (dTa - dTb)?

If a unit-checking software was used and failed to catch an error such as the example above, who is to be held responsible?

Much has been written about compile-time versus run-time unit checking and the impact on performance of software. I worry that the proposed solutions involve much additional coding time and may lead to bloated code with all the additional declarations needed for unit checking.

Iā€™m afraid that the solutions proposed may be far more painful than the disease, which itself is very infrequent (among well-trained professionals) and easily addressed by traditional means.

2 Likes

@mecej4 excellent points. Thatā€™s why we have to have a prototype and actually try it and use it before we even attempt to standardize it.

Regarding the fact that u(1,:)=1 has units of [K] for u, but no units for 1, how would you design it so that you can assign to arrays with unit? You have to assign numbers to them at some point. One can either do:

integer, unit(K) :: x
x = 1
u(1,:) = x
x = 0
u(2:,:) = x

Which seems awkward and we still have the problem with x = 1, where x has a unit of [K], but 1 is unit less. Another option is:

u(1,:) = 1_K
u(2:,:) = 0_K

Or something like that, the unit becomes something like ā€œkindā€. Not sure about the syntax. A third option is just what was there before:

u(1,:) = 1
u(2:,:) = 0

And the idea is that an integer literal like 1 and 0 is simply unitless and it assumes the unit of the LHS. But if you use a variable like u(1,:)=x, then x and u has to have the same units. I think this is the approach I was imagining above. Is that not a good idea?

1 Like

Ondřej, I am relieved and pleased to read your response. There are several different issues to consider.

Where did the 0 and 1 come from? The physical problem may have been: A plate with initial temperature T_0 has boundaries fixed at (uniform and steady) T_w. A mathematician would naturally replace T_w by 0 and T_0 by 1 within a blink. Decades ago, I took a calculus course years after my physics courses, and it took me months to come to grips with how much ice mathematicians hoarded to keep boundaries at 0 degrees, and how they managed to get thermometers to read 0.123 degrees, and where they obtained the infinite heat flows that they would need to change the boundary temperature in no time.

The programmer and the users may use different units, and the program would have to accommodate all its users. Most inputs that are represented as real or complex numbers, whether supplied as constants in expressions, in declarations with Parameter attribute, as return values from functions and subroutines or data read from input files, may have units, and there has to be a standard way of stating those units in inputs and outputs and lines of program code. To avoid bloat, some restrictions may be needed on the number of variables and constants that have units.

If a program reads data from an input file and some of that data has units, errors cannot be checked except when the program runs. If the data is available before the program is run, one could run a data-checking program on the input file as a first step. This, however, cannot be done if the data is coming from, for example, a survey or a measuring instrument.

In any expressions involving quantities with units, functions can have argument(s) with units only if their Maclaurin series have exactly one term.

Excellent point. Here is another way to do the example above then:

real(dp), parameter, unit(K) :: T1 = 1, T2 = 0
u(1,:) = T1
u(2:,:) = T2

One can imagine a restriction that you can only assign a number if it is first assigned to a parameter (with a unit). However, you still want to allow expressions like (2*(dx**2 + dy**2)), where dx has units. An alternative thus is the following:

real(dp), unit(K) :: u(N,N)
u(1,:) = 1 * K
u(2:,:) = 0 * K

And unless the unit is explicitly specified, whether via dx or via K, it would give an error.

I think thatā€™s fine, the read function in Fortran would check the units at runtime and fail if the unit does not agree. There would be some mechanism to handle the error in the program if desired.

Are you talking about functions like sin(x)? Those are unitless, so that shouldnā€™t be a problem. Do you have some examples of functions that accept arguments with units?

In the original physical equations as well as their numerical implementations indeed one can always enforce units. However I am not sure if in practice there might be some issue with such restrictive checking. So we would need to also provide a mechanism to explicitly override unit checking (to allow assigning incompatible units or to strip units). Then I think it might work in all cases.

As many as you want. Consider, for example, the simple pendulum. The formula for the period of a simple pendulum, in the Wikipedia article on the pendulum, could be programmed as

real function T_pend(L, g, theta_0)

Such a mechanism would amount to sabotage!

Who selects what part of a code is not to be checked, and who decides that it is safe to do so? When the program prints its results, is there a record in the results that some unit checks were turned off, and which ones they were? How do you contend with the senior engineer/physicist/chemistā€¦ who says ā€œI donā€™t make mistakes with units!ā€ and programs accordingly?

1 Like

In an earlier post in this thread, I wrote:

The famous Mars Climate Orbiter (MCO) mishap of 1999 involved many causes and involved the reading/transfer of impulse inputs from an ā€œAMDā€ file furnished by Lockheed Martin to a JPL control program. The specification of the AMD file contains this:

COLUMNS    CONTENT             DESCRIPTION
------------------------------------------------------------
  1-11     "THRUSTER 12"       Record label. This record is
                               always present, even if
                               thruster is inactive.
 12-12     " "                 Blank space
 13-23     "NO._PULSES="       Label to indicate the number
                               of pulses fired during the
                               course of a single desaturation
                               event.
 24-26      I3                 Integer number of pulses for
                               thruster 12. If = 0, this
                               thruster is inactive. If
                               not = 0, thruster is active.
 27-27     " "                 Blank space
 28-36     "DEADTIME="         Label to indicate time between
                               pulses for thruster 12.
 37-41     F5.2                Time in seconds between pulses.
 42-42     " "                 Blank space
 43-53     "PULSEWIDTH="       Label of the width or span of
                               the pulse for thruster 12.
 54-58     F5.3                Duration or width of the pulse
                               of the active thruster in
                               seconds.
 59-59     " "                 Blank space
 60-71     "IMPULSE BIT="      Label specifying the impulse per
                               pulse for thruster 12.
 72-79     F8.5                Value of the impulse/pulse in
                               "Newton seconds" for active
                               thruster 12.

Note the information that is supposed to be found in columns 72-79.
The same document contains sample AMD file data under " Appendix: SAMPLE WRAPPED FILE". Data lines, in the format specified above, are as follows:

THRUSTER 01 NO._PULSES= 31 DEADTIME= 8.0 PULSEWIDTH=0.10 IMPULSE BIT= 0.44000
THRUSTER 02 NO._PULSES=  0 DEADTIME= 8.0 PULSEWIDTH=0.10 IMPULSE BIT= 0.44000
THRUSTER 03 NO._PULSES=  0 DEADTIME= 8.0 PULSEWIDTH=0.10 IMPULSE BIT= 0.44000

There is nothing in the data file itself to indicate that the numbers (the 0.44000 values in columns 72-79) are in the specified units. If they are in pounds(force).seconds, instead of newton.seconds, as specified, none of the unit conversion/checking software schemes that we have discussed at length in this thread would have caught the error.

What I have written in these paragraphs is based on a snailā€™s eye look at a very limited, but pertinent, piece of evidence. It is extremely likely that the full explanation of the mishap is enormously more complicated. For a sample of other issues, see the IEEE Spectrum article.

1 Like

Great example. The link you provided shows the exact period which depends on the amplitude \theta_0. The actual physical units of L and g get transferred via sqrt to T. The Taylor series of the \sin^{2n}\left(\theta_0\over2\right) functions is unitless. To be specific, it assumes that the angle \theta_0 is in radians and treats it as unitless. You can take as many terms as you want and the units will work.

Itā€™s a good questions how we should encode radians vs degrees in the units to make the above work.

I can answer this part: it would be similar to Rustā€™s unsafe (and is some sense also unwrap). Rust is safe by default, but provides mechanisms to override it.

So if you override units, it means that part of the code is not safe. And as a consequence the whole code might not be safe. So one way to use these features is to isolate those into one or two modules, and those modules have to be ā€œcertifiedā€ that they are correct (using whatever process you have in your organization). The rest of the code will be ā€œcertifiedā€ by the compiler itself.

I would say if we are going to do units, letā€™s do them in a way that would prevent such a mishap. So we need a mechanism in read and write that saves the unit in the file (from one Fortran code) and reliably reads it in (and checks at runtime) in another Fortran code. There are many ways how that could be done in both binary and text files and we can brainstorm some solutions.

2 Likes

A safe maxim is to think of y = sin x as a solution of d^2 y/dx^2 = - y, where x and y are real. No need to think about geometry, plane or spherical triangles, etc. No need to think of x as an ā€˜angleā€™, and create the name ā€˜radianā€™ for an ordinary real variable with no units. Unfortunately, we tend to remember concepts in the order in which we were taught those concepts.

Sure, I only use radians. But my understanding is that many engineering applications use degrees. Fortran is even getting trigonometric functions that accept degrees. So might be a good idea to ensure one does not accidentally plug in values in degrees.

It may not be necessary to extend READ to label units. What we are trying to do in fpt is to infer the relationships between units. The user may, but is not required to provide any labels. David Flower suggested a system where known physical constants, for example, c = 2.997E8, could be picked up and the units inference propagated. Similarly many intrinsics require inputs or produce outputs which are dimensionless and this places constraints on the relationships between associated variables. Values read in should have the same units as the variables they are read into, and we go from there.

1 Like

A problem that arose in mathematics of units, rather than computing with them, when I was teaching calculus was the arbitrary constant in integral (1/x) dx.
First-year students usually wrote log(x) + c. Some wrote log(|x|) + c. The integral often makes sense when x has units, but log(x) does not. (We can find log(2.0) but not log(2.0 m/s).) So I always told them that in this case a better answer is log(x/c) where c has the same units as x and, for first-year students, the same sign as x. Similarly, if n/=-1 then integral xn dx = x(n+1)/(n+1) + c, where c has the units of x**(n+1) and does not cause trouble. I particularly hated when teaching complex analysis that many students still thought integral(1/z) dz = log(|z|) + c. when z is complex.

2 Likes