Computations with units (meters, seconds, ...)

Hi everyone,

I was reminded of this topic when reviewing the j3-fortran proposal on generic programming in Fortran (use cases): correctly handling units in computations. It is a topic with a long history and - as far as I know - no definitive answer. In fact, I tend to think there is no foolproof method to deal with the issue that still allows all the usual programming techniques. That is what I would like to work out in a paper/blog/whatever - a piece of hopefully coherent text :blush:. And yes, it is a hobby horse, but a serious one.
My questions right now are:

  • What solutions have you seen or used?
  • What requirements are to be imposed?
  • What (fundamental) problems have to be tackled?

For starters: I know that the ideal solution would be for the compiler to check your expressions and throw an error if you attempt to add 0.2 m to 5 K. I know of a library by Grant Petty (PHYSUNITS) that deals with this sort of thing at run-time. I have been experimenting with a different approach myself, geared to the typical problems I see with less formal units - grams chlorophyll per gram carbon, for instance.
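To make the run-time flavour of such a check concrete, here is a minimal sketch in Python (chosen only for brevity; it is not Fortran and not the PHYSUNITS API). The `Quantity` class, the `try_add` helper and the three-exponent dimension tuple are all hypothetical names made up for this illustration.

```python
from dataclasses import dataclass

# Dimension exponents in the order (length, time, temperature).
# This ordering is an assumption made for the sketch.
LENGTH = (1, 0, 0)
TEMPERATURE = (0, 0, 1)


@dataclass(frozen=True)
class Quantity:
    value: float  # magnitude in the base unit of its dimension
    dim: tuple    # dimension exponents

    def __add__(self, other):
        # Addition is only meaningful for identical dimensions.
        if self.dim != other.dim:
            raise TypeError(f"cannot add dimensions {self.dim} and {other.dim}")
        return Quantity(self.value + other.value, self.dim)


def try_add(a, b):
    """Return a + b, or None if the dimensions do not match."""
    try:
        return a + b
    except TypeError:
        return None


# 0.2 m + 0.3 m is accepted; 0.2 m + 5 K is rejected at run-time.
total = try_add(Quantity(0.2, LENGTH), Quantity(0.3, LENGTH))
mismatch = try_add(Quantity(0.2, LENGTH), Quantity(5.0, TEMPERATURE))
```

The error only shows up when the offending line is actually executed, which is exactly the weakness of run-time checking compared with a compiler-based check.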

So, I would like to know your ideas on this topic.

I read that paper about CamFort (I have not tried it):

Contrastin, Mistral, Andrew Rice, Matthew Danish, and Dominic Orchard. "Units-of-Measure Correctness in Fortran Programs". Computing in Science & Engineering 18, no. 1 (2016): 102-107.
https://www.cl.cam.ac.uk/~acr31/pubs/contrastin-units.pdf

Here, we demonstrate how our freely available, open source tool, CamFort, provides a low-effort and automated way of detecting mismatched units-of-measure in code.

See also:
Orchard, Dominic, Andrew Rice, and Oleg Oshmyan. "Evolving Fortran Types with Inferred Units-of-Measure". Journal of Computational Science 9 (July 2015): 156-162.

SimCon has a project for Automatic Analysis of Physical Units and Dimensions. I don’t know if anything is ready for use.

Ada has support for physical units. It could be worth looking at what it does.

@everythingfunctional created quaff

Quantities for Fortran. Make math with units more convenient.

This library provides all the functionality necessary to almost treat quantities with units associated with them as though they were just intrinsic real values. However, since a quantity has its own unique type, you get some compile time safety that you don’t mix them up in your argument lists, and you don’t have to worry about doing unit conversions or remembering what units you’ve stored things in when you start doing math with them.

Turning a number into a quantity is as easy as defining what units that number is in, like 1.0d0.unit.METERS . And, if you need the number back out, just say what units you want the value in like time.in.SECONDS .

I usually have to deal with units in the context of electronic structure theory. By performing all calculations in atomic units, most prefactors and constants drop out of the equations.

For the actual calculations the dimensions and rank of the quantity are usually quite descriptive, e.g. an energy (scalar, Joule → Hartree) for a Cartesian geometry ([3, nat], Meters → Bohr) and its energy derivative w.r.t. displacements ([3, nat], Joule/Meter → Hartree/Bohr) go well together without extra annotations and can be easily distinguished from a charge distribution ([nsrc], Coulomb → unit charge) and its potential ([nsrc], Joule/Coulomb → Hartree/unit charge). The overall handling of those quantities feels quite natural in the actual computation.

For input quantities this can be more difficult, as they are usually given in a more human friendly unit system (Ångström for length, eV or kcal/mol for energies, g/L for mass densities, ...). Rigorously converting all input into the internal unit system and only converting it back for human facing output has worked quite well so far.
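As an illustration of converting only at the boundaries, here is a small Python sketch. The function names are hypothetical; the two conversion factors are the CODATA 2018 values for the Bohr radius and the Hartree energy.

```python
# CODATA 2018 conversion factors.
BOHR_IN_ANGSTROM = 0.529177210903  # 1 Bohr expressed in Ångström
HARTREE_IN_EV = 27.211386245988    # 1 Hartree expressed in eV


def length_to_internal(value_angstrom):
    """Convert a user-supplied length in Ångström to Bohr."""
    return value_angstrom / BOHR_IN_ANGSTROM


def length_to_output(value_bohr):
    """Convert an internal length in Bohr back to Ångström."""
    return value_bohr * BOHR_IN_ANGSTROM


def energy_to_output(value_hartree):
    """Convert an internal energy in Hartree to eV for printing."""
    return value_hartree * HARTREE_IN_EV


# Input side: convert once, right after reading.
bond_length_bohr = length_to_internal(0.74)

# ... all computation happens in atomic units ...

# Output side: convert once, right before printing.
energy_ev = energy_to_output(-1.0)
```

Everything between the two conversion calls only ever sees atomic units, so no unit bookkeeping is needed inside the computation itself.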

One thing I have to deal with frequently is different sign conventions, like the sign switch from energy gradients to forces (both Joule/Meter), or normalizations, like the difference between a virial and a stress tensor, where the latter is derived from the former normalized with the system's volume. Those conventions are seldom documented in existing code bases or libraries, and it takes some effort to find them out by trial and error.

Not sure if this is strictly related to units, but quantities expressed in Cartesian components or in spherical harmonics tend to be interesting.

The classic is the component ordering of the first moment, which can be

  • x (1), y (–1), z (0)
  • y (–1), z (0), x (1)
  • z (0), x (1), y (–1)
  • ...

I have seen all of the above and more in actual implementations, all have their merits and drawbacks. Not sure how this could be handled gracefully by a unit tracking tool.
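One way to at least make the convention explicit is to name the ordering and convert through labels rather than bare indices. The following Python sketch does that for the three orderings listed above; the `ORDERINGS` table and the `reorder` function are hypothetical names for this illustration.

```python
# Position of each Cartesian component in the three conventions listed above.
ORDERINGS = {
    "xyz": ("x", "y", "z"),
    "yzx": ("y", "z", "x"),  # sorted by m = -1, 0, 1
    "zxy": ("z", "x", "y"),  # m = 0, 1, -1
}


def reorder(components, source, target):
    """Permute a three-component first moment from one ordering to another."""
    by_label = dict(zip(ORDERINGS[source], components))
    return [by_label[label] for label in ORDERINGS[target]]


dipole_xyz = [1.0, 2.0, 3.0]                   # x, y, z
dipole_yzx = reorder(dipole_xyz, "xyz", "yzx")  # y, z, x
```

This does not detect a wrong ordering, it only forces the caller to state which one is assumed.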

See this proposal and the associated discussion:

This seems like something where generic programming is great; I really like the approach of defining e.g. a type Meters which is just a thin wrapper to real, but where +, -, *, / and ‘**’ act appropriately with the units to produce other types which are also just thin wrappers to real. If done right, this doesn’t change the compiled code at all, doesn’t change the written code apart from at variable declarations, and throws a compile-time error if you get your units wrong.

A quick web search points to e.g. this Rust library, which looks nice. I’m hoping that if some form of generics is included in Fortran 202Y then this will become possible in Fortran too.

In my own Fortran code I have lots of types like CartesianDisplacement, NormalModeForce and PhononWavevector which are all just wrappers for real(dp), allocatable :: vector(:) arrays. But this ends up requiring absolutely tons of duplicated code, so it’s not an ideal solution.

@Arjen, I suggest reviewing the prior discussion on this topic under the notion of “reliability” at comp.lang.fortran that you may recall: please see this thread.

As you may know, this is something that gets discussed every now and then in the context of statically typed, compiled programming languages for scientific and technical computing, particularly Fortran and C++.

Until now there have been no solutions that are fundamentally based on the physical nature of quantities and their dimensional analysis, that are truly compile-time, that are acceptable to both implementors (think commercial vendors here first) and practitioners, and that can then be integrated into the core language.

The generic programming feature, if done well in Fortran, holds the prospect, in the very distant future, of some progress on the more important aspect, the dimensionality of physical quantities in floating-point operations, and secondarily on unit-of-measure conversions in library and user code. That is, after 2040, by which time some compilers may have implemented the feature set reliably.

TLDR, I’ve spent a fair amount of time exploring this space and it’s really hard and there isn’t just one right way to do it.

The fundamental trade-offs I’ve found are between flexibility and convenience versus run-time performance versus maintainability versus fidelity (precision), with some other subtle nuances thrown in. In my library I focused on flexibility and convenience and low runtime cost, at the expense of maintainability (although I have ways to mitigate some aspects of that) and fidelity.

I have a type for each kind of quantity (e.g. length, time, etc.), and operators to convert to and from real numbers given one of a set of available units. Values are stored internally in SI units, and all of the mathematical operators you’d expect are available (e.g. adding two lengths together works, dividing a length by a time gives a speed, etc.). This design maximizes compile time safety (e.g. you can’t inadvertently assign a speed to an acceleration) and flexibility (e.g. you can add 1 m to 1 ft and get the answer in yards). And since I provide ways of going to and from strings, you can expose this flexibility to your users as well. Since unit conversions and tracking happen only when converting to or from a number, the run time cost of doing math is minimized.
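The "store in SI, convert only at the edges" design can be sketched in a few lines of Python. This is not the quaff API: the `Length` class, its `from_unit` and `to` methods and the `METERS_PER` table are hypothetical names, and the mix-up check here happens at run-time, whereas the Fortran types catch it at compile time.

```python
# Conversion factors to meters; the foot and yard values are exact by definition.
METERS_PER = {"m": 1.0, "ft": 0.3048, "yd": 0.9144}


class Length:
    """A length stored internally in meters."""

    def __init__(self, meters):
        self._meters = meters

    @classmethod
    def from_unit(cls, value, unit):
        # Unit conversion happens once, on the way in.
        return cls(value * METERS_PER[unit])

    def to(self, unit):
        # ... and once more, on the way out.
        return self._meters / METERS_PER[unit]

    def __add__(self, other):
        if not isinstance(other, Length):
            return NotImplemented  # refuse to add a bare number
        # Plain floating-point addition, no unit bookkeeping needed.
        return Length(self._meters + other._meters)


# Add 1 m to 1 ft and ask for the answer in yards.
total = Length.from_unit(1.0, "m") + Length.from_unit(1.0, "ft")
total_in_yards = total.to("yd")
```

The arithmetic itself works on plain numbers in one unit system, which is why the cost of tracking units is confined to the conversion calls.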

The maintenance cost is that there is a large number of combinations of operations between quantities that need to be supported, and a huge number of units that should be available. The other cost is possible loss of precision: if you’re doing calculations at either extreme (e.g. in light years or femtoseconds), then storing the values in meters or seconds might incur some loss of precision.

I’ve seen designs that take different approaches to balancing the costs and benefits. For example, you can reduce the loss in precision by making values in every different unit a different type, but that comes at the cost of flexibility (e.g. I can’t add feet and meters any more), or maintenance (e.g. now I have to manually support all of the possible operations between all different quantities and units).

One really cool example is the Haskell library Numeric.Units.Dimensional, which uses some really advanced features of the type system to minimize maintenance costs and run-time overhead, but only supports SI units. The Rust library Dimensioned is pretty cool too.

@pmk why do you think generic programming can’t handle dimensional analysis?

As far as I can tell, the Rust library I linked to does exactly that. It defines a bunch of basic unit types like Meters and Seconds, and then it has macros to define things like Meters . Seconds and Meters / Seconds. And then if you multiply something of type Meters / Seconds by something of type Seconds you get something of type Meters.
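The bookkeeping behind this is just integer arithmetic on dimension exponents: multiplication adds them, division subtracts them. The Python sketch below shows that arithmetic at run-time; it is not the Rust library's API, which performs the equivalent step in the type system at compile time. `Quantity`, `METERS` and `SECONDS` are hypothetical names for this illustration.

```python
from dataclasses import dataclass

# Dimension exponents in the order (meter, second); an assumption of this sketch.
METERS = (1, 0)
SECONDS = (0, 1)


@dataclass(frozen=True)
class Quantity:
    value: float
    dim: tuple

    def __mul__(self, other):
        # Multiplying quantities adds their dimension exponents.
        dim = tuple(a + b for a, b in zip(self.dim, other.dim))
        return Quantity(self.value * other.value, dim)

    def __truediv__(self, other):
        # Dividing quantities subtracts their dimension exponents.
        dim = tuple(a - b for a, b in zip(self.dim, other.dim))
        return Quantity(self.value / other.value, dim)


distance = Quantity(10.0, METERS)
duration = Quantity(4.0, SECONDS)

speed = distance / duration  # exponents (1, -1), i.e. Meters / Seconds
back = speed * duration      # exponents (1, 0), i.e. Meters again
```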

If you replace Rust macros with a Fortran preprocessor, I don’t see why Fortran couldn’t do something similar.

Thanks for all these pointers - I knew there was a lot of stuff out there. The overall aspects to consider are:

  • Static versus dynamic checking
  • Dimensions versus units
  • Ease of use and maintenance

What I have in mind is to sketch the above issues and how they might influence an ultimate solution. Anyway, more food for thought :slight_smile:

I’d be interested to see what you come up with. And if you’re looking for co-authors I’d be happy to help.

Well, I gladly accept that offer :slight_smile: My idea is to examine the problem from first principles first. Then see how the requirements that result are addressed by existing solutions. My gut feeling at the moment is that there is no satisfactory solution for all requirements. But I should put that on paper before speculating too much. And of course read the material referenced upthread.

This is yet another unit library (here in Nim), which says there is no runtime overhead and supports multiplications of units etc. (so it might be similar to the Rust one above?)

Ah, thanks - from what I have read so far, the compile-time approaches mostly guarantee dimensional correctness. Some of the run-time approaches also allow unit conversions, such as inches to centimeters and the like. I have not seen a list of requirements or wishes yet and that is what I want to focus on. BTW, SimCon’s approach is an interesting one, as it tries to determine the dimensions directly from the source code.

Stefano Zaghi has written a package of routines called FURY (“Fortran Units (environment) for Reliable phYsical math”), freely available on GitHub. I have not (yet) used it.

Interesting, yet another possibility - mind you, there is much more to say about this than it would seem at first.

FYI, Brad Richardson and I are working on a paper about this topic. The purpose is to make an inventory of what use cases (or perhaps better usage patterns) there are, what they mean in terms of requirements to any programming solution and how well the existing and perhaps envisaged solutions support these requirements.

@Arjen @everythingfunctional you may also want to talk with @arclight who has spent a lot of time thinking about this problem.

Thanks for the tip!