Tagged pointers in Fortran?

I was wondering if anyone has experience with tagged pointers in Fortran?

For those unfamiliar with this term:

In computer science, a tagged pointer is a pointer (concretely a memory address) with additional data associated with it, such as an indirection bit or reference count. This additional data is often “folded” into the pointer, meaning stored inline in the data representing the address, taking advantage of certain properties of memory addressing.

See also:

A usage for tagged pointers can be found in data structures such as linked lists (e.g. if you need to check if node stores a True or False value) or in red and black search trees (the tag marks if the node is black or red) or in a kd-tree (to mark if a node is a leaf node).

1 Like

I don’t have experience with this in Fortran itself, but I am considering using those in LFortran to allow faster arbitrary integer arithmetic (smaller integers are represented as an int, and if they are bigger, the int becomes a pointer to an arbitrary precision integer from gmp most likely).

I will also experiment with using tagged pointers to represent some of the most commonly used nodes in AST / ASR, such as integers or variables.

@ivanpribec,

Owing to a lack of Generics in Fortran, are you resorting to unlimited polymorphism CLASS(*), POINTER and is that why you are looking at tagged pointers? In some languages such as C, a pointer can basically be seen as void * regardless of its dressing and tagging can become important, especially on memory constrained or restricted platforms. It’s rare to face such issues with Fortran for the typical use cases in scientific and technical computing unless one is resorting to unlimited polymorphism in which case type safety is severely compromised.

If this is what’s of interest to you, have you considered a simple Fortrannic solution using an abstract “base” node that encapsulates suitable “additional data” for tagging and using that as the type of pointer in your Fortran data structures (linked lists, trees, etc.)? Your real nodes can be subclasses of this abstract type.

1 Like

In some sense, Fortran pointers are already “tagged”. For example, a pointer to an array knows it’s shape. However, the programmer doesn’t really have “access” to what the tags contain, they’re just there to make the language feature “safer”. If you’re after something more, then I would recommend using the pattern suggested by @FortranFan

@FortranFan:

I was just curious if this is possible after watching a lecture from the D developers conference “Bit Packing like a Madman”. The author gives an example of a red-and-black tree, where the nodes are marked as red and black by flipping a bit in the pointers to the child nodes. Since the pointer does not need to be dereferenced to check the color of the node, this can apparently increase performance.

While I admit this idea is kind of hacky and not in line with Fortran’s concepts of type safety, it might still be possible by misusing the C binding mechanisms.

For example with my processor (compiler), the following code

program test

  use iso_c_binding
  use iso_fortran_env
  implicit none 

  type, bind(c) :: my_type
    integer(c_int) :: x
    real(c_float) :: r(5)
    character(kind=c_char) :: c(5)
  end type

  type(my_type) :: mt

  write(*,*) "Size of whole instance: ", c_sizeof(mt)
  write(*,*) "Total size of the components: ", c_sizeof(mt%x) + c_sizeof(mt%r) + c_sizeof(mt%c)

end program

produces the output

 Size of whole instance:                    32
 Total size of the components:                    29

This means there are 3 bytes of padding, which I could use to store other metadata.


Edit: Even without the bind(c), I still get three bytes of padding (in this case I need to use the intrinsic storage_size which returns the size of the instance mt in bits).

In the presentation I linked above, the author used tagged pointers to implement value type polymorphism in a his prototype D compiler (SDC), and thereby avoid the need for a virtual table.

For situations where wasted storage space in data structures can be of great concern, you may find value in making inventive use of padding in structures. However, note portability may be highly restricted for any such details will likely vary significantly from compiler to compiler and for the same compiler with the ABI going from one architecture to another. You can also run into considerable variation with compiler options and/or directives toward struct alignment and padding.