How to conceal derived-type components?

From MRC:

Once the implementation details of a module have been separated into submodules, the text of the module itself can be published to provide authoritative documentation of the interface without exposing any trade secrets contained in the implementation.

If I have a module which provides a derived type, how do I hide the members of the type? I’m not concerned with visibility, but with hiding implementation details (e.g. the names of present fields and their types).

The only option I see is encapsulating any private members into their own type, which remains hidden in a separate module:

module patient_data
   
   use private_info, only: private_patient_info
   implicit none
   public 

   private private_patient_info

   type, public :: patient
      character(:), allocatable :: forename, surname
      type(private_patient_info), private :: info
   end type

   interface
      !> Age in years
      module function age(p)
         ! age calculated from contents in patient%info
         ! however we keep it secret how age is actually stored
         type(patient), intent(in) :: p
         integer :: age
      end function
   end interface

end module

Are there any other possibilities? The solution above uses composition, is there an option using inheritance?

C++ uses the pimpl idiom (which I believe you are referring to).
https://en.cppreference.com/w/cpp/language/pimpl

C uses opaque types:

In the thread Should we avoid assignment of derived types in robust programs? - #7 by ivanpribec we established that for maximum robustness, pointer types should be hidden in their own “dirty” type.

In the pimpl idiom you link, they use a forward declaration. Any private members are hidden in the struct providing the implementation. Since Fortran doesn’t have forward declarations, I cannot put the private_patient_info type directly in the submodule defining the procedures.

It seems like Fortran needs it’s own idiom. The procedures are implemented in submodule, while any private declarations must go to a separate module.

Here’s a more specific (imaginary) example.

Imagine we are developing a blood measurement device, and our partner company provides us with a module to register blood samples and store them in a database:

module blood_sample_registry

implicit none
public

integer, parameter :: rk = kind(1.0e0)

type :: blood_sample
   integer :: id
   character(2), private :: blood_type
   real(rk), private :: sodium
   real(rk), private :: hematocrit
end type

interface
   module function register_sample(blood_type,sodium,hematocrit) result(sample)
      character(2), intent(in) :: blood_type
      real(rk), intent(in) :: sodium, hematocrit
      type(blood_sample) :: sample
   end function
   module subroutine write_sample(fileunit,sample)
      integer, intent(in) :: fileunit
      type(blood_sample), intent(in) :: sample
   end subroutine

end interface

end module
program blood_device

   use blood_sample_registry
   implicit none

   type(blood_sample) :: sample 
   integer :: database

   sample = register_sample( &
      blood_type='AB', &
      sodium=130., &
      hematocrit=40.1)

   open(newunit=database,file="blood_database.txt",access='stream')
   call write_sample(database,sample)
   close(database)

end program

Someone who happens to come across the database file, could easily run it through a hexdump:

$ hexdump -C blood_database.txt 
00000000  01 00 00 00 41 42 00 00  02 43 66 66 20 42 02 00  |....AB...Cff B..|
00000010  00 00 30 30 00 00 07 43  66 66 14 42              |..00...Cff.B|
0000001c

It’s easy enough to notice the AB and 00 symbols. A minute later the malevolent person runs a $ grep -nr "blood_sample", and is quickly able to locate a left-over module source file which was unsuccessfully deleted after installation. Using the type definition, the person can easily decode the whole thing:

print *, transfer(int(z'0001'),1)
print *, achar(int(z'41')), achar(int(z'42'))
print *, transfer(int(z'43020000'),1.0)
print *, transfer(int(z'42206666'),1.0)
end
           1
 BA
   130.000000    
   40.0999985  

The quote from MRC that submodules can help conceal implementation secrets is only valid for procedures. Derived-type components however, are left exposed. They need to be hidden within a separate derived type.

Full example
!> Hidden
module blood_private

   implicit none

   integer, parameter :: rk = kind(1.0e0)

   type :: blood_sample_private
      ! Members should be public if we don't want to use getters
      character(2) :: blood_type
      real(rk) :: sodium
      real(rk) :: hematocrit
   end type

end module

!> Visible to clients for integration in blood measurement devices
module blood_sampler

   use blood_private, only: rk, blood_sample_private

   implicit none
   private

   public :: blood_sample, register_sample, write_sample, rk

   type :: blood_sample
      integer :: id
      type(blood_sample_private), private :: data
   end type

   interface
      module function register_sample(blood_type,sodium,hematocrit) result(sample)
         character(2), intent(in) :: blood_type
         real(rk), intent(in) :: sodium, hematocrit
         type(blood_sample) :: sample
      end function
      module subroutine write_sample(fileunit,sample)
         integer, intent(in) :: fileunit
         type(blood_sample), intent(in) :: sample
      end subroutine

end interface

end module

!> Submodule is not shared, but provided precompiled as a shared library
!>
submodule (blood_sampler) blood_sampler_actions
contains

   ! how we assign ID's, how we validate sample values should be hidden
   ! so no one can tamper with it without having to disassemble the binary objects

   module function register_sample(blood_type,sodium,hematocrit) result(sample)
     character(2), intent(in) :: blood_type
     real(rk), intent(in) :: sodium, hematocrit
     type(blood_sample) :: sample

     integer, save :: id = 0

     id = id + 1

     sample%id = id
     sample%data = blood_sample_private(blood_type,sodium,hematocrit)

   end function

   module subroutine write_sample(fileunit,sample)
      integer, intent(in) :: fileunit
      type(blood_sample), intent(in) :: sample
   
      write(fileunit) sample%id, &
                      sample%data%blood_type, &
                      sample%data%sodium, &
                      sample%data%hematocrit

   end subroutine

end submodule


program blood_laboratory

   use blood_sampler
   implicit none

   type(blood_sample) :: sample(2) 
   integer :: database

   sample(1) = register_sample( &
      blood_type='AB', &
      sodium=130., &
      hematocrit=40.1)

   sample(2) = register_sample( &
      blood_type='00', &
      sodium=135., &
      hematocrit=37.1)

   open(newunit=database,file="blood_database.txt",access='stream')

   call write_sample(database,sample(1))
   call write_sample(database,sample(2))

   close(database)

end program

I haven’t 100% thought of the implications, but it looks like you could very well achieve that with an abstract derived type: that would be defined as an empty type (or at least, only containing the publibly available components), which you then extend to your private type. But, you can release the abstract interface as an API.

module patient_data
   implicit none
   public 

   type, abstract, public :: patient
      character(:), allocatable :: forename, surname
     contains
      procedure(age), deferred :: get_age
   end type

   abstract interface
      !> Age in years
      elemental integer function age(p)
         ! age calculated from contents in patient%info
         ! however we keep it secret how age is actually stored
         class(patient), intent(in) :: p
      end function
   end interface

end module
2 Likes

If Fortran had support for forward declarations then a variation on the C++ pimpl idiom, though with an allocatable component, would be a neat solution. A while back I made another solution based on inheritance here: Compilation time vs. C++ - #9 by plevold

I have to admit that I haven’t had the need to use this in production code yet, but I think it should work fine in many/most situations. I think the main disadvantage is the verbosity. For reference here is the complete example from my previous post:

mytype.f90:

module mytype_mod
    implicit none

    private
    public mytype_t
    public mytype_factory

    type, abstract :: mytype_t
        private
    contains
        procedure(public_sub), deferred :: public_sub
    end type

    interface
        subroutine public_sub(this)
            import mytype_t
            class(mytype_t), intent(inout) :: this
        end subroutine

        module function mytype_factory(i) result(this)
            integer, intent(in) :: i
            class(mytype_t), allocatable :: this
        end function
    end interface

end module

mytype_impl.f90:

submodule(mytype_mod) mytype_impl
    implicit none

    type, extends(mytype_t) :: mytype_impl_t
        integer :: i
    contains
        procedure :: public_sub => public_sub_impl
        procedure :: private_sub
    end type

contains

    module function mytype_factory(i) result(this)
        integer, intent(in) :: i
        class(mytype_t), allocatable :: this

        allocate(this, source=mytype_impl_t(i))
    end function


    subroutine public_sub_impl(this)
        class(mytype_impl_t), intent(inout) :: this

        write(*,*) 'This is public sub for mytype_impl_t with i  = ', this%i
        call this%private_sub()
    end subroutine


    subroutine private_sub(this)
        class(mytype_impl_t), intent(inout) :: this

        write(*,*) 'This is private sub for mytype_impl_t with i = ', this%i
    end subroutine

end submodule

main.f90

program main
    use mytype_mod, only: mytype_t, mytype_factory
    implicit none

    class(mytype_t), allocatable :: mytype

    mytype = mytype_factory(42)
    call mytype%public_sub()
end program

When run, it gives me the following output:

 This is public sub for mytype_impl_t with i  =  42
 This is private sub for mytype_impl_t with i =  42

That’s exactly what I was suggesting. But at the end of the day, if you need a public API you’re most likely going to share code outside of Fortran, so you’ll need a C API for portability…

Here’s a variation using blood_sample as an ‘opaque type’. It is just a ‘handle’ to the actual type. You could implement this as a pointer instead of an integer. The actual type is contained in the submodule. Does this meet your definition of hidden?

!> Visible to clients for integration in blood measurement devices
module blood_sampler

   implicit none
   private

   integer,parameter :: rk = 4

   public :: blood_sample, register_sample, write_sample, rk

   type :: blood_sample
      integer :: id
   end type

   interface
      module function register_sample(blood_type,sodium,hematocrit) result(sample)
         character(2), intent(in) :: blood_type
         real(rk), intent(in) :: sodium, hematocrit
         type(blood_sample) :: sample
      end function

      module subroutine write_sample(fileunit,sample)
         integer, intent(in) :: fileunit
         type(blood_sample), intent(in) :: sample
      end subroutine

end interface

end module

!> Submodule is not shared, but provided precompiled as a shared library
!>
submodule (blood_sampler) blood_sampler_actions
   type :: blood_sample_private
      ! Members should be public if we don't want to use getters
      character(2) :: blood_type
      real(rk) :: sodium
      real(rk) :: hematocrit
   end type

   type(blood_sample_private), allocatable :: samples(:)

contains

   ! how we assign ID's, how we validate sample values should be hidden
   ! so no one can tamper with it without having to disassemble the binary objects

   module function register_sample(blood_type,sodium,hematocrit) result(sample)
     character(2), intent(in) :: blood_type
     real(rk), intent(in) :: sodium, hematocrit
     type(blood_sample) :: sample

     integer, save :: id = 0

     id = id + 1

     sample%id = id
     samples(id) = blood_sample_private(blood_type,sodium,hematocrit)

   end function

   module subroutine write_sample(fileunit,sample)
      integer, intent(in) :: fileunit
      type(blood_sample), intent(in) :: sample
      integer :: id

      id = sample%id
   
      write(fileunit) samples(id), &
                      samples(id)%blood_type, &
                      samples(id)%sodium, &
                      samples(id)%hematocrit

   end subroutine

end submodule


program blood_laboratory

   use blood_sampler
   implicit none

   type(blood_sample) :: sample(2) 
   integer :: database

   sample(1) = register_sample( &
      blood_type='AB', &
      sodium=130., &
      hematocrit=40.1)

   sample(2) = register_sample( &
      blood_type='00', &
      sodium=135., &
      hematocrit=37.1)

   open(newunit=database,file="blood_database.txt",access='stream')

   call write_sample(database,sample(1))
   call write_sample(database,sample(2))

   close(database)

end program


From a security point of view, it doesn’t matter much what you do to hide implementation details if your attacker controls the database file unencrypted AND the binaries. Anyone with enough time/interest and the skill required will be able to extract the data using a rev. eng. tool.

If you really want to protect a fortran program like the one in the example, you shouldn’t trust the client and create a webservice of some sort, where the “only” attack vector will be the user’s input.

Not that this is the only option, but it is by far the simplest.

Adding an encryption method could work, but again if your client controls the binary anything can happen as you can intercept a function call and redirect to a custom function you made to dump the unencrypted info.

1 Like