Using Unicode Characters in Fortran

I’m not sure that some of this is good advice. What you are doing here seems akin to stuffing double precision reals into an array of single precision reals: it sort of works under some circumstances, but I wouldn’t recommend it. I think using selected_char_kind('ISO_10646') is the correct way. See my JSON-Fortran library, which does support Unicode. And yes, it isn’t currently supported by ifort (what gives, Intel?), and yes, you have to write multiple versions of routines (but you have to do the same for different real kinds, so that is not unexpected).
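To illustrate the “multiple versions of routines” point, here is a minimal sketch of a generic interface with one specific procedure per character kind. The module and procedure names (char_count_mod, count_char) are made up for this example and are not taken from JSON-Fortran. It assumes a compiler where selected_char_kind('ISO_10646') returns a valid kind that differs from the default kind (as with gfortran).

```fortran
module char_count_mod
  implicit none
  private
  public :: count_char, CK

  ! Assumption: the compiler supports ISO_10646. If it does not,
  ! selected_char_kind returns -1 and this module will not compile.
  integer, parameter :: CK  = selected_char_kind('ISO_10646')
  integer, parameter :: CDK = selected_char_kind('DEFAULT')

  ! One generic name, one specific routine per character kind,
  ! just as you would do for different real kinds.
  interface count_char
    module procedure count_char_default, count_char_ucs4
  end interface count_char

contains

  ! Count occurrences of character c in a default-kind string.
  pure function count_char_default(s, c) result(n)
    character(kind=CDK, len=*), intent(in) :: s
    character(kind=CDK, len=1), intent(in) :: c
    integer :: n
    integer :: i
    n = 0
    do i = 1, len(s)
      if (s(i:i) == c) n = n + 1
    end do
  end function count_char_default

  ! Same logic for an ISO_10646 string.
  pure function count_char_ucs4(s, c) result(n)
    character(kind=CK, len=*), intent(in) :: s
    character(kind=CK, len=1), intent(in) :: c
    integer :: n
    integer :: i
    n = 0
    do i = 1, len(s)
      if (s(i:i) == c) n = n + 1
    end do
  end function count_char_ucs4

end module char_count_mod

program demo_generic
  use char_count_mod
  implicit none

  ! Both calls resolve through the same generic name; each counts 3.
  print *, count_char('banana', 'a')
  print *, count_char(CK_'banana', CK_'a')
end program demo_generic
```

The two function bodies are identical apart from the kind, which is exactly the duplication you already accept for real kinds.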

Consider this file (‘unicode.txt’):

😀😎😩

And the following code:

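program test

use iso_fortran_env, only: output_unit

implicit none

! Character kind for ISO 10646 (UCS-4); -1 if the compiler lacks support
integer,parameter :: CK = selected_char_kind('ISO_10646')

character(kind=CK,len=3) :: s
integer :: iunit

! Re-open standard output so that it is written as UTF-8
open(output_unit,encoding='utf-8')

! Read the first line of the UTF-8 encoded file into the UCS-4 string
open(newunit=iunit,file='unicode.txt',status='OLD',encoding='UTF-8')
read(iunit,'(A)') s
close(iunit)

write(output_unit,*) s
write(output_unit,*) 'len(s) = ', len(s)
write(output_unit,*) 's(1:1) = ', s(1:1)

end program test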

This prints:

😀😎😩
 len(s) =            3
 s(1:1) = 😀

Notice that the length is 3 (characters, not the 12 bytes the three emoji occupy in UTF-8) and that slicing returns a whole character.

However, I don’t think Fortran actually supports Unicode in source files. For example, when I try to do this:

s = CK_'😀😎😩'

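I get the warning “CHARACTER expression will be truncated in assignment (3/12) at (1) [-Wcharacter-truncation]” and s(1:1) will print as gibberish. The 3/12 suggests that the compiler treats the literal as 12 characters, presumably one per UTF-8 byte in the source file, and truncates it to fit the length-3 variable.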
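One way around this is to avoid non-ASCII literals entirely and build the string from code points with the char intrinsic. The following is a minimal sketch, assuming a compiler that supports ISO_10646 (I have gfortran in mind); the program name is just for illustration, and the code points are U+1F600, U+1F60E and U+1F629 for the three emoji above.

```fortran
program build_from_codepoints

  use iso_fortran_env, only: output_unit

  implicit none

  ! Assumption: selected_char_kind('ISO_10646') returns a valid kind.
  integer, parameter :: CK = selected_char_kind('ISO_10646')

  character(kind=CK, len=3) :: s

  ! Write standard output as UTF-8, as in the example above.
  open(output_unit, encoding='UTF-8')

  ! Build the string from code points, so the source file stays pure ASCII.
  s = char(int(z'1F600'), kind=CK) // &   ! grinning face
      char(int(z'1F60E'), kind=CK) // &   ! smiling face with sunglasses
      char(int(z'1F629'), kind=CK)        ! weary face

  write(output_unit, *) s
  write(output_unit, *) 'len(s) = ', len(s)
  write(output_unit, *) 's(1:1) = ', s(1:1)

end program build_from_codepoints
```

Since each char call yields exactly one character of kind CK, this should not trigger the truncation warning, and I would expect the output to match the file-based example.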
