Could someone please correct this code?

program utf
use iso_fortran_env, only: output_unit
implicit none

integer, parameter :: ucs4 = selected_char_kind('iso_10646')
character(len=100, kind=ucs4) :: line

open(10, file='test.txt', form='formatted', action='read', encoding='utf-8')
! the compiler did not accept encoding='ucs4'

read(10, '(A)') line
write(*, '(A)') line
end program utf

What I am trying to do is find a way to deal with the squiggles (accented characters) in code that manipulates text like “à l’école, j’ai été …”

For the code

implicit none
integer, parameter :: ucs4 = selected_char_kind("iso_10646")
print *, ucs4
end

gfortran version 14.0.1 20240121 gives output 4, but ifort version 2021.6.0 gives -1, meaning that ifort does not support this kind. What compiler and version are you using?

@Patrick, just add this line after open(10... :

open(output_unit, encoding='utf-8')

and the correct codes will be sent to your terminal. It works on mine with GFortran.

But as stated by @Beliavsky, Intel compilers ifort and ifx do not (yet) support ucs4.
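Putting the two suggestions together, a minimal sketch of the complete program might look like the following. It assumes gfortran (or another compiler that supports the iso_10646 kind and encoding='utf-8'), a UTF-8 encoded test.txt in the working directory, and a terminal that is itself set up to display UTF-8. The program name utf_echo and the iostat loop are additions for illustration and are not part of the original code.

```fortran
program utf_echo
  use iso_fortran_env, only: output_unit
  implicit none

  ! Assumption: this kind is available (it is -1 with ifort/ifx,
  ! in which case the declaration below will not compile).
  integer, parameter :: ucs4 = selected_char_kind('iso_10646')
  character(len=100, kind=ucs4) :: line
  integer :: ios

  ! The file is decoded from UTF-8 into UCS-4 while reading.
  open(10, file='test.txt', form='formatted', action='read', &
       encoding='utf-8', iostat=ios)
  if (ios /= 0) error stop 'cannot open test.txt'

  ! Re-open standard output so UCS-4 text is encoded as UTF-8 on output.
  open(output_unit, encoding='utf-8')

  ! Read until end of file instead of assuming a fixed number of lines.
  do
    read(10, '(A)', iostat=ios) line
    if (ios /= 0) exit
    write(output_unit, '(A)') trim(line)
  end do

  close(10)
end program utf_echo
```

Lines longer than 100 characters are truncated by the fixed-length buffer in this sketch.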

What is the actual encoding of the file you are trying to read? If you try to use ucs4, I’d guess you expect a full-Unicode file, with every character stored in 4 bytes. It is quite possible, however, that it is a UTF-8 file, which can be read and written using default characters. Even intrinsic functions such as index or len work on such strings. One has to remember, however, that the length as stored (in bytes) is greater than the number of characters visible.

program char8
  implicit none
  character(len=80) :: line
  character(len=:), allocatable :: s1, s2, s3
  integer :: i, l1, l2, l3

  open(11,file='test.txt', action='read')
  read(11,'(a)') line
  s1 = trim(line)
  read(11,'(a)') line
  s2 = trim(line)
  read(11,'(a)') line
  s3 = trim(line)
  l1 = len(s1)
  l2 = len(s2)
  l3 = len(s3)
  print '(a/ i3,2x,*(a2,z3))', s1, l1, (s1(i:i),ichar(s1(i:i)),i=1,l1)
  print '(a/ i3,2x,*(a2,z3))', s2, l2, (s2(i:i),ichar(s2(i:i)),i=1,l2)
  print '(a/ i3,2x,*(a2,z3))', s3, l3, (s3(i:i),ichar(s3(i:i)),i=1,l3)
  print *, index(s3,s1), index(s3,s1,back=.true.)
end program char8

with the test.txt file containing UTF-8 encoded text:

é
à l’école
j’ai été

gives the following output (both with gfortran and ifx):

$ gfortran char8.f90 && ./a.out
é
  2   � C3 � A9
à l’école
 13   � C3 � A0   20 l 6C � E2 � 80 � 99 � C3 � A9 c 63 o 6F l 6C e 65
j’ai été
 12   j 6A � E2 � 80 � 99 a 61 i 69   20 � C3 � A9 t 74 � C3 � A9
           8          11
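The gap between the stored length and the number of visible characters can also be computed directly. The following is a minimal sketch, assuming the string holds valid UTF-8 in the default character kind; utf8_length is a hypothetical helper name, and the accented character is built from its bytes so the result does not depend on the encoding of the source file.

```fortran
program utf8_count
  implicit none
  ! é in UTF-8 is the two bytes C3 A9 (195, 169).
  character(len=2), parameter :: e_acute = char(195) // char(169)
  character(len=:), allocatable :: s

  s = e_acute // 't' // e_acute          ! "été"
  print '(a,i0)', 'bytes:      ', len(s)
  print '(a,i0)', 'characters: ', utf8_length(s)

contains

  ! Counts code points by skipping UTF-8 continuation bytes.
  ! Assumption: str contains valid UTF-8.
  pure integer function utf8_length(str) result(n)
    character(len=*), intent(in) :: str
    integer :: i
    n = 0
    do i = 1, len(str)
      ! Continuation bytes look like 10xxxxxx; mask with 11000000 (192)
      ! and compare with 10000000 (128). Every other byte starts a character.
      if (iand(ichar(str(i:i)), 192) /= 128) n = n + 1
    end do
  end function utf8_length

end program utf8_count
```

For "été" this reports 5 bytes and 3 characters, in line with the hex dump above, where each é occupies two bytes.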

How do I find out which compiler I am using? I got it from msys64.

This is what I have now. There are still squiggles.

program utf
use iso_fortran_env, only: output_unit
implicit none

integer, parameter :: ucs4 = selected_char_kind('iso_10646')
character(len=100, kind=ucs4) :: line

open(10, file='test.txt',  encoding='utf-8')
open (output_unit,encoding ='utf-8')

end program utf

Re msz59

At line 10 of file char8.f90 (unit = 11, file = 'test.txt')
Fortran runtime error: End of file

Error termination. Backtrace:

Could not print backtrace: libbacktrace could not find executable to open

I get this error as well. I am totally confounded.
K:\>type test.txt
This is a testfile containing a squiggle j’ai visit├® l’h├┤pital ├á Paris

It is probably not a Fortran problem but a consequence of your terminal's configuration, which needs to be set up for the French language.

My terminal has these settings concerning language:

$ env | grep LANG

To handle special characters like “à l’école, j’ai été …” in Fortran, which may not be represented correctly using the default character kind, you’re on the right track by using UTF-8 encoding and attempting to specify a UCS-4 (4-byte Unicode) character kind. However, there are a few things to consider and potentially adjust in your code:

  1. Character Kind: UCS-4 (selected_char_kind('iso_10646')) is a good choice for handling a wide range of Unicode characters. It represents each character with 4 bytes, which is enough to hold any Unicode code point.
  2. File Encoding: In Fortran, the encoding= specifier in the open statement declares the character encoding of the file. Unfortunately, not all compilers support encoding='utf-8'. If your compiler doesn’t support this, it might not correctly interpret UTF-8 encoded files.
  3. Read and Write Statements: When you read from and write to the console or a file, ensure that the system you’re running on can handle UTF-8 or UCS-4 encoded text. This is more about the environment than the Fortran language itself.
  4. Compiler Support: Not all Fortran compilers support UCS-4 or UTF-8 encoding natively. You might need to check your compiler’s documentation for specific support and syntax.

Given these points, here are some suggestions:

  • Check Compiler Documentation: Ensure that your compiler supports UCS-4 and UTF-8. If it does, check the exact syntax and capabilities.
  • Alternative Approach: If UCS-4 is not supported, you might need to use a workaround. One common method is to read and process the text as a byte stream (using character(len=1), for example) and then convert it to the correct characters in your program. This approach is more complex but can be more portable across different compilers.
  • Test with Simple Text First: Start with a simple text file containing special characters to ensure that your read and write procedures work as expected.
  • Environment Considerations: Ensure that the environment where your program runs (like the console or the text editor) supports UTF-8 or UCS-4.
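The byte-stream workaround mentioned above could be sketched as follows. This is only an illustration, assuming well-formed UTF-8 input held in default-kind characters; next_code_point is a hypothetical helper, and the sample text is built from explicit byte values so the sketch does not depend on the encoding of the source file.

```fortran
program utf8_decode_demo
  implicit none
  ! à = C3 A0, right single quotation mark ’ = E2 80 99
  character(len=2), parameter :: a_grave = char(195) // char(160)
  character(len=3), parameter :: rquote  = char(226) // char(128) // char(153)
  character(len=:), allocatable :: s
  integer :: pos, cp

  s = a_grave // ' l' // rquote          ! "à l’"
  pos = 1
  do while (pos <= len(s))
    call next_code_point(s, pos, cp)
    print '(a,z4.4)', 'U+', cp
  end do

contains

  ! Decodes the UTF-8 sequence starting at str(pos:pos) into a code point
  ! and advances pos past it. Assumption: the input is valid UTF-8.
  subroutine next_code_point(str, pos, cp)
    character(len=*), intent(in)    :: str
    integer,          intent(inout) :: pos
    integer,          intent(out)   :: cp
    integer :: b, extra, k

    b = iand(ichar(str(pos:pos)), 255)   ! force the byte into 0..255
    if (b < 128) then                    ! 0xxxxxxx: plain ASCII
      cp = b;           extra = 0
    else if (b >= 240) then              ! 11110xxx: 4-byte sequence
      cp = iand(b, 7);  extra = 3
    else if (b >= 224) then              ! 1110xxxx: 3-byte sequence
      cp = iand(b, 15); extra = 2
    else                                 ! 110xxxxx: 2-byte sequence
      cp = iand(b, 31); extra = 1
    end if

    ! Each continuation byte contributes its low 6 bits.
    do k = 1, extra
      if (pos + k > len(str)) exit       ! truncated input: stop early
      cp = ishft(cp, 6) + iand(ichar(str(pos+k:pos+k)), 63)
    end do
    pos = pos + extra + 1
  end subroutine next_code_point

end program utf8_decode_demo
```

For the sample text this prints U+00E0, U+0020, U+006C and U+2019, one per line. Malformed input is not detected in this sketch.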

gnome terminal for french language

To configure the GNOME Terminal for French language, you need to change the language settings of your system to French. GNOME Terminal uses the system’s language settings to determine which language to display its interface in. Here’s a general guide on how to do it:

  1. Open System Settings: Go to your system settings. This is usually accessible through the system menu or the application launcher.
  2. Find Language or Region Settings: Look for a section in the system settings that deals with language or regional settings. This might be named “Language & Region”, “Language”, “Regional Settings”, or something similar.
  3. Add or Select French Language: In the language settings, you should have the option to add a new language or select from a list of installed languages. Add or select French (Français). If French is not available, you might need to download it.
  4. Apply the Changes: After selecting French, you will likely need to apply the changes. You may be prompted to log out and log back in, or restart your computer for the changes to take effect.
  5. Check GNOME Terminal: Once your system is set to French, open GNOME Terminal. It should now display its menus and messages in French.

Keep in mind that these steps can vary slightly depending on the version of your operating system and its desktop environment. If you’re using a specific distribution of Linux like Ubuntu, Fedora, or Debian, the exact steps might be slightly different.

If you’re still having issues, you might want to share the specific error messages or behavior you’re encountering, as that can provide more insights into the problem.


Re IanMartinAjzenszmidt.

Thank you for your very comprehensive response. I shall stop attempting to write French text handling applications in Fortran or C. The rigmarole is too overwhelming. I used to write in C#, with no problems setting culture - but I am not returning to that language, and, although I do speak French, I do not want a French display, although I did contemplate that option.

ian@ian-Latitude-E7440:~$ gfortran  frtext.f08 -o frtext
ian@ian-Latitude-E7440:~$ ./frtext
 Je m’appelle Jessica. Je suis une fille, je suis française et j’ai treize ans. Je vais à l’école à Nice, mais j’habite à Cagnes-Sur-Mer. J’ai deux frères. Le premier s’appelle Thomas
ian@ian-Latitude-E7440:~$ cat frtext.f08
PROGRAM FrenchTextEditor
    CHARACTER(LEN=200) :: line
    INTEGER :: iost

    OPEN(UNIT=10, FILE='french1.txt', STATUS='OLD', ACTION='READ')

    DO
        READ(10, '(A)', IOSTAT=iost) line
        IF (iost /= 0) EXIT
        PRINT *, line
    END DO

END PROGRAM FrenchTextEditor
ian@ian-Latitude-E7440:~$ cat french1.txt
Je m’appelle Jessica. Je suis une fille, je suis française et j’ai treize ans. Je vais à l’école à Nice, mais j’habite à Cagnes-Sur-Mer. J’ai deux frères. Le premier s’appelle Thomas, il a quatorze ans. Le second s’appelle Yann et il a neuf ans. Mon papa est italien et il est fleuriste. Ma mère est allemande et est avocate. Mes frères et moi parlons français, italien et allemand à la maison. Nous avons une grande maison avec un chien, un poisson et deux chats.

If your compiler is not 15 years behind the times it should be able to run this 3-line program and tell you what compiler it is:

 use,intrinsic:: iso_fortran_env, only: compiler_version
  print *, compiler_version()
  end program

re: Harper

The answer is: GCC version 10.3.0