Using Unicode Characters in Fortran

The problem is that, at least as far as I understand, “character” is a very vague term that Unicode does not define precisely. I highly recommend the Characters section of the UTF-8 Everywhere Manifesto. Some relevant quotes:

(…)

  • User-perceived character — Whatever the end user thinks of as a character. This notion is language dependent. For instance, ‘ch’ is two letters in English and Latin, but considered to be one letter in Czech and Slovak.
  • Grapheme cluster — A sequence of coded characters that ‘should be kept together’.[§2.11] Grapheme clusters approximate the notion of user-perceived characters in a language independent way. They are used for, e.g., cursor movement and selection.
    (…)

‘Character’ may refer to any of the above. The Unicode Standard uses it as a synonym for coded character.[§3.4] When a programming language or a library documentation says ‘character’, it typically means a code unit. When an end user is asked about the number of characters in a string, he will count the user-perceived characters. A programmer might count characters as code units, code points, or grapheme clusters, according to the level of the programmer’s Unicode expertise. For example, this is how Twitter counts characters. In our opinion, a string length function should not necessarily return one for the string ‘🐨’ to be considered Unicode-compliant.

(Emphasis on the last sentence added by me)
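
To make the difference concrete, here is a minimal sketch (mine, not from the manifesto) that counts the same string at several of the levels mentioned in the quote. It is written in Python only because its standard library makes the encodings easy to inspect; the helper name count_units is my own invention. Grapheme clusters are left out because counting them needs a text segmentation library that is not in the standard library.

```python
def count_units(text: str) -> dict:
    """Return the length of `text` at three different levels."""
    return {
        # Python's len() on a str counts Unicode code points.
        "code_points": len(text),
        # One UTF-8 code unit is one byte.
        "utf8_code_units": len(text.encode("utf-8")),
        # One UTF-16 code unit is two bytes; "-le" avoids a byte-order mark.
        "utf16_code_units": len(text.encode("utf-16-le")) // 2,
    }


koala = "\U0001F428"   # the koala emoji, a single code point
e_acute = "e\u0301"    # 'e' followed by a combining acute accent

for label, sample in (("koala", koala), ("e + combining acute", e_acute)):
    print(label, count_units(sample))
```

The koala is one code point, but four UTF-8 code units and two UTF-16 code units. The accented e is two code points even though a reader sees a single character, so none of these counts matches the user-perceived one in general.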