A new document on Unicode has been added to the Fortran Wiki. It links to Fortran modules that address some of the remaining issues with Unicode usage in Fortran, plus dozens of example programs. A final pass is being made to reduce errata, but the section on using the ISO_10646 extension is already extensive, with detailed examples and references to the pertinent sections of the Fortran Standard that explain why some of the information there differs from several other references.
It is a wiki, so feel free to make corrections or clarifications. Note that the M_unicode module referenced is complete, although still being expanded; the M_ucs4 module is functional but needs further documentation and examples.
References
- Fortran Wiki article
- The M_unicode repository for using UTF-8 files from Fortran
- The M_ucs4 repository for use with the optional ISO-10646 Fortran extension
- M_isolatin
See Also
- uni.f90 is a stand-alone single source file that builds a utility program called uni for manipulating Unicode data in UTF-8 files, demonstrating various aspects of M_unicode.
Related …
- M_strings for ASCII string procedures
- M_io for filesystem and I/O related functions
- M_attr for ANSI terminal color and attributes
You can generate an example file using “uni --example”. Among other
things, uni can convert HTML character entities such as
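&Alpha;,&alpha;, &Beta;,&beta;, &Gamma;,&gamma;,
&Delta;,&delta;, &Epsilon;,&epsilon;, &Zeta;,&zeta;,
&Eta;,&eta;, &Theta;,&theta;, &Iota;,&iota;,
&Kappa;,&kappa;, &Lambda;,&lambda;, &Mu;,&mu;,
&Nu;,&nu;, &Xi;,&xi;, &Omicron;,&omicron;,
&Pi;,&pi;, &Rho;,&rho;, &Sigma;,&sigma;,
&Tau;,&tau;, &Upsilon;,&upsilon;, &Phi;,&phi;,
&Chi;,&chi;, &Psi;,&psi;, &Omega;&omega;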
to UTF-8 using
uni --html <greek.html >greek.utf8
which produces
Α,α, Β,β, Γ,γ,
Δ,δ, Ε,ε, Ζ,ζ,
Η,η, Θ,θ, Ι,ι,
Κ,κ, Λ,λ, Μ,μ,
Ν,ν, Ξ,ξ, Ο,ο,
Π,π, Ρ,ρ, Σ,σ,
Τ,τ, Υ,υ, Φ,φ,
Χ,χ, Ψ,ψ, Ωω
Other operations it performs include listing
all the HTML character entities:
uni --entities
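As a cross-check on the entity-to-UTF-8 conversion that does not depend on uni, here is a minimal Python sketch using only the standard library. It is not part of M_unicode or uni, and it makes no claim about how uni is implemented; the sample string and the function name entities_to_utf8 are illustrative assumptions.

```python
import html

# Illustrative sample input (an assumption): a few Greek character entities.
entity_text = "&Alpha;,&alpha;, &Beta;,&beta;, &Gamma;,&gamma;"


def entities_to_utf8(text: str) -> bytes:
    """Replace HTML character references with their characters,
    then encode the result as UTF-8 bytes."""
    return html.unescape(text).encode("utf-8")


decoded = entities_to_utf8(entity_text)

# Decode again only for display; the bytes are what would be written to a file.
print(decoded.decode("utf-8"))
```

Comparing these bytes with the corresponding part of greek.utf8 is one way to confirm that the two tools agree on a given entity.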
The help text for the command describes several other capabilities.
Suggestions on what other UTF-8-related operations would be useful are
welcome.