Tinyfiledialogs in Fortran

tinyfiledialogs on sourceforge.net is a cross-platform C library that provides modal dialogs and file dialogs. I’ve recently completed the interface module to call this code from Fortran.

	! calling function tinyfd_inputbox
	! (assumes aDefaultInput has the TARGET attribute, as c_loc requires, and
	!  that fpointer is a character pointer long enough to hold the reply)
	write(*,'(A)') "Enter tinyfd_inputbox()"
	aTitle = "a Title" // char(0)
	aMessage = "a Message" // char(0)
	aDefaultInput = "an Input" // char(0)
	cpointer = tinyfd_inputBox(aTitle, aMessage, c_loc(aDefaultInput))
	! or for a password box: cpointer = tinyfd_inputBox(aTitle, aMessage, c_null_ptr)
	if ( c_associated(cpointer) ) then
		! Convert C pointer to Fortran pointer
		call c_f_pointer(cpointer, fpointer)
		! Remove NULL character at the end
		string = fpointer(1:index(fpointer,c_null_char)-1)
		write (*,'(A)') string
	end if

I am not sure what to do about the UTF-16 / wchar_t functions, which tinyfiledialogs also offers in C, as Fortran doesn’t seem to be ready to handle this kind of string. What do you think?

edit: tinyfiledialogs handles char strings as UTF-8 (on Windows and on Unix). On Windows, in C, it also offers UTF-16 / wchar_t. I was wondering if these UTF-16 / wchar_t calls may be of any use in Fortran. It seems not, so I should just stick to UTF-8.

Fortran supports “the character set UCS-4 as specified in ISO/IEC 10646”, which is UTF-32.

Are there any Fortran functions to convert between UTF-32 and UTF-16? Or is it something I should do in C?

While the Fortran language provides a way to specify “wide characters”, it does not require that a compiler support them, and some do not. The typical approach is to store these in arrays of integers that are the appropriate width, but you’ll need library routines to handle them.
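
As a rough illustration of the integer-array approach, here is a minimal sketch that encodes Unicode code points (UTF-32 values) as UTF-16 code units without relying on any non-default character kind. The module name utf16_sketch and the routines codepoints_to_utf16 and to_unit are made up for this example; they are not part of tinyfiledialogs or of any standard library. It also assumes that replacing invalid input with U+FFFD is acceptable.

```fortran
module utf16_sketch
    use, intrinsic :: iso_fortran_env, only: int16, int32
    implicit none
    private
    public :: codepoints_to_utf16
contains

    ! Encode Unicode code points as UTF-16 code units.
    ! Each unit is returned as a 16-bit bit pattern, so units of 0x8000
    ! and above show up as negative integer(int16) values.
    ! Surrogates and values above 0x10FFFF are replaced by U+FFFD.
    function codepoints_to_utf16(cp) result(units)
        integer(int32), intent(in) :: cp(:)
        integer(int16), allocatable :: units(:)
        integer(int32) :: c, v
        integer :: i, n

        allocate(units(2*size(cp)))   ! worst case: every code point needs a pair
        n = 0
        do i = 1, size(cp)
            c = cp(i)
            if (c < 0 .or. c > 1114111 .or. (c >= 55296 .and. c <= 57343)) c = 65533
            if (c < 65536) then
                ! Basic Multilingual Plane: one code unit
                n = n + 1
                units(n) = to_unit(c)
            else
                ! Supplementary planes: high surrogate (0xD800 + top 10 bits)
                ! followed by low surrogate (0xDC00 + bottom 10 bits)
                v = c - 65536
                units(n+1) = to_unit(55296_int32 + ishft(v, -10))
                units(n+2) = to_unit(56320_int32 + iand(v, 1023_int32))
                n = n + 2
            end if
        end do
        units = units(1:n)
    end function codepoints_to_utf16

    ! Map a value in 0..65535 to the matching 16-bit two's complement pattern.
    elemental function to_unit(u) result(s)
        integer(int32), intent(in) :: u
        integer(int16) :: s
        if (u > 32767) then
            s = int(u - 65536, int16)
        else
            s = int(u, int16)
        end if
    end function to_unit

end module utf16_sketch
```

To hand such an array to a C routine expecting a wchar_t string on Windows, one would presumably append a zero unit as terminator and declare the dummy argument as an integer(c_int16_t) array in the interface; that part depends on how the interface module is written and is not shown here.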

NAG supports 4 character kind types (1, 2, 3 and 4 byte), and gfortran supports 2 character kind types. Intel (ifort and ifx), Cray, Nvidia Fortran, and Silverfrost Fortran support only 1 character kind.

Thanks for clarifying that.
I have a question. I have looked for “10646” or “UCS-4” in the draft, and those strings appear in various places, but I did not understand that it was not required. How should we read the standard to distinguish what is “required” from what “may be provided”?

It is not stated explicitly which kinds must be supported and which merely may be. The relevant text of the standard is:

Character sets
1 The character type has a set of values composed of character strings. A character string is a sequence of characters, numbered from left to right 1, 2, 3, … up to the number of characters in the string. The number of characters in the string is called the length of the string. The length is a type parameter; its kind is processor dependent and its value is greater than or equal to zero.
2 The processor shall provide one or more representation methods that define sets of values for data of type character. Each such method is characterized by a value for the (default integer) kind type parameter KIND. The kind type parameter of a representation method is returned by the intrinsic function KIND (16.9.108). The intrinsic function SELECTED_CHAR_KIND (16.9.168) returns a kind value based on the name of a character type. Any character of a particular representation method representable in the processor may occur in a character string of that representation method.
3 The character set specified in ISO/IEC 646:1991 (International Reference Version) is referred to as the ASCII character set and its corresponding representation method is ASCII character kind. The character set UCS-4 as specified in ISO/IEC 10646 is referred to as the ISO 10646 character set and its corresponding representation method is the ISO 10646 character kind.

So the Standard only says that “one or more” character kinds must be supported. The above is accompanied by constants in the iso_fortran_env module:

  • character_kinds - array of kind values
  • character_storage_size - size of the default character value in bits

The selected_char_kind(name) function accepts at least 3 values of its argument: “DEFAULT”, “ASCII”, and “ISO_10646”. The last one corresponds to UCS-4 (a.k.a. UTF-32). You can check whether the returned value is non-negative (the kind exists; -1 means it is not supported) and which kind is DEFAULT. I’d bet ASCII is DEFAULT in 100% of implementations but, unless I am missing something, that is not guaranteed.
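
A quick way to see what a given compiler provides is to print these values. The sketch below does only that; the module name char_kinds_sketch and the routines kind_supported and report_char_kinds are hypothetical names chosen for this example.

```fortran
module char_kinds_sketch
    use, intrinsic :: iso_fortran_env, only: character_kinds, character_storage_size
    implicit none
contains

    ! True when the processor provides a character kind for the named set
    ! (selected_char_kind returns -1 otherwise).
    logical function kind_supported(name)
        character(len=*), intent(in) :: name
        kind_supported = selected_char_kind(name) >= 0
    end function kind_supported

    ! Print the character kinds this processor offers.
    subroutine report_char_kinds()
        print '(a,*(1x,i0))', 'character_kinds:', character_kinds
        print '(a,1x,i0)', 'character_storage_size (bits):', character_storage_size
        print '(a,1x,i0)', 'DEFAULT   ->', selected_char_kind('DEFAULT')
        print '(a,1x,i0)', 'ASCII     ->', selected_char_kind('ASCII')
        print '(a,1x,i0)', 'ISO_10646 ->', selected_char_kind('ISO_10646')
        if (selected_char_kind('ASCII') == selected_char_kind('DEFAULT')) then
            print '(a)', 'ASCII and DEFAULT are the same kind here'
        end if
    end subroutine report_char_kinds

end module char_kinds_sketch
```

Calling report_char_kinds from a small program on each compiler of interest shows whether ISO_10646 comes back as -1 and whether ASCII and DEFAULT share a kind value.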

Yes, the key sentence is:

ISO/IEC 646:1991 (a.k.a. “ASCII” per the standard) defines the old 7-bit ASCII set of 128 characters, whereas CHAR() and ICHAR() have traditionally handled at least 256, so I was going to bet the default was ‘DEFAULT’, not ‘ASCII’. But every compiler I have checked so far returns the same kind for ASCII and DEFAULT, and without any switches the IACHAR and ACHAR functions all handled at least 256 characters, even though technically they only have to handle 128. In retrospect, since both use 1 byte per character, I suppose there is no value in differentiating them unless you are on an EBCDIC or DISPLAY CODE platform. So you can say everyone probably defaults to ASCII, or equivalently that everyone defaults to DEFAULT. It would be rather odd if your default were not ‘DEFAULT’, but nothing says ASCII and DEFAULT cannot be the same thing, so with everything I have checked so far we would both have been right. I wonder whether NAG has a different kind value for ASCII and DEFAULT, even on a Linux platform?

@urbanjost reminded us about ‘the old ASCII-7 set of 128 characters; and traditionally CHAR() and ICHAR() would do at least 256’. The tradition presumably arose when IBM used EBCDIC, which has 256 characters, instead of ASCII.

I think there is no common, “named” 256-character set that would not be “specific”, as the codes 128-255 are assigned differently by the many ASCII extensions (e.g. ISO-8859-N). So that is probably the reason for DEFAULT being ASCII.

This is something of a problem, as the Standard leaves the interpretation of codes/characters above ASCII 127 to the implementation:

16.9.88 IACHAR (C [, KIND])
5 Result Value […] The value of the result is processor dependent if C is not in the ASCII collating sequence.

There is a similar statement for ACHAR. If DEFAULT is ASCII, these rules also apply to ICHAR/CHAR, so strictly speaking even the proper interpretation of UTF-8 strings is not guaranteed.
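
One possible way around this, sketched below, is to read the bytes of a UTF-8 string through their storage with TRANSFER instead of going through ICHAR/IACHAR. This assumes that a default character occupies exactly one 8-bit byte (character_storage_size is 8), which holds on common compilers but is not guaranteed either. The module name utf8_bytes_sketch and the functions byte_value and utf8_length are invented for this example.

```fortran
module utf8_bytes_sketch
    use, intrinsic :: iso_fortran_env, only: int8, int32
    implicit none
contains

    ! Byte value 0..255 of a one-character string, taken from its storage
    ! rather than from ICHAR/IACHAR.
    ! Assumes one default character is stored in one 8-bit byte.
    elemental function byte_value(ch) result(b)
        character(len=1), intent(in) :: ch
        integer(int32) :: b
        b = iand(int(transfer(ch, 0_int8), int32), 255_int32)
    end function byte_value

    ! Number of code points in a UTF-8 string: count every byte that is
    ! not a continuation byte (continuation bytes look like 10xxxxxx).
    pure function utf8_length(s) result(n)
        character(len=*), intent(in) :: s
        integer :: n
        integer :: i
        n = 0
        do i = 1, len(s)
            if (iand(byte_value(s(i:i)), 192_int32) /= 128_int32) n = n + 1
        end do
    end function utf8_length

end module utf8_bytes_sketch
```

For a string returned by a C library as UTF-8, utf8_length gives the number of characters a user would see, while len still gives the number of bytes.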