Any ideas why a write(*,*)' ' is required for a variable to have a non-NaN value?

I have a Fortran routine which is passed three values. The first and second are integer*8 variables. The last is an integer.

For the second variable to arrive with the value that was passed in, I have to write a blank character to the screen. If I do a write(*,*)' ', the second variable has the value passed in. Without the write statement, the second variable has the value NaN (not a number).

What would you do to try and fix this? Have you seen this in your code? Does this error have a name?

I’ve tried replacing the write statement with a do-nothing “do” loop, but the compiler probably optimizes that out. Without the write, the variable is NaN.

I’m running out of tricks to try. … I tried recompiling everything, using a local variable to hold the value, …, and shutting down the computer without success.

FYI: The statement just before this write statement is a C routine call which does a malloc. The statement after the write statement is a C routine call that uses that malloc storage and assigns a value to it.

The code works on 32bit computers. This problem is on a 64bit computer.

This problem might go away on its own by tomorrow!

Please don’t flame me.

As described, all that can be told is that the code has multiple errors (for starters, an INTEGER cannot have a value of NaN if it is typed and passed correctly; only floating-point values can be NaN). You must post reproducer code for the question to be intelligible enough to be answered.

If you are not familiar with the common language restrictions, let the compiler tell you about as many of them as possible. Use IMPLICIT NONE at the top of the procedure, place your procedure in a module so that it has an explicit interface, and use all the compiler debug switches you can. The developers spent a lot of time adding those switches because they are very valuable (they are not required by the standard). A short reproducer is invaluable when you post, and showing some of the output and describing your programming environment (including which compilers you are using and have available) helps make the question meaningful. With an explicit interface and IMPLICIT NONE, you cannot make some of the errors you are likely making when passing and declaring values (or perhaps when using an incorrect format).

You need to post some code for people to look at. Your description doesn’t make any sense, and there are so many possibilities that what you are describing might not even be related to the problem (which is often the case when debugging code).

Well, as others have said, you do not provide the right information for us to understand the situation. Can you post the code - that is, a program that exhibits this behaviour? Merely a description is likely to blur the actual problem.

Hi Steve,
yes, I have seen that behaviour. It was a last attempt from my side to help. I think I will ignore the posts by him/her from now on. They do not want to be helped, they want things to be magically solved for them, it would seem. It is not the first peculiar person I have seen in newsgroups and other fora.

Regards,

Arjen

It is a Heisenbug. They are sometimes hard to find.

You are not solving the underlying problem this way or by adding print output.

We have asked you several times to show some actual code that reproduces the problem you are seeing. It does not need to be the actual code you are working on, but it is simply impossible to help you this way. People are getting tired of this attitude, I am sorry to say. And personally, I am running out of patience as well. You are likely to get ignored in the near future.

I got a private, in itself polite, reply from OP. I have decided not to respond to any of OP’s posts, if there is no change in attitude.

Basically: OP is not willing to spend any time on producing a small program that exhibits the problem they are seeing. Instead, they expect us to divine the reason for the odd behaviour that is apparently occurring in a program of several million lines. And even though we indicate time and again that there is a severe problem, OP seems to think that the problem is solved if the symptom goes away by adding a write statement.

I am sorry, but this was the last straw.

Yes, many times… I also think the error is a “Heisenbug” (as mentioned by Ron above); see the Wikipedia article on Heisenbug.

I am afraid there is some memory corruption in the code, e.g. writing data beyond the legitimate bounds of variables or arrays (i.e. in an illegal way), which goes undetected by the compiler (e.g. because proper check options are missing, or because of “unsafe + incorrect” operations via foreign language calls). I am afraid this kind of bug is nearly impossible for other people to track down unless they examine the whole code carefully… If the heisenbug is caused purely by Fortran routines, it might help to add check options like (for gfortran)

-fcheck=all -g -fbacktrace -finit-real=snan -finit-integer=-999 -finit-derived -ffpe-trap=invalid,zero,overflow

but this is also useless if the error comes from, e.g., incorrect passing of data via procedure calls with implicit interface (or even foreign function calls).

So… I guess there is no “magical” way to find the reason for this kind of bug…

I simply changed a problematic C float printf specifier (on malloced storage) to an integer specifier and the program now works as it should, problem solved.

I can see that that would be a mistake in the C code, but would that corrupt data or instructions in memory? If not, then it is likely that your Heisenbug is still there.