
Unbelievable error


GerritGroot

Technical User
Nov 3, 2006
Hi all,

I am using Microsoft Developer Studio and get the following error message in release mode:

run-time error M6110: MATH
- floating-point error: stack overflow

However, if I run in debug mode the error doesn't occur.

So, I placed something like a

WRITE(*,*) iloop,jloop

to see where in my DO loop things go wrong.

What I don't understand is that the index where the code breaks down DEPENDS on the number of WRITE(*,*)'s that I use to debug.

THAT'S IMPOSSIBLE, ISN'T IT???

Any help on what could be the cause is very welcome.

Gerrit

P.S. I tried to compile it with g77, but it gives an error when I link the object files.
 
Calling a function or subroutine with the wrong type or number of arguments (without consistent declarations), overwriting stack (automatic) storage (e.g. exceeding local array bounds), subroutine/COMMON block name clashes, for example...
Of course, it depends on the vintage of your Fortran code: legacy Fortran (FTN66/77 style) is not robust.
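
For instance, something like this (a made-up sketch with invented names, not taken from your program) compiles cleanly file by file but can corrupt the stack at run time:

      PROGRAM MAIN
C     The caller passes one argument...
      REAL X
      X = 4.0
      CALL CALC(X)
      END

      SUBROUTINE CALC(A, N)
C     ...but the routine (compiled in a separate source file) expects
C     two. No F77 compiler error is produced; N is garbage at run
C     time, and the loop below can overrun the local array WORK,
C     overwriting neighbouring stack storage.
      REAL A
      INTEGER N
      REAL WORK(10)
      INTEGER I
      DO 10 I = 1, N
         WORK(I) = A * I
   10 CONTINUE
      RETURN
      END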
 
Eehm, Thanks!

I suppose you mean that the number of arguments given to a routine somewhere is not the same as specified in the routine itself, but that's impossible; the compiler would give an error message.

Or do you mean that I use a common block name that is the same as a subroutine name?


 
Hi GerritGroot

I found this explanation of error M6110 in the Microsoft Fortran reference manual: "A floating-point expression caused a stack overflow on the 8087/287/387 coprocessor or the emulator. Stack-overflow exceptions are trapped up to a limit of seven levels in addition to the eight levels normally supported by the 8087/287/387 coprocessor".

I think I have encountered problems like this in programs once or twice, a long time ago. If I added a "dummy" statement (like "write(*,*)"), the program ran OK; otherwise it did not. If I remember correctly, I had to change the optimization switch in the linker to get rid of the error. Worth trying...

Best wishes
GulliPe
 
The optimisation switch in the linker?

Well, I didn't know it existed; I'll see if I can find one and let you know.

The company obliged me to continue with crappy code, pfff, that's why. In the meantime I have already tried tools like FTNCHEK until there were no messages left at all, but without result.

Let's see if I can find some linker options.

More experiences with this are welcome.

Thanks,

Gerrit
 
I modified all kinds of linker options, but didn't find any combination that worked.

In the end I found a linker option for the stack size. Using Microsoft Developer Studio this can be set after building the executable with EDITBIN.EXE /STACK:somenumber MYEXE.EXE

This increases the stack size that the executable has at its disposal. EDITBIN really does do something, because the date and time of the executable are updated, and for very large stack sizes the exe won't run because of a lack of memory. So EDITBIN.EXE works.

However the error

run-time error M6110: MATH
- floating-point error: stack overflow

keeps bringing my program down, even if I use stack sizes just below the upper limit in EDITBIN.EXE.

Any help is very welcome!

Thanks,

Gerrit
 
The compiler only signals inter-module errors if it knows all the routines' interfaces. But you may be compiling old-fashioned code split across separate source files without interface declarations.

Try to localize the source of this error. It's impossible to treat the M6110 problem as a philosophical one (math stack overflow: to be or not to be;). Start the release build under the debugger, add debug output statements working back from the end of the program, change optimization levels, etc.

It's not a unique situation for an error to reveal itself only in release mode.
 
Hi again

I checked this and it was NOT a linker switch that I used, but a compiler switch to PREVENT optimization:

In Microsoft Fortran it looked like this:
fl /c /Od xxx.for

Best wishes
GulliPe
 
Thanks Gullipe,

The code runs without any problem if I take away the optimization flag of the compiler. HOWEVER, the numerical results are COMPLETELY DIFFERENT from the ones in "to be or not to be mode", as ArkM calls it. Unfortunately I know that the previous results were correct and the new ones are wrong.

So when I compile the code with optimization /Ox the numerical outcome is correct, but I am unable to treat all situations occurring in the physics.

Next, I included an indispensable IF THEN for some exceptions that occur in certain situations of the physics behind it, and the code gives a stack overflow, caused by the included IF THEN.

Next, I did what Gullipe said and compiled without /Ox optimization. This makes the code run without stack overflow, despite the indispensable IF THEN now being present, but with wrong numerical results.

My God! Is there a way to get around this?


 
Just changed all REALs to DOUBLE PRECISION and now it runs OK.

So it runs in DOUBLE PRECISION, without the optimization /Ox activated and with the necessary IF THEN included.

Strange that optimization during compilation can have such an influence, isn't it?

I have been thinking about normalising all numbers in the code.
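
For reference, the change amounts to something like this (a sketch with invented variable names, not the actual code); note that literal constants also need the D exponent, or they stay single precision:

      PROGRAM PREC
C     Before: REAL Q, Q0, DQ   (about 6-7 significant digits)
C     After: double precision  (about 15-16 significant digits)
      DOUBLE PRECISION Q, Q0, DQ
      Q  = 1.23456789012345D0
      Q0 = 1.23456780000000D0
C     0.1D0, not 0.1: a single-precision 0.1 would carry only about
C     7 correct digits into the double-precision expression
      DQ = (Q - Q0) * 0.1D0
      WRITE(*,*) DQ
      END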
 
NEVER use single precision (32-bit) arithmetic in serious (real-world) calculations. The only exception: load tons of (experimental) 32-bit data, but do ALL calculations in 64-bit arithmetic (promote before the first expression is evaluated).
It is no guarantee against wrong results, but REAL*4 is a 99% (or more;) guarantee of trouble.
And finally, some philosophy: dramatic changes in the behaviour of mathematical code under different base real precisions may indicate problems with the stability of the numerical method(s).
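
A minimal sketch of that "only exception" (invented names, not anyone's actual code): keep the raw data in REAL*4, but promote it once, before any arithmetic is done, and work in REAL*8 from then on.

      PROGRAM PROMO
C     32-bit raw data (e.g. as read from an experiment file)
      REAL*4 RAW(5)
C     64-bit variables for all actual arithmetic
      REAL*8 X(5), TOTAL
      INTEGER I
      DATA RAW / 1.1E0, 2.2E0, 3.3E0, 4.4E0, 5.5E0 /
C     Promote once, before the first expression is evaluated
      DO 20 I = 1, 5
         X(I) = DBLE(RAW(I))
   20 CONTINUE
      TOTAL = 0.0D0
      DO 30 I = 1, 5
         TOTAL = TOTAL + X(I)
   30 CONTINUE
      WRITE(*,*) TOTAL
      END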
 
Hi ArkM,

Well you say that, but in aerodynamics it's usual to normalise all numbers before starting any calculation. Doing so you can prevent a lot of errors without using DOUBLE PRECISION.

For example, I once had to compute the heat flow q at the nose of a re-entry vehicle. Computing q directly gave oscillations and numerical errors, but computing q-q0 instead gave a proper result (obviously there were some numerical derivatives involved).
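
A toy illustration of that kind of trick (invented numbers, not the re-entry code): differencing two values stored at their full magnitude loses digits in single precision, while differencing their deviations from a reference q0 keeps them.

      PROGRAM QDIFF
      REAL Q0, Q1, Q2, D1, D2
      Q0 = 10000.0
C     Two nearby values stored at full magnitude
      Q1 = Q0 + 0.001
      Q2 = Q0 + 0.002
C     Derivative-like difference of the full values: digits are lost
      WRITE(*,*) 'full values: ', Q2 - Q1
C     The same difference computed from the deviations q - q0
      D1 = 0.001
      D2 = 0.002
      WRITE(*,*) 'deviations:  ', D2 - D1
      END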

This code, however, was not written by the same person from the start.

Anyway, it's solved :)
 
Hi GerritGroot,

I have seen (many years ago;) hundreds of thousands of lines of this sort of code, so my notes are not speculative fabrications (I hope;).

Floating-point data are normalised (scaled) by definition; the scaling is done by the hardware. The q-q0 computation trick from your post is part of a competent computational method; it's a mathematical (not programming/implementation) aspect. Let's remember: 50-60 years ago, very complex computations were done with fixed-point arithmetic (with scaling by hand).

Nonetheless, 32-bit (6-7 decimal digit) IEEE arithmetic is too short and too error-prone for high-end iterative computations (for example, 1000-10000 hours on superclusters). As usual, it's impossible to provide for every eventuality of precision loss (underflow, for example). So 64-bit (15-16 digit) floats push back the red-alert line beyond which we get plausible but incorrect results.

Of course, it's not a dogma (it's good advice;).

Thank you for the very interesting comments.
Good luck!
 