FORTRAN versus C number-crunching speed

Status
Not open for further replies.

WebDrake

Programmer
Sep 29, 2005
106
PL
Hello all,

I'm working on a number of scientific number-crunching problems where it's quite important that I squeeze a high degree of performance out of my code. I originally learnt to program in C and in general this has been sufficient for my needs---but I haven't needed to really, really stretch performance capabilities until recently.

Anyway, I'm interested in FORTRAN both because it's an interesting language in its own right (and used by quite a few scientific colleagues, so it would be useful to know exactly what their programs are doing), and because, at least historically, its numerical and mathematical aspects were superior to C. However, I've been told by some colleagues that the speed gap is pretty much non-existent these days.

So, the questions I'd like to ask are: first, what is the current situation regarding number-crunching differences between FORTRAN and C? I'm not making extensive use of mathematical functions (sin, cos, etc.) at the present; my code consists of having a few large arrays and needing to do fairly simple arithmetic between individual locations at a time. (So, no vector multiplication, for example; the typical thing is, "Check value of z, if it's greater than 4 set it to zero and add 1 to each of z[j], z[k], z[l] and z[m]".)

Second, what are the major issues I should be aware of when switching from C to FORTRAN? Superficially there appears to be much that is similar but I'd rather not rely on that assumption too heavily.... ;-)

Many thanks,

-- Joe
 
Not sure about the first question: it is probably compiler and architecture dependent.

Major issues:
1) array names - in Fortran, it is the whole array. In C it is a pointer to the array
2) formats - in Fortran, they repeat when they run out. In C, when they run out, it stops
3) no unsigned arithmetic in Fortran
4) F95 - does not allow arrays of pointers or even pointers to pointers. Pointers are very primitive in Fortran.
5) No namelists in C. Don't use this feature a lot but it is really useful
6) multi dimensional array construction is different
7) the concept of character*4 does not exist in C.
8) no macros in Fortran
9) no variable parameters in Fortran
10) no optional/position independent parameters in C.
 
Thanks for that list.

With respect to,
4) F95 - does not allow arrays of pointers or even pointers to pointers. Pointers are very primitive in Fortran.
... does that mean that Fortran handles passing of variables to functions differently? C takes a copy of a variable, right? Which means that unless one has to use data destructively within a function, it speeds things up to pass it a pointer. Does this work differently in Fortran?
 
Sorry - explanation is a bit long winded.

C calls parameters by value. This means that if the value is modified in the code, the modified value does not get passed back to the caller. To pass the value back to the caller, the caller has to provide a pointer to the variable. Then using indirection, the value is returned.

Fortran passes parameters by location reference, similar to passing a pointer in C without the indirection operator. Having said that, F90 and F95 have the INTENT keyword, which works similarly to const in C: it stops you from writing to an INTENT(IN) parameter, so you can only write to INTENT(OUT) or INTENT(INOUT) parameters.

If you have

fred (x)

and x is a structure, in C, the whole structure will be dumped on the stack. If you modify the structure, the original doesn't get the modified values. In Fortran, a reference to the structure will be dumped on the stack. If you modify the values, the original gets the results: similar to

fred(&x)

in C.

Coral 66 (remember that language? forerunner of Ada, used a lot by the UK MOD (= US DoD)) also uses location reference. It is half way between C and Fortran: you actually specify LOCATION or VALUE in the parameter.
 
I'm a Brit so translating the MOD for me is not required... ;-)

Thanks very much for all the info---a long winded explanation is good if it's needed for clarity!

Coral 66 is a bit before my time, though I'm vaguely aware of its existence. Ada I knew of less through computer science than through being a mathematician and reading about Ada Lovelace herself...

Thanks again!

-- Joe
 
Strictly speaking, the Fortran calling convention is not pure call by reference. Changes may take effect only after the RETURN statement (this convention is sometimes called by reference/by value).
It's very uncomfortable to deal with 2D/3D arrays in C, but as usual (if there are no vector ops) you can write more efficient (and more cumbersome ;) code for that case in C.
 
When I did my Computer Science course, the parameter passing conventions were

1) by name - only Algol 60 did this
2) by value (C, C++, Coral 66)
3) by location (Coral 66, C if you pass a pointer)
4) by reference (C++, Algol 68)
5) by location reference (Fortran)

Like ArkM, I'm uncomfortable with multi-dimensional arrays in C. The typical problem I come across when switching between languages is the notation. x[2,3] is legal in C - it means x[3]. Sometimes I can stare at it for ages trying to figure out why the compiler is moaning about a simple assignment.

The other strangeness in Fortran is that there are no reserved words. You can have an array called if and it would not be flagged as a syntax error.


 
That's a nice point about Fortran parameters. For example, if you pass a variable from a common block, change it via the parameter name, and then access it via the common block (before returning), you may get the old value in some (standard!) implementations. This approach was used in old IBM compilers: simple parameters were copied into local storage and copied back at return points (faster code).

Fortran is not the one and only language without reserved words (remember PL/I, for example). Moreover, in Fortran you may insert as much whitespace as you like (but why?)...
 
One of the joys of FORTRAN parameter passing is this:

call FRED( 3 )

SUBROUTINE FRED ( I )

I = I+ 1

RETURN
END


And lo! when it returns 3 == 4.

Which is a bit of a bummer really.
 
Thousands have had a breakdown investigating that case on good old IBM mainframes. I hope more advanced Fortran implementations place constant arguments in ROM now...
The most pragmatic approach I have ever seen is in PL/I:
Code:
CALL FRED(3);  /* OK, pass initialized temporary */
CALL FRED(X);  /* by reference, modify it as you wish */
CALL FRED((X));/* pass temp var, X var unchanged */
 
It is the same in VB and VBScript - putting things in parentheses passes them by value.
 
How is the PL/I example you give different from C, fundamentally, except for notation?
 
Code:
#include <stdio.h>

/* x is a local copy, so the assignment is invisible to the caller */
void NoChange(int x)
{
   x = 20;
}

int main(void)
{
   int y = 25;
   NoChange(25);
   y = y + y;
   printf("%d\n", y);
   return 0;
}
In C, you will get 25. In the older versions of the Fortran compiler, you may have gotten 40! Putting it in parenthesis would force a call by value.
 
What I was meaning was that of course there is all the notation available in C for you to choose between sending value, reference or whatever---it just depends on whether you send, say, f(x), f(&x), f(*x), ... but of course the thing that is being sent can only be one type, specified in the function itself.

So I guess what you're saying is that FORTRAN and some other languages allow you to determine whether a function is passed location or value at the point where the function is called, rather than having to specify it one way or the other in the function...?

BTW surely you will get 50 out of the C code since you have set y = y+ y; ?? ;-)

The FORTRAN result seems bizarre since you do not touch y at all. So the number 25 is actually being set to have a value of 20? Very odd and incomprehensible. How could this be allowed? What possible justification could it have?

 
It is not allowed but that is what used to happen on the IBM/ICL mainframes. Basically 20=25 so y+y = 50 not 40. The older compilers used to store only one copy of the constant. If that got corrupted, the whole program went berserk. Nowadays, many assemblers allow inline numbers so that is not so likely to happen.

Calling by value takes a local copy so you can do what you like - it won't change the original value. Putting things in brackets also causes a call by value in some variants of Fortran but check with your compiler: it may not work with all versions.
 
The structural differences are mostly irrelevant. Any real code speed differences can be mitigated through in-line assembler. What did make a difference was the math libraries. Neither language has native math functions. The math functions are usually linked in through a math library that is hand coded and optimized.

Fortran was the preeminent language up to the 70's and 80's because the math routines had been tweaked and optimized for about 20 years. There were a couple of libraries, including IMSL and another whose name escapes me, that were generally spoken about in the same breath as Fortran itself.

Today, math libraries are still the key to higher performance, but the market is more fractured, since there are niche application libraries for specific algorithms such as FFT. For most users, unless they have a sugar-daddy to pay for the development, they must simply get the best library they can and the fastest hardware they can.
 