Fortran LAPACK slower than Matlab, why?

Status
Not open for further replies.

DanielCro

Programmer
May 22, 2009
3
HR
Greetings!

When I compile my program with the LAPACK routines from Intel MKL and the Intel compiler, it runs a bit faster than Matlab.
However, compiling directly from source (all the necessary LAPACK routines are copied into the project folder) with gfortran -O3 *.f90 or
ifort -O3 *.f90
gives me a program that is about 6 times slower!
Why?
Is there a better way of compiling? Any additional flags?
I tried the following flags, with only a small speed-up:
-march=core2, -ftree-vectorize, -ffast-math ...

Please ask for additional info if needed!
Thank you!
 
gfortran is based on gcc. It is a generic compiler, not particularly optimized for any specific processor.

You might see a difference if your machine has an AMD processor, but not if it has an Intel one. AMD developers contribute to gcc, so their speed optimizations are passed on.

Maybe this time next year, when the new compiler is out, you will see a magical difference in speed, now that Intel has a few engineers supporting gcc.
 
Hi and thanks for your help!

What I really wanted to know is why the program is many times slower when I compile the .f LAPACK routines myself, compared with using Intel MKL or Matlab itself.
In both cases the compiling is done with the Intel Fortran Compiler.
Are the Intel MKL libraries tuned at some lower programming level to get this performance, or what?
 
Intel MKL is highly optimized for performance. It makes full use of the Intel instruction set.

LAPACK is just a collection of linear algebra algorithms. They are optimized as far as Fortran is concerned, but that is as far as it goes. Their speed depends entirely on the code generation of the compiler, and you can only go so far with code generation.
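One way to see this in practice is to time the same LAPACK call against different builds of the library. The following is only a rough sketch: the program name, the problem size n = 2000, and the random test matrix are assumptions made for illustration and do not come from the original poster's program. DGESV is the standard LAPACK driver for solving a general linear system.

```fortran
program time_dgesv
  use iso_fortran_env, only: int64
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 2000          ! hypothetical problem size
  real(dp), allocatable :: a(:,:), b(:)
  integer, allocatable :: ipiv(:)
  integer :: info, i
  integer(int64) :: t0, t1, rate
  external :: dgesv

  allocate(a(n,n), b(n), ipiv(n))
  call random_number(a)
  call random_number(b)
  do i = 1, n
     a(i,i) = a(i,i) + real(n, dp)        ! make the matrix diagonally dominant
  end do

  call system_clock(t0, rate)
  call dgesv(n, 1, a, n, ipiv, b, n, info) ! solve A*x = b; x overwrites b
  call system_clock(t1)

  print *, 'info    =', info               ! 0 means the solve succeeded
  print *, 'seconds =', real(t1 - t0, dp) / real(rate, dp)
end program time_dgesv
```

The source is identical in both cases; only the library it is linked against changes (for example the reference LAPACK sources compiled locally, or a vendor library such as MKL), so any difference in the reported time comes from the library build rather than from the calling program.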

For instance, to do a byte swap in assembler, I'd just use the bswap instruction if I wasn't thinking. If I thought about it, I'd use a rotate instruction, which is a lot faster.

In Fortran, I'd do something like
Code:
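character*2 x      ! two-character buffer holding the bytes to swap
character*1 temp   ! scratch byte
temp = x(1:1)      ! save the first byte
x(1:1) = x(2:2)    ! copy the second byte into the first position
x(2:2) = temp      ! put the saved byte into the second position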
I've got no idea how many clock cycles that is but it is the best I can do in Fortran. It is probably about 10 instructions compared to a single one in assembler.
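For comparison, Fortran 90 and later also provide bit intrinsics that work on integers instead of characters. Here is a minimal sketch, assuming the two bytes are held in a 16-bit integer (the program and variable names are made up for illustration). ISHFTC performs a circular shift, so rotating a 16-bit value by 8 bits exchanges its two bytes. Whether the compiler turns this into a single rotate instruction depends on the compiler and its flags.

```fortran
program swap16
  use iso_fortran_env, only: int16
  implicit none
  integer(int16) :: v, s

  v = int(z'1234', int16)   ! example value: bytes 12 and 34 (hex)
  s = ishftc(v, 8)          ! circular shift of all 16 bits by 8 positions

  print '(a, z4.4)', 'before: ', v
  print '(a, z4.4)', 'after:  ', s   ! the two bytes appear exchanged
end program swap16
```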
 
Thanks, xwb, for your help!
I will stick to Intel MKL... too bad it is not available for free on Windows...

Best regards!
 
Hi.

If I understood you correctly, you are asking why ifort with Intel MKL is much faster than gfortran with the F90 LAPACK source code.

Because that is exactly what Intel set out to achieve when developing MKL!

There are more reasons than I can list. I would like to point out that the accuracy of the two cases you describe is not the same. You can try either enabling gcc's fast math or applying stricter IEEE arithmetic when compiling with the Intel compiler.
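To illustrate why the accuracy can differ, here is a small sketch (the values are chosen for illustration only). Floating-point addition is not associative, and fast-math style options allow the compiler to reorder operations like these, so two builds of the same source may not give bit-identical results.

```fortran
program reassoc
  use iso_fortran_env, only: real64
  implicit none
  real(real64) :: a, b, c

  a =  1.0e16_real64
  b = -1.0e16_real64
  c =  1.0_real64

  ! Same three numbers, two groupings. Under IEEE double arithmetic,
  ! b + c rounds back to -1.0e16, so the second sum loses the 1.0.
  print *, '(a + b) + c =', (a + b) + c
  print *, 'a + (b + c) =', a + (b + c)
end program reassoc
```

With strict IEEE double-precision evaluation the two lines are expected to differ; with relaxed floating-point options the compiler is free to pick either grouping.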

Another tricky issue that comes to mind is whether you compiled against the multithreaded version of LAPACK/Intel MKL...

There are a lot of things to address and take into consideration.
 