Fortran Lapack slower than Matlab, Why?

DanielCro · May 22, 2009

Greetings!

When I compile my program with the Lapack routines from Intel MKL and with the Intel compiler, it runs a bit faster than Matlab.
However, compiling directly from sources (all the necessary Lapack routines are copied to the project folder) with
gfortran -O3 *.f90
or
ifort -O3 *.f90
gives me a program that is about 6 times slower!
Why?
Are there any better ways of compiling? Any additional flags?
I tried the following flags, with only a small speed-up:
-march=core2, -ftree-vectorize, -ffast-math ...

Please ask for additional info if needed!
Thank You!

xwb · Jun 7, 2009

gfortran is based on gcc. It is a generic compiler not particularly optimized.

You might see a difference if your core machine is an AMD but not if it is an Intel. There are AMD developers helping with gcc so any speed optimizations are passed on.

Maybe this time next year when the new compiler is out, you might see a magical difference in speed, now that Intel have a few engineers supporting gcc.

DanielCro · Jun 8, 2009

Hi and thanks for your help!

What I really wanted to know is why my program is many times slower when I compile the .f Lapack routines myself than when I use Intel MKL or Matlab itself.
In both cases compiling is done with Intel Fortran Compiler.
Are Intel MKL libraries tweaked at some lower programming level to highly optimize performance or what?

xwb · Jun 8, 2009

Intel MKL is highly optimized for high performance. It makes full use of the Intel instruction set.

LAPACK is just a bunch of linear algebra algorithms. They are optimized as far as Fortran is concerned but that is as far as it goes. It is entirely dependent upon the code generation of the compiler. You can only go so far in code generation.
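
To make the point about hand tuning concrete, here is a minimal sketch of one kind of restructuring a tuned library can apply and a plain source build does not: cache blocking of a matrix multiply. It is only an illustration, not a description of what MKL does internally. The program name, the matrix size n and the block size bs are all made up for this example, and bs is not tuned for any particular CPU.

```fortran
program blocked_matmul_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n  = 256   ! matrix size (illustrative assumption)
  integer, parameter :: bs = 32    ! block size (illustrative assumption, untuned)
  real(dp), allocatable :: a(:,:), b(:,:), c_plain(:,:), c_block(:,:)
  integer :: i, j, k, jj, kk

  allocate(a(n,n), b(n,n), c_plain(n,n), c_block(n,n))
  call random_number(a)
  call random_number(b)

  ! Plain triple loop: correct and portable, but every column of c
  ! streams through the whole of a.
  c_plain = 0.0_dp
  do j = 1, n
     do k = 1, n
        do i = 1, n
           c_plain(i,j) = c_plain(i,j) + a(i,k) * b(k,j)
        end do
     end do
  end do

  ! Same arithmetic with j and k tiled, so a block of columns of a
  ! is reused for several columns of c while it is still in cache.
  c_block = 0.0_dp
  do jj = 1, n, bs
     do kk = 1, n, bs
        do j = jj, min(jj + bs - 1, n)
           do k = kk, min(kk + bs - 1, n)
              do i = 1, n
                 c_block(i,j) = c_block(i,j) + a(i,k) * b(k,j)
              end do
           end do
        end do
     end do
  end do

  ! Both versions compute the same product; the difference should be
  ! zero or at rounding level.
  print *, 'max abs difference:', maxval(abs(c_plain - c_block))
end program blocked_matmul_sketch
```

Whether the blocked loop is actually faster depends on the matrix size, the cache sizes and the compiler, so it is worth timing both versions on the target machine rather than assuming a gain.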

For instance, to do a byte swap in assembler, I'd just use the bswap instruction if I wasn't thinking. If I thought about it, I'd use a rotate instruction, which is a lot faster.

In Fortran, I'd do something like

Code:

character*2 x
character*1 temp
! swap the two characters of x through a temporary
temp = x(1:1)
x(1:1) = x(2:2)
x(2:2) = temp

I've got no idea how many clock cycles that is but it is the best I can do in Fortran. It is probably about 10 instructions compared to a single one in assembler.
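
For anyone who wants to try this, here is a small self-contained sketch of the same swap, next to an integer variant that uses the standard ISHFTC intrinsic (a circular shift). The program name and the test values are made up for this example. Whether a given compiler turns ISHFTC into a single rotate instruction depends on the compiler and the flags, so this is not a claim about the generated code.

```fortran
program swap_bytes_sketch
  use iso_fortran_env, only: int16
  implicit none
  character(len=2) :: x
  character(len=1) :: temp
  integer(int16)   :: v, w

  ! Character version: swap the two characters through a temporary.
  x = 'AB'
  temp = x(1:1)
  x(1:1) = x(2:2)
  x(2:2) = temp
  if (x /= 'BA') error stop 'character swap failed'

  ! Integer version: rotating a 16-bit value by 8 bits exchanges its two bytes.
  v = int(z'1234', int16)
  w = ishftc(v, 8)
  if (w /= int(z'3412', int16)) error stop 'rotate swap failed'

  ! Show the swapped string and the swapped value in hexadecimal.
  print '(a,1x,z4.4)', x, w
end program swap_bytes_sketch
```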

DanielCro · Jun 16, 2009

Thanks xwb for your help !!
I will stick to Intel MKL... too bad they are not available for free on Windows...

Best regards!

pattakosn · Jun 22, 2009

Hi.

If I understood you correctly, you are asking why ifort/imkl is much faster than gfortran/F90 lapack code.

Because this is what intel was trying to do when developing imkl!

There are more reasons than I can list. I would like to point out that the accuracy of the two cases that you describe is not the same. You can either try enabling gcc's fast math or applying stricter IEEE arithmetic when compiling with the Intel compiler.

Another tricky issue that comes to my mind is whether you compiled using the multithreaded version of lapack/imkl...
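
One rough way to check the threading question is to measure both wall-clock time and CPU time around the call being compared. The sketch below is only an assumption about how one might set that up: the program name and the size n are made up, and MATMUL is a stand-in workload to be replaced with the Lapack call under test. If the reported CPU time is clearly larger than the wall time, the routine is most likely running on several threads, although what CPU_TIME reports for threaded code depends on the compiler and runtime.

```fortran
program timing_sketch
  use iso_fortran_env, only: int64
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n  = 500   ! problem size (illustrative assumption)
  real(dp), allocatable :: a(:,:), b(:,:), c(:,:)
  integer(int64) :: t0, t1, rate
  real(dp) :: cpu0, cpu1, wall

  allocate(a(n,n), b(n,n), c(n,n))
  call random_number(a)
  call random_number(b)

  call system_clock(t0, rate)   ! wall-clock ticks and ticks per second
  call cpu_time(cpu0)           ! processor time used by the program
  c = matmul(a, b)              ! stand-in workload; replace with the call under test
  call cpu_time(cpu1)
  call system_clock(t1)

  wall = real(t1 - t0, dp) / real(rate, dp)
  print '(a,f8.3,a)', 'wall time: ', wall, ' s'
  print '(a,f8.3,a)', 'CPU time:  ', cpu1 - cpu0, ' s'
  print *, 'checksum:', sum(c)  ! use the result so the work is not optimized away
end program timing_sketch
```

Running the same harness against the source build and the MKL build, with the same problem size, gives a fairer comparison than timing the whole program.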

There are a lot of things to address and take into consideration.
