Increasing double loop efficiency

Status
Not open for further replies.

Vahid9

Technical User
Jul 28, 2016
23
CA
I have the following double loop:

Code:
DO i=1,100
  DO j=1,100
    S(i,j)= S(i,j)+ALPHA*exp((real(i)/b1)**2)*exp((real(j)/b2)**2)
  ENDDO
ENDDO

Here, S is a 100x100 symmetric REAL array, and ALPHA, b1, and b2 are REAL constants.

I am hoping to incorporate the BLAS routine dgemm (distributed with LAPACK) or some other library function to increase the efficiency of this calculation. However, dgemm(TRANSA, TRANSB, M, N, K, ALPHA, A, LDA, B, LDB, BETA, C, LDC) needs arrays A and B as input. I could define A as the first exponential using a DO loop and B as the second exponential using another DO loop, but this approach may be less efficient.

What would be the most efficient way to calculate the above double loop in Fortran?

Thanks,
Vahid
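Each exponential in the loop above depends on only one index, so one option is to tabulate the two factors once and then do a plain rank-1 update. The sketch below shows that idea without any library call. The values of ALPHA, b1 and b2 are placeholders (the thread does not give them), and double precision is assumed so that the result can later be compared against DGEMM.

```fortran
program precompute_exponentials
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 100
  ! Placeholder values: ALPHA, b1 and b2 are not given in the thread.
  real(dp), parameter :: alpha = 0.5_dp, b1 = 40.0_dp, b2 = 60.0_dp
  real(dp) :: s(n,n), s_ref(n,n), e1(n), e2(n)
  integer :: i, j

  s = 0.0_dp
  s_ref = 0.0_dp

  ! Reference: the original double loop, 2*n*n calls to exp.
  do i = 1, n
    do j = 1, n
      s_ref(i,j) = s_ref(i,j) + alpha*exp((real(i,dp)/b1)**2)*exp((real(j,dp)/b2)**2)
    end do
  end do

  ! Each factor depends on one index only, so tabulate it once (2*n calls).
  do i = 1, n
    e1(i) = alpha*exp((real(i,dp)/b1)**2)
    e2(i) = exp((real(i,dp)/b2)**2)
  end do

  ! Rank-1 update. j is the outer loop so the inner loop walks down a
  ! column, which is contiguous in Fortran's column-major storage.
  do j = 1, n
    do i = 1, n
      s(i,j) = s(i,j) + e1(i)*e2(j)
    end do
  end do

  print *, 'max abs difference:', maxval(abs(s - s_ref))
end program precompute_exponentials
```

This cuts the number of exp evaluations from 20,000 to 200 per pass. Whether it beats a library call in practice depends on the compiler and the BLAS in use.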
 
In your case you compute
[pre]S = 1*S + ALPHA*A*B[/pre]
so the call
[pre]DGEMM(TRANSA, TRANSB, M, N, K, ALPHA, A, LDA, B, LDB, BETA, C, LDC )[/pre]
seems to be
[pre]DGEMM('N', 'N', 100, 100, 100, ALPHA, A, 100, B, 100, 1.0D0, S, 100)[/pre]

where
S is your 100x100 matrix

A is a 100x100 matrix of this form, in which only the first column is non-zero and all other columns are zero:
[pre]
| exp(( 1/b1)**2) 0 0 0 0 ... 0 |
| exp(( 2/b1)**2) 0 0 0 0 ... 0 |
| ..................................... |
| exp((100/b1)**2) 0 0 0 0 ... 0 |
[/pre]

B is a 100x100 matrix of this form, in which only the first row is non-zero and all other rows are zero:
[pre]
| exp((1/b2)**2) exp((2/b2)**2) ... exp((100/b2)**2) |
| 0 0 ... 0 |
| ....................................................|
| 0 0 ... 0 |
[/pre]
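As a quick check that these padded matrices reproduce the term in the double loop, the product can be written out element by element. Since A is zero outside its first column (and B outside its first row), only the k = 1 term of the sum survives:

```latex
\[
(AB)_{ij} \;=\; \sum_{k=1}^{100} A_{ik} B_{kj}
          \;=\; A_{i1} B_{1j}
          \;=\; \exp\!\left[(i/b_1)^2\right] \exp\!\left[(j/b_2)^2\right]
\]
```

Multiplying by ALPHA and adding to S(i,j) gives the update from the original loop.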
 
Thanks mikrom for your response. Out of curiosity, wouldn't it be more efficient to define A as a column matrix, B as a row matrix and set K=1? Wouldn't this involve fewer calculations?

Vahid
 
Yes, of course (I didn't even think of that).

A is a 100x1 matrix of this form:
[pre]
| exp(( 1/b1)**2) |
| exp(( 2/b1)**2) |
| .................|
| exp((100/b1)**2) |
[/pre]

B is a 1x100 matrix of this form:
[pre]
| exp((1/b2)**2) exp((2/b2)**2) ... exp((100/b2)**2) |
[/pre]
Then the call seems to be
[pre]DGEMM('N', 'N', 100, 100, 1, ALPHA, A, 100, B, 1, 1.0D0, S, 100)[/pre]
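For completeness, here is a minimal sketch of a full program around the K=1 call. Some assumptions: the values of ALPHA, b1 and b2 are placeholders, a BLAS library is available at link time (for example `gfortran rank_one.f90 -lblas`, where the file name is hypothetical), and all data are double precision. DGEMM operates on DOUBLE PRECISION arguments, including ALPHA and BETA; for default REAL arrays the single-precision counterpart is SGEMM.

```fortran
program rank_one_dgemm
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 100
  ! Placeholder values: ALPHA, b1 and b2 are not given in the thread.
  real(dp), parameter :: alpha = 0.5_dp, b1 = 40.0_dp, b2 = 60.0_dp
  real(dp) :: s(n,n), a(n,1), b(1,n)
  integer :: i
  external :: dgemm   ! provided by the BLAS library at link time

  s = 0.0_dp

  ! A is the 100x1 column, B the 1x100 row described above.
  do i = 1, n
    a(i,1) = exp((real(i,dp)/b1)**2)
    b(1,i) = exp((real(i,dp)/b2)**2)
  end do

  ! S := alpha*A*B + 1*S with M = N = 100, K = 1,
  ! LDA = 100, LDB = 1, LDC = 100.
  call dgemm('N', 'N', n, n, 1, alpha, a, n, b, 1, 1.0_dp, s, n)

  ! Spot check of one element against the original loop formula.
  print *, 's(1,1)   =', s(1,1)
  print *, 'expected =', alpha*exp((1.0_dp/b1)**2)*exp((1.0_dp/b2)**2)
end program rank_one_dgemm
```

Note that BETA is passed as a double precision constant rather than the integer 1, since the routine is called through an implicit interface and no type conversion takes place.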
 
I was thinking about how to use DGEMM for your case.
But now that I think about it again, I doubt that DGEMM is faster for your simple case than your two simple loops.
Look at the source of DGEMM to see how many loops it has.
Maybe DGEMM is only more efficient for more complicated cases ...
 
I will try both cases using 1) the two do loops, and 2) DGEMM, to see which is faster. It may well be that they are similar in speed.

Thanks,
Vahid
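A rough way to run such a comparison in isolation is sketched below. It is only a micro-benchmark under stated assumptions: placeholder values for ALPHA, b1 and b2, an arbitrary repetition count `nrep`, double precision data, and a BLAS library linked in (for example with `-lblas`). Timings from a small test like this may not carry over to a large parallel code.

```fortran
program time_loop_vs_dgemm
  use iso_fortran_env, only: int64
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 100, nrep = 1000   ! nrep is an arbitrary choice
  ! Placeholder values: ALPHA, b1 and b2 are not given in the thread.
  real(dp), parameter :: alpha = 0.5_dp, b1 = 40.0_dp, b2 = 60.0_dp
  real(dp) :: s_loop(n,n), s_blas(n,n), a(n,1), b(1,n)
  integer(int64) :: t0, t1, t2, rate
  integer :: i, j, rep
  external :: dgemm   ! provided by the BLAS library at link time

  s_loop = 0.0_dp
  s_blas = 0.0_dp
  call system_clock(t0, rate)

  ! Variant 1: the original double loop.
  do rep = 1, nrep
    do i = 1, n
      do j = 1, n
        s_loop(i,j) = s_loop(i,j) + alpha*exp((real(i,dp)/b1)**2)*exp((real(j,dp)/b2)**2)
      end do
    end do
  end do
  call system_clock(t1)

  ! Variant 2: build A and B, then one DGEMM call with K = 1.
  do rep = 1, nrep
    do i = 1, n
      a(i,1) = exp((real(i,dp)/b1)**2)
      b(1,i) = exp((real(i,dp)/b2)**2)
    end do
    call dgemm('N', 'N', n, n, 1, alpha, a, n, b, 1, 1.0_dp, s_blas, n)
  end do
  call system_clock(t2)

  print *, 'double loop [s]:', real(t1 - t0, dp)/real(rate, dp)
  print *, 'DGEMM       [s]:', real(t2 - t1, dp)/real(rate, dp)
  ! Both variants should agree up to rounding.
  print *, 'max rel. diff. :', maxval(abs(s_loop - s_blas)/abs(s_loop))
end program time_loop_vs_dgemm
```

The cost of filling A and B is deliberately included in the DGEMM timing, so both variants start from the same inputs. An optimizing compiler may hoist or vectorize parts of the plain loop, so results can differ noticeably between optimization levels.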
 
I replaced the double loop with DGEMM in my code. The DGEMM is run inside three other loops and the whole code is run in parallel across 4 nodes of 64 cores each.

When using the double DO loop, the total run time is 7h19m. With DGEMM replacing the double loop, the run time is 3h34m, a significant speedup.

Thanks mikrom for all your help.

Cheers,
Vahid
 