Compressed Row Storage (CRS) Matrix Addition 1

melmacianalf · Jun 4, 2012

Hello,

I am trying to figure out an algorithm to add two sparse matrices stored according to CRS (compressed row storage) format.

For an introduction to CRS, please refer to

http://web.eecs.utk.edu/~dongarra/etemplates/node373.html

The 3 arrays needed to store the non-zero elements (according to the link above) are val(1:nnz), col_ind(1:nnz), and row_ptr(1:N+1). These can be stored as follows

Code:

!=== Matrix A ===
! A = [
! 1  0  0  2
! 0  3  1  3
! 0  1  5  0
! 2  3  0  7
!]                                                                                                                                                                     
N = size(A,1)

nnz = 0
ipre = 0

do i = 1, N
  do j = 1, N

    if ( abs(A(i,j)) .gt. 5e-16 ) then

      nnz = nnz + 1
      col_ind_A(nnz) = j
      val_A(nnz) = A(i,j)

      if ( nnz .eq. 1) then
        row_ptr_A(i) = nnz
        ipre = i
      else if ( i .ne. ipre) then
        row_ptr_A(i) = nnz
        ipre = i
      end if

    end if

  end do
end do

row_ptr_A(N+1) = nnz + 1

Note that the above code is only to give an idea of the storage procedure. In the actual implementation, one dumps val and col_ind together in a file and row_ptr in a seperate file and reads them again.

I attach a f90 file where two matrices A and B are initialized for a predefined number of non-zero elements. I would appreciate if any one gives an algorithm to get a matrix C = A+B where the addition is done sparsely. Thank you.

melmacianalf · Jun 4, 2012

Follow up:

The algorithm should get only val_C, col_ind_C, and row_ptr_C from the three data arrays for matrices A and B. In other word, storing C(N,N) should be avoided.

salgerman · Jun 4, 2012

Sorry, but since I've been working more in Python than in Fortran, lately (besides, I don't have to compile, etc), I did in Python:

Code:

import numpy as np

valA = np.array([1,2,3,4,5,6,7,8,9])
colA = np.array([0,1,3,0,3,1,2,0,3])
rowA = np.array([0,    3,  5,  7  ])

valB = np.array([9,8,7,6,5,4,3,2,1])
colB = np.array([1,2,0,2,0,3,0,2,3])
rowB = np.array([0,  2,  4,  6    ])

nrows = 4
valC = np.zeros(nrows*nrows)
colC = np.zeros(nrows*nrows)
rowC = np.zeros(nrows)
a = 0 
b = 0 
c = 0 

for n in range(nrows):
   rowC[n] = c

   if n < nrows-1:
      na = rowA[n+1]
      nb = rowB[n+1]
   else:
      na = len(valA)
      nb = len(valB)
   
   while a < na or b < nb:
      if a < na and b < nb:
         if colA[a] == colB[b]:
            valC[c] = valA[a] + valB[b]
            colC[c] = colA[a]            
            a = a + 1
            b = b + 1
            c = c + 1
         elif colA[a] < colB[b]:
            valC[c] = valA[a]
            colC[c] = colA[a]
            a = a + 1
            c = c + 1
         else:  #colA[a] > colB[b]
            valC[c] = valB[b]
            colC[c] = colB[b]
            b = b + 1
            c = c + 1

      elif a < na:
            valC[c] = valA[a]
            colC[c] = colA[a]
            a = a + 1
            c = c + 1

      elif b < nb:
            valC[c] = valB[b]
            colC[c] = colB[b]
            b = b + 1
            c = c + 1
      else:
         pass
            
print valC
print colC
print rowC

salgerman · Jun 4, 2012

Oh, something else...just to confuse you a bit...I am indexing starting at zero, not one.

melmacianalf · Jun 4, 2012

@salgerman,

Thanks for your effors. But two quick clarifications.

The statements

Code:

valC = np.zeros(nrows*nrows) 
colC = np.zeros(nrows*nrows)

already imply quadratic storage. If you look into my example matrices

Code:

I would like to get the data

Code:

val_C =
    [2    3    4    6    1    3    3    1   10    4    3   14]

col_ind_C =
   [1   3   4   2   3   4   1   2   3   1   2   4]

row_ptr_C =
    [1    4    7   10   13]

directly without having to store the full matrix C. I am planning to use it for large applications where my matrix has 10^5 to 10^6 rows and columns.

Another clarification is that by convention the value of nnz+1 is stored at row_ptr(N+1), but you have not stored them. Perhaps, I should also point out to the page on CRS matrix-vector product [URL unfurl="true"]http://web.eecs.utk.edu/~dongarra/etemplates/node382.html[/url]

I think the loop structure

Code:

   for i = 1, n 
       ... 
       for j = row_ptr(i), row_ptr(i+1) - 1
           ...
       end;
   end;

may be used also for CRS addition of two matrices stored by CRS.

melmacianalf · Jun 5, 2012

@salgerman,

Thank you for the code. I think I can make a Fortran version from your version.

salgerman · Jun 5, 2012

melmacianalf:

The python script does produce the output as you want it, except for rowC(nnz+1), of course.

The statements

Code:

valC = np.zeros(nrows*nrows) 
colC = np.zeros(nrows*nrows)

Do not allocate square (doubly indexed) matrices, they simply allocate long (single-indexed) arrays with as many items.

I did not bother to figure out in advance how long my new matrix needs to be, I simply made it the longest it will ever be. You could take a first pass and inspect column numbers to figure out how many per row you are going to need and allocate C matrix accordingly...you go do it [tongue]

So...give me a break!...this is my first 15-minute shot at it without ever having heard about CRS before [bigsmile]

Sure, there is a lot of things that need to be done, but I am just helping here.

At this time, I only focused on a brute force procedure of traversing both arrays step by step.

Having nnz+1 stored at row_ptr(N+1) is a good thing I did not notice...that would eliminate the first "if-then-else" inside my "for" loop and all I need to do is use rowA(n+1) instead of 'na' and rowB(n+1) instead of 'nb'...where probably, in my python script, I would need nnz stored there, instead of nnz+1...due my indexing from zero.

oops...gotta run, now...

melmacianalf · Jun 5, 2012

salgerman:

I already tried the code with large matrices, this time I allocated tmp_valC and tmp_colC with nnz_A + nnz_B, instead of nrows^2. Then I copied nnz_C values to valC and colC and deallocated tmp_valC and tmp_colC.

salgerman · Jun 5, 2012

Even nnz_A+nnz_B may be too many...you probably already know, but you need to take into account the fact that many of the nnz_A items are in the same position as some of the nnz_B ones...so, you really need to collect column numbers per row for both arrays and get rid of repeats.

So, allocated, etc...any more news, have you translated the code to Fortran and have you tried it? did it work?

melmacianalf · Jun 5, 2012

I have been fine tuning the Fortran version for some time. Instead of allocation, I also included an option to dump the three arrays in a two files (val and col in one and row in another) and read them. I also made the routine general to compute C = x * Y + y * B. Soon I will use the code to add two large matrices.

melmacianalf · Jun 5, 2012

actually...C = x*A + y*B...

salgerman · Jun 5, 2012

Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

Compressed Row Storage (CRS) Matrix Addition 1

melmacianalf

Technical User

melmacianalf

Technical User

salgerman

Programmer

salgerman

Programmer

melmacianalf

Technical User

melmacianalf

Technical User

salgerman

Programmer

melmacianalf

Technical User

salgerman

Programmer

melmacianalf

Technical User

melmacianalf

Technical User

salgerman

Programmer

Similar threads

Part and Inventory Search

Sponsor