
MPI_TYPE_CREATE_STRUCT to pass a derived type with allocatable array


albi73 (Programmer), Sep 16, 2010
I apologize for the length of the title :)

In the code that I'm modifying I have created a module (class_crom) with a derived type (crom) that contains an allocatable vector of reals (gene), plus a real (idon) and an integer (pos):
Code:
module class_crom
   use random_N
   type crom
      sequence
      real,allocatable,dimension(:) :: gene
      real :: idon
      integer :: pos
   end type crom 
   real, allocatable,dimension(:) :: gene0,D_gene0
   contains
   subroutine popolaz_new (this,i,N_crom,N_geni,mype,chiamate)
      implicit none
      type(crom),intent(out) :: this
      real, allocatable,dimension(:),save :: rnd
      integer, intent(in) :: i,mype,N_geni,N_crom
      integer, intent(inout) :: chiamate
      integer :: k
      
      allocate (this%gene(1:N_geni))
      allocate (rnd(1:N_geni),gene0(1:N_geni),D_gene0(1:N_geni))
      gene0=(/0.,115.,-70.,-65.,-60.,-55.,-50.,-45.,-40.,-35.,-30.,&
      &18.5,18.5,18.5,20.5,20.5,20.5,20.5,20.5,20.5,0.270,0.03,0.03,0.03,0.03/)
      D_gene0=(/7.,10.,30.,45.,45.,45.,45.,45.,45.,45.,45.,12.,12.,12.,10.,10.&
      ,10.,10.,10.,10.,0.035,0.03,0.03,0.03,0.03/) 
      call RANU3(ix+mype,rnd,N_geni,chiamate,ivw,icon) ! creates a vector of random numbers
      if (chiamate==0) chiamate=1
      rnd=(rnd*2.0)-1.0
      do k=1,N_geni
         this%gene(k)=gene0(k)+(rnd(k)*D_gene0(k))
      end do
      this%pos=i
      print *,'i=',i,'mype=',mype,'rnd_test=',rnd
      deallocate (rnd,gene0,D_gene0)
   end subroutine
end module
As you can see I have created a "derived data type", and consequently I would really like to communicate between processors using MPI_TYPE_CREATE_STRUCT... but the surprising point is that, even though I put the SEQUENCE statement into type crom, the addresses of the elements of the crom type aren't sequential! Look at these results:

For example, running the code I find these addresses:

cromF(1)%gene(1)  233935744
cromF(1)%gene(2)  233935748  (which is OK)
cromF(1)%idon     233933496  (which lies well BEFORE gene(1)!)
cromF(1)%pos      233933500  (which could be OK)
Anyway, by using a trick to build the offsets (displacements) directly from the address gaps I thought I could overcome the problem, but in the end the code doesn't work. The only way that makes the code work is to forget MPI_TYPE_CREATE_STRUCT and pass the information starting from the first value of every item declared in the derived type. My question is: why am I not able to create a correct data mapping using MPI_TYPE_CREATE_STRUCT?
Here is the part of the code under suspicion:
Code:
use class_crom
.
.
  type (crom), allocatable, dimension(:) :: cromF,cromF2
.
.
   allocate(cromF(1:N_popo),cromF2(1:N_popo))
   do i=1,N_popo
      allocate(cromF(i)%gene(1:N_geni));cromF(i)%gene(:)=0.0
      allocate(cromF2(i)%gene(1:N_geni));cromF2(i)%gene(:)=0.0
   end do   
   do i=croXpe(mype,1),croXpe(mype,NcroXpe)
      call popolaz_new(cromF(i),i,N_popo,N_geni,mype,chiamate)
      pop(i,:)=cromF(i)%gene(:)
   end do

  
   !pop_T=transpose(pop)
   call mpi_barrier(mpi_comm_world,ierr)
   !call mpi_allgather(pop_T(1,croXpe(mype,1)) ,NcroXpe*N_geni,mpi_real,ric_T,NcroXpe*N_geni,mpi_real,mpi_comm_world,ierr)
   !call mpi_allgather(cromF(croXpe(mype,1)),2,cromtype,cromF2(croXpe(mype,1)),2,cromtype,mpi_comm_world,ierr)
    
   !if (mype==0) then
      call mpi_get_address(cromF(1)%gene(1), disp(0),ierr)
      call mpi_get_address(cromF(1)%idon,    disp(1),ierr)
      call mpi_get_address(cromF(1)%pos,     disp(2),ierr)
      print *,'add_g1 ','add_idon ','add_pos = ',disp(:)
      print *,'distanza adress= ',disp(1)-disp(0),disp(2)-disp(1)
   !end if
   base=disp(0)
   offsets(0)=disp(0)-base; offsets(1)=disp(1)-base; offsets(2)=disp(2)-base
   print *,'offsets= ',offsets(:)
   blockcounts(0)=N_geni; blockcounts(1)=1; blockcounts(2)=1
   oldtypes(0)=MPI_REAL; oldtypes(1)=MPI_REAL; oldtypes(2)=MPI_INTEGER
   !---mpi variables---
   call MPI_TYPE_CREATE_STRUCT(3,blockcounts,offsets,oldtypes,cromtype,ierr)
   print*,'struct_ierr= ',ierr
   call MPI_TYPE_COMMIT(cromtype,ierr)
   !-------------------



   if (mype==0) then
      cromF(1)%idon=0.5600
      !call mpi_send(cromF(1)%gene(1),1,cromtype,1,1,mpi_comm_world,ierr)
      call mpi_send(cromF(1)%gene(1),N_geni,MPI_REAL,1,1,mpi_comm_world,ierr)
      call mpi_send(cromF(1)%idon,1,MPI_REAL,1,1,mpi_comm_world,ierr)
      call mpi_send(cromF(1)%pos,1,MPI_INTEGER,1,1,mpi_comm_world,ierr)
   end if   
   call mpi_barrier(mpi_comm_world,ierr)

   if (mype==1) then
      !call mpi_recv(cromF2(1)%gene(1),1,cromtype,0,1,mpi_comm_world,stat,ierr)
      call mpi_recv(cromF2(1)%gene(1),N_geni,MPI_REAL,0,1,mpi_comm_world,stat,ierr)
      call mpi_recv(cromF2(1)%idon,1,MPI_REAL,0,1,mpi_comm_world,stat,ierr)
      call mpi_recv(cromF2(1)%pos,1,MPI_INTEGER,0,1,mpi_comm_world,stat,ierr)
   end if


 
You cannot expect that the values stored in an allocatable element (or pointer element) of a derived type variable are contiguous with the other elements of that derived type, even if you have used the keyword SEQUENCE.

Indeed, the derived type variable is allocated BEFORE its allocatable/pointer elements. So the compiler cannot reserve places for the contents of these elements because it does not know their final dimensions.

I have seen languages that are really able to extend a variable to hold allocatable elements inside it, but that is not the case in Fortran (nor in C or C++). Such a feature means that a variable can move in memory if there is not enough room around it; as a main consequence, a pointer to such a variable no longer contains its address but the address of its address... Classical languages manage only fixed addresses.

A complete derived type variable is therefore composed of several memory areas :
- the memory area for non allocatable/pointer elements
- a memory area corresponding to each allocated element (declared with the attribute ALLOCATABLE or POINTER)

What is sequentially embedded in the main memory area is the descriptor of each allocatable element. That descriptor is very similar to a Fortran pointer; in most implementations it is a Fortran pointer, i.e. a small structure containing the final address of the values, the range of each dimension... The only possible simplification compared to a pointer is a simplified stride (the elements of an allocatable array are contiguous).
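To see this concretely, here is a minimal standalone sketch (not your code, only an illustration; LOC is a widespread compiler extension, not standard Fortran) which prints the address of the derived type variable and the address of the allocated payload:
Code:
program show_addresses
   implicit none
   type crom
      real, allocatable, dimension(:) :: gene
      real    :: idon
      integer :: pos
   end type crom
   type(crom) :: this

   allocate(this%gene(1:100))
   ! LOC returns the address of its argument (compiler extension)
   print *, 'address of this         =', loc(this)          ! start of the descriptor area
   print *, 'address of this%idon    =', loc(this%idon)     ! a few bytes further on
   print *, 'address of this%gene(1) =', loc(this%gene(1))  ! usually far away: the allocated area
end program show_addresses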


 
Dear Fjacq,
I'm grateful to have met such a knowledgeable person! So I'll go ahead with my questions :)
What you say makes sense and more, but let me try to understand as well as I can: what exactly do you mean by "descriptor" in the last sentence? Is it the name of the derived type, which clearly points to the address? ...

Anyway, I'm still not able to understand why the following code portion:
Code:
   call mpi_get_address(cromF(1)%gene(1), disp(0),ierr)
   call mpi_get_address(cromF(1)%idon,    disp(1),ierr)
   call mpi_get_address(cromF(1)%pos,     disp(2),ierr)
   base=disp(0)
   offsets(0)=disp(0)-base; offsets(1)=disp(1)-base; offsets(2)=disp(2)-base
   blockcounts(0)=N_geni  ; blockcounts(1)=1       ; blockcounts(2)=1
   oldtypes(0)=MPI_REAL   ; oldtypes(1)=MPI_REAL   ; oldtypes(2)=MPI_INTEGER
   !---mpi derived type variable---
   call MPI_TYPE_CREATE_STRUCT(3,blockcounts,offsets,oldtypes,cromtype,ierr)
   call MPI_TYPE_COMMIT(cromtype,ierr)
if (mype==0) then
      call mpi_send(cromF(1)%gene(1),1,cromtype,1,1,mpi_comm_world,ierr)
end if
if (mype==1) then
      call mpi_recv(cromF2(1)%gene(1),1,cromtype,0,1,mpi_comm_world,stat,ierr)
end if

is not equivalent to this:
Code:
   if (mype==0) then
      call mpi_send(cromF(1)%gene(1),N_geni,MPI_REAL,1,1,mpi_comm_world,ierr)
      call mpi_send(cromF(1)%idon,1,MPI_REAL,1,1,mpi_comm_world,ierr)
      call mpi_send(cromF(1)%pos,1,MPI_INTEGER,1,1,mpi_comm_world,ierr)
   end if   
   call mpi_barrier(mpi_comm_world,ierr)

   if (mype==1) then
      call mpi_recv(cromF2(1)%gene(1),N_geni,MPI_REAL,0,1,mpi_comm_world,stat,ierr)
      call mpi_recv(cromF2(1)%idon,1,MPI_REAL,0,1,mpi_comm_world,stat,ierr)
      call mpi_recv(cromF2(1)%pos,1,MPI_INTEGER,0,1,mpi_comm_world,stat,ierr)
   end if

The last version is the working one.
The first version correctly passes only the vector cromF(1)%gene(:); the elements %idon and %pos don't make it to cromF2.

In some way the two codings seem to be equal: in the first (the one that doesn't work) the communication starts from the first element of the derived type (cromF(1)%gene(1)) and, following the construction of MPI_TYPE_CREATE_STRUCT (..., newtype=cromtype, ...), all the other blocks should start at the correct addresses.
In the working code (the second one) the mpi_send again starts from the first element of each block of the derived type.

Should I use MPI_LB or MPI_UB? The documentation says:
"Because there may be cases in which this default behavior is not correct,
MPI provides a means to set explicit upper and lower bounds which may not
be directly related to the lowest and highest displacement of the datatype. When
the pseudo type MPI_UB is used, the upper bound will be the value specified
as the displacement of the MPI_UB block....."

but I can't find examples and I haven't really understood it well.
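From what I gather, the usage would be something like the sketch below (just my guess, untested; it seems the newer MPI_TYPE_CREATE_RESIZED call plays the same role as the MPI_LB/MPI_UB pseudo-types, and the address/extent variables must have kind MPI_ADDRESS_KIND):
Code:
   integer :: cromtype, cromtype_res, ierr
   integer(kind=MPI_ADDRESS_KIND) :: a1, a2, lb, extent

   call MPI_TYPE_CREATE_STRUCT(3, blockcounts, offsets, oldtypes, cromtype, ierr)
   ! measure the real distance between two consecutive elements of cromF ...
   call MPI_GET_ADDRESS(cromF(1), a1, ierr)
   call MPI_GET_ADDRESS(cromF(2), a2, ierr)
   lb     = 0
   extent = a2 - a1
   ! ... and force the committed type to have exactly that extent
   call MPI_TYPE_CREATE_RESIZED(cromtype, lb, extent, cromtype_res, ierr)
   call MPI_TYPE_COMMIT(cromtype_res, ierr)
   ! cromtype_res (not cromtype) would then be used in the send/recv calls
If I understand your previous reply, though, this only adjusts the extent of the committed type; it cannot make the allocated gene area contiguous with the rest of the variable.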
Thanks for your help, Albi
 
Let us take again your example :

Code:
type crom
     sequence
     real,allocatable,dimension(:) :: gene
     real :: idon
     integer :: pos
end type crom

integer    :: n=100
type(crom) :: this

allocate(this%gene(1:n))

When you declare the variable THIS of type CROM, the program automatically allocates a contiguous memory zone containing:

- a descriptor for the element gene which will be allocated later on. This descriptor is masked (you cannot access it directly, except perhaps through a debugging tool), but I am pretty sure that this descriptor is itself composed of several fields like :
-- the place to memorize the final address of the vector GENE when this one will be allocated (32bits or 64bits integer depending on your architecture)
-- the lower bound
-- the upper bound
- a 32 bits field to store IDON
- a 32 bits field to store POS

It means that the variable THIS occupies about 160 bits (<=> five 32-bit integers <=> 20 bytes) on a 32-bit architecture, and still more on a 64-bit architecture, where the upper and lower bounds are probably 64-bit integers (about 256 bits).

Declaring GENE as POINTER instead of ALLOCATABLE would lead to a variable that is still a little larger, because a constant stride between elements of the final vector GENE is allowed in that case (this stride has to be memorized too).

As you see, the variable THIS is allocated at a fixed size, which depends on the architecture and the compiler, but that size does not take into account the vector GENE itself.

When you allocate the vector GENE belonging to that variable, then you reserve a new memory area which must contain the elements of that vector. In the example, the size of that memory zone is exactly 3200 bits (<=> 400 bytes <=> 100 32-bit real values).

But at the same time, the descriptor of the array GENE in the variable THIS is set up:
- the address field contains now the physical address of the array GENE,
- the lower bound contains the value 1
- the upper bound contains the value 100
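Purely as an illustration (this is an assumption about a typical implementation; you cannot access or rely on this layout from Fortran), the hidden descriptor could be imagined as something like:
Code:
! imaginary layout of the hidden descriptor stored inside THIS
type gene_descriptor
   integer(kind=8) :: data_address   ! filled with the address of the allocated area (8000 in the example below)
   integer         :: lower_bound    ! set to 1 by ALLOCATE(this%gene(1:n))
   integer         :: upper_bound    ! set to 100
end type gene_descriptor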

Now let us examine your wrong programming:

Code:
   call mpi_get_address(cromF(1)%gene(1), disp(0),ierr)
   call mpi_get_address(cromF(1)%idon,    disp(1),ierr)
   call mpi_get_address(cromF(1)%pos,     disp(2),ierr)
   base=disp(0)
   offsets(0)=disp(0)-base; offsets(1)=disp(1)-base; offsets(2)=disp(2)-base

Let us consider a 32bit architecture. Let us suppose that the variable THIS starts at the address 1000 (in bytes) and the vector GENE starts at the address 8000 (in bytes). Remember again that the memory areas for THIS and GENE are totally disconnected.

- the first call to mpi_get_address returns the address of the first element of GENE, i.e. the value 8000

- the second call returns the address of the field IDON of the variable THIS, i.e. the address 1160 (just after the descriptor of GENE)

- the third call returns 1164

The next instructions are clearly wrong. You assume that the address of the vector GENE is the address of the variable THIS. No ! The address of the GENE vector is 8000 and the address of THIS is 1000, which is also the address of the descriptor of GENE ! There are two bases, not one, because the complete THIS variable is composed of two different memory areas.

What you could do is really treat these as two memory areas, possibly excluding the descriptor, which does not have to be sent.

Code:
   call mpi_get_address(cromF(1)%idon, start1,ierr)
   call mpi_get_address(cromF(1)%gene , start2,ierr)

From start1 you need to send 8 bytes (one real + 1 integer)

From start2 you have to send n reals (n being the size of cromF(1)%gene).
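If you still want a single MPI_TYPE_CREATE_STRUCT covering both areas, the MPI documentation seems to allow absolute addresses combined with MPI_BOTTOM. A rough, untested sketch (I am not an MPI specialist, and the names below simply reuse yours):
Code:
   integer :: cromtype, ierr
   integer :: blockcounts(3), oldtypes(3)
   integer(kind=MPI_ADDRESS_KIND) :: disps(3)

   ! absolute addresses, taken separately in each memory area
   call MPI_GET_ADDRESS(cromF(1)%gene(1), disps(1), ierr)
   call MPI_GET_ADDRESS(cromF(1)%idon,    disps(2), ierr)
   call MPI_GET_ADDRESS(cromF(1)%pos,     disps(3), ierr)

   blockcounts = (/ N_geni,   1,        1           /)
   oldtypes    = (/ MPI_REAL, MPI_REAL, MPI_INTEGER /)

   call MPI_TYPE_CREATE_STRUCT(3, blockcounts, disps, oldtypes, cromtype, ierr)
   call MPI_TYPE_COMMIT(cromtype, ierr)

   ! with absolute displacements, the buffer argument must be MPI_BOTTOM
   call MPI_SEND(MPI_BOTTOM, 1, cromtype, 1, 1, mpi_comm_world, ierr)
Note that a type built this way is tied to the particular variable cromF(1): every process (and every variable) would need its own committed type.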




 
ok i have a lot to learn!

anyway, for offsets(0:2) the MPI call wants the displacement from the base address, not an absolute starting position, so if I take this%gene(1) as the base I obtain:
offsets(0)=0
offsets(1)=-244
offsets(2)=-246

probably one needs to consider two different starting positions and define two MPI derived types, one for this%gene and another for this%idon and this%pos... I will test whether it works as soon as possible.
At the moment it also happens that this%gene(2) changes its value after the call to mpi_allgather, and this is incredible because I think of it as intent(in)...
Anyway, time is running away and the new version of my structured genetic code seems ever further off! sigh
 
You cannot start from this%gene(1) to get the position of this%IDON and this%POS :

THEY ARE NOT IN THE SAME MEMORY ZONE !

This is why your second solution, with three calls to mpi_send, is perfect! You could suppress one of these calls by grouping this%IDON and this%POS together, since they are contiguous, but that is not an improvement because the programming would become less clear.

In conclusion, keep the second version, which works and is understandable.
 
I carefully read your example again. I now think that my last post was inadequate.

I also tried to understand how MPI_TYPE_CREATE_STRUCT works. I am not familiar with MPI, even if I know the principle: I used PVM in the past, but I am more involved in OpenMP programming today, and for the future I will try to learn coarrays.

Finally, I agree with you: you should be able to use MPI_TYPE_CREATE_STRUCT to send all your data at once.

I hope that you have declared your vectors "disp", "offsets", etc. correctly: I don't see the point of starting the indexes from zero, but apart from that the computation of the offsets seems OK (they may be positive or negative).

The goal of MPI_TYPE_CREATE_STRUCT is to create the handle (integer value) to a new MPI type object. This object type must be validated with MPI_TYPE_COMMIT. You have done that.

But the instruction I really don't understand is the call to MPI_RECV :

Code:
if (mype==1) then
      call mpi_recv(cromF2(1)%gene(1),1,cromtype,0,1,mpi_comm_world,stat,ierr)
end if

Indeed, you have created a new MPI type object (cromtype) based on the offsets in cromF(1), and you want to receive the data in cromF2(1).

Unfortunately, you have no chance of success, because the offsets in cromF2 cannot be expected to match those of cromF: these offsets are more or less arbitrary, since they result from ALLOCATE statements.

I think that you need to create a new MPI type cromtype2 based on the offsets in cromF2(1) and to receive the data using that new type.

But I cannot guarantee success; I don't know MPI well enough. What I am sure of is that receiving data in cromF2(1) using the type object based on cromF(1) cannot work.
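Something like the following might be what is needed on the receiving side (again only a sketch, reusing the absolute-address idea from my earlier post, not tested):
Code:
   integer :: cromtype2, ierr
   integer :: stat(MPI_STATUS_SIZE)
   integer(kind=MPI_ADDRESS_KIND) :: disps2(3)

   if (mype==1) then
      ! absolute addresses of the RECEIVER's own variable cromF2(1)
      call MPI_GET_ADDRESS(cromF2(1)%gene(1), disps2(1), ierr)
      call MPI_GET_ADDRESS(cromF2(1)%idon,    disps2(2), ierr)
      call MPI_GET_ADDRESS(cromF2(1)%pos,     disps2(3), ierr)
      call MPI_TYPE_CREATE_STRUCT(3, blockcounts, disps2, oldtypes, cromtype2, ierr)
      call MPI_TYPE_COMMIT(cromtype2, ierr)
      call MPI_RECV(MPI_BOTTOM, 1, cromtype2, 0, 1, mpi_comm_world, stat, ierr)
   end if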

 
Dear Fjacq,
I have read your reply; what you say makes sense, as always, although I have checked that the addresses of the different elements in cromF and in cromF2 show the same gaps from the base to the elements. Clearly that could just be a lucky case, and I'm sure your suggestion remains the best way to develop this kind of code. Anyway, I think that is not the point, because I still have problems. The last case I tested shows the following problem: in the succession of data I sent, I lose a part that sits between two correct streams of data. Summarising:


CromF%gene(1:5) = (/5.5, 5.5, 5.5, 5.5, 4.4/)
CromF%idon = 5.55555
CromF%pos = 2

call MPI_TYPE_CREATE_STRUCT(3 ....
mpi_allgather( ....

CromF2%gene(1:5) = (/5.5, 5.5, 5.5, 5.5, 4.4/)
CromF2%idon = 1.999999999999E-19   !====> the crazy value
CromF2%pos = 2

And this happens even when the type is defined in the following way:
Code:
module class_crom
   use random_N
   type crom
      sequence
      real,dimension(1:5) :: gene    !=======> no more allocatable
      real :: idon
      integer :: pos
   end type crom

That ensures that all the components of the structure have sequential addresses.
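With this fixed-size version all three components live in one memory block, so (if I have understood correctly) relative offsets measured from the start of the variable should be valid and the same committed type can be reused on every process. Roughly (untested; base and offsets assumed declared with kind MPI_ADDRESS_KIND):
Code:
   call MPI_GET_ADDRESS(cromF(1),         base,       ierr)
   call MPI_GET_ADDRESS(cromF(1)%gene(1), offsets(0), ierr)
   call MPI_GET_ADDRESS(cromF(1)%idon,    offsets(1), ierr)
   call MPI_GET_ADDRESS(cromF(1)%pos,     offsets(2), ierr)
   offsets = offsets - base                          ! true relative displacements
   blockcounts(0)=5;     blockcounts(1)=1;     blockcounts(2)=1
   oldtypes(0)=MPI_REAL; oldtypes(1)=MPI_REAL; oldtypes(2)=MPI_INTEGER
   call MPI_TYPE_CREATE_STRUCT(3, blockcounts, offsets, oldtypes, cromtype, ierr)
   call MPI_TYPE_COMMIT(cromtype, ierr)
   ! pass the whole variable, not its first component
   if (mype==0) call MPI_SEND(cromF(1),  1, cromtype, 1, 1, mpi_comm_world, ierr)
   if (mype==1) call MPI_RECV(cromF2(1), 1, cromtype, 0, 1, mpi_comm_world, stat, ierr)
For mpi_allgather with more than one element per process I suppose the extent of cromtype would still have to be checked (the MPI_TYPE_CREATE_RESIZED idea above), but with only 4-byte components there should be no padding.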

I'm sorry that I'm not able to be clearer, but these last days I'm directing all my efforts towards an application I submitted to try to win a permanent job position... a really hard life here in Italy... so I hope to be clearer as soon as possible.
Thanks for all the suggestions.
 
Dear Fjacq,
in the end I have to close this story, not least because you were very kind and gave me a lot of help.
After I had solved everything that was solvable with your help, the remaining problem turned out to come from the MPI library itself. So whoever uses LAM/MPI, watch out: it has some bugs with MPI struct types and, more generally, with user-defined datatypes.

Now I'm using Open MPI - it should be noted that LAM/MPI is no longer supported - and the code works.

Unfortunately I was compelled to switch from Lahey to gfortran, because I wasn't able to compile Open MPI with Lahey. gfortran doesn't support some newer features such as allocatable components inside user-defined types, and this made me a bit sad, because I don't want to introduce pointers. What matters is that the code is working.

thanks, Albi

 