Thomas,

The struct idea makes perfect sense. Since you apparently have multiple local_tlr_lookup elements, the current approach will certainly not work. As you mentioned, the allocatable arrays do not have similar relative displacements, and this prevents the derived datatype from being constructed correctly. Moreover, in order to build a consistent datatype with a repetition count, one must assume that your "nat" is a constant. You then have two possible approaches:
1. Get rid of the allocatable arrays in the tlr_lut structure (this is an easy solution, but not very flexible).

2. (Spoiler alert: I have no idea how to implement this in Fortran, but it is a regular trick for C programmers.) Reshape your structure so that all the fixed-size elements come first and the only variable-size one comes last (in your case this requires moving dTLRdr to the end of the struct). Then allocate a memory area large enough to hold all your structures, set the pointers manually, and resize the datatype to the correct per-element length. Let me give you an example (in C); a fuller, self-contained version of the same sketch appears at the very bottom of this post.

    typedef struct my_t {
        int    a;      /* fixed-size part first   */
        double d[1];   /* variable-size part last */
    } my_t;

    /* Suppose the maximum size of the d component is NMAX (3*3*nat*3), and that
     * I dare allocate a little extra memory (in case the d's are not all of the
     * same size). */
    size_t elem_size      = sizeof(my_t) + NMAX * sizeof(double);
    size_t my_struct_size = elem_size * nb_elements;
    my_t  *elements       = (my_t*)malloc(my_struct_size);

    /* It is now unsafe to access elements[i] directly, but I can build a
     * datatype that matches the real layout. */
    displ[0] = (char*)&elements[0].a - (char*)&elements[0];
    displ[1] = (char*)&elements[0].d - (char*)&elements[0];
    types[0] = MPI_INT;
    types[1] = MPI_DOUBLE;
    bl[0]    = 1;
    bl[1]    = NMAX;

    /* create the base struct bstr */
    MPI_Type_create_struct(2, bl, displ, types, &bstr);
    /* resize it to the size of one element, so that sending with a count > 1
     * strides correctly from one element to the next */
    MPI_Type_create_resized(bstr, 0, elem_size, &my_mpi_dt);
    /* you can now send using my_mpi_dt and a count */

Good luck! ;)
  George.

> On Mar 15, 2015, at 19:50, Thomas Markovich <thomasmarkov...@gmail.com> wrote:
>
> Hi George,
>
> Thanks for taking the time to look at my question! wtlr was a typo when I was
> stripping things down for a smaller example... TLR should be a 3x3 matrix
> (long range dipole dipole tensor).
>
> I'm trying to split up the computation of anywhere between 30k and 15m
> individual dipole-dipole tensors, and I figured that I'd use a struct to help
> with the bookkeeping. It appears that having an allocatable tensor (dTLRdr)
> throws a wrench in the whole thing. When I looked more into the memory layout
> with TotalView, it looks like the various dTLRdr arrays aren't even contiguously
> connected with the rest of the array! This is supported by the fact that my
> offsets for each dTLRdr (offset of tlr_lookup(1)%dTLRdr,
> tlr_lookup(2)%dTLRdr, tlr_lookup(3)%dTLRdr) are all different. Unfortunately, I'm
> not totally sure that it's possible to do what I was trying to do. I'm going
> to try a different strategy that sidesteps this issue of derived data types,
> though. Perhaps that'll help.
>
> Best,
> Thomas
>
> On Sun, Mar 15, 2015 at 9:00 PM George Bosilca <bosi...@icl.utk.edu> wrote:
> Thomas,
>
> What exactly is 'local_tlr_lookup(1)%wtlr'?
>
> I think the problem is that your MPI derived datatype uses the pointer to the
> allocatable arrays instead of the pointer to the first element of these
> arrays. As an example, instead of doing
>     call mpi_get_address(local_tlr_lookup(1)%wtlr, offsets(3), ierr)
> I would go for
>     call mpi_get_address(local_tlr_lookup(1)%TLR(1,1), offsets(3), ierr)
>
> Then you don't have to subtract offsets(1) from the others; instead you can
> go for absolute addressing. Unfortunately, this approach is not compatible
> with sending multiple structures (i.e. using a count != 1), simply because
> each struct might have different displacements for the internal allocatable
> arrays.
>
>   George.
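To make the absolute-addressing idea in the reply quoted just above concrete, here is a minimal C sketch. It is only an illustration under assumed names: elem_t, its fields, and the make_abs_type helper are placeholders rather than the actual tlr_lut type; the same pattern carries over to Fortran through mpi_get_address and MPI_BOTTOM.

    #include <mpi.h>

    /* Placeholder struct mimicking the Fortran type: one fixed field plus one
     * heap-allocated (allocatable-like) field of run-time length n. */
    typedef struct {
        int     p;
        double *dTLRdr;
    } elem_t;

    /* Build a datatype with ABSOLUTE displacements for one particular element.
     * Because the heap address differs from element to element, the datatype is
     * only valid for this element and must be used with MPI_BOTTOM and count 1. */
    static MPI_Datatype make_abs_type(elem_t *e, int n)
    {
        int          bl[2]    = { 1, n };
        MPI_Datatype types[2] = { MPI_INT, MPI_DOUBLE };
        MPI_Aint     disp[2];
        MPI_Datatype dt;

        MPI_Get_address(&e->p, &disp[0]);
        MPI_Get_address(&e->dTLRdr[0], &disp[1]);  /* address of the data, not of the pointer */

        MPI_Type_create_struct(2, bl, disp, types, &dt);
        MPI_Type_commit(&dt);
        return dt;
    }

    /* usage (count must stay 1, since another element's dTLRdr lives elsewhere):
     *   MPI_Datatype dt = make_abs_type(&e, n);
     *   MPI_Send(MPI_BOTTOM, 1, dt, dest, tag, MPI_COMM_WORLD);
     *   MPI_Type_free(&dt);
     */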
>
> On Sun, Mar 15, 2015 at 3:45 PM, Thomas Markovich <thomasmarkov...@gmail.com> wrote:
> Hi All,
>
> I'm trying to parallelize my code by distributing the computation of the
> various elements of a lookup table and then syncing that lookup table across
> all nodes. To make the code easier to read, and to keep track of everything
> more easily, I've decided to use a derived data type in Fortran, defined as
> follows:
>
>     type tlr_lut
>        sequence
>        integer p
>        integer q
>        real(dp), dimension(3, 3) :: TLR
>        real(dp), dimension(:, :, :, :), allocatable :: dTLRdr
>        real(dp), dimension(3, 3, 3, 3) :: dTLRdh
>        integer unique_ind
>     end type tlr_lut
>
> and this works quite well in serial. I just have to allocate dTLRdr at run
> time, because dTLRdr should be of size 3x3xNx3, where N is a constant known at
> run time but not at compile time. I've tried to create a custom data type to
> tell Open MPI what the size should be, but I'm at a loss for how to deal with
> the allocatable array. I've tried something like this:
>
>     type(tlr_lut), dimension(:), allocatable :: tlr_lookup, temp_tlr_lookup
>     type(tlr_lut), dimension(:), allocatable :: local_tlr_lookup
>     integer :: datatype, oldtypes(6), blockcounts(6)
>     integer(kind=MPI_ADDRESS_KIND) :: offsets(6)
>     integer :: numtasks, rank, i, ierr
>     integer :: n, status(mpi_status_size)
>
>     do i = 1, num_pairs, 1
>        p = unique_pairs(i)%p
>        q = unique_pairs(i)%q
>        cpuid = unique_pairs(i)%cpu
>        if(cpuid.eq.me_image) then
>           TLR = 0.0_DP
>           dTLRdr = 0.0_DP
>           dTLRdh = 0.0_DP
>           call mbdvdw_TLR(p, q, TLR, dTLRdr, dTLRdh)
>           if(.not.allocated(local_tlr_lookup(counter)%dTLRdr)) &
>              allocate(local_tlr_lookup(counter)%dTLRdr(3, 3, nat, 3))
>           local_tlr_lookup(counter)%p = p
>           local_tlr_lookup(counter)%q = q
>           local_tlr_lookup(counter)%TLR(:, :) = TLR(:, :)
>           local_tlr_lookup(counter)%dTLRdr(:,:,:,:) = dTLRdR(:,:,:,:)
>           local_tlr_lookup(counter)%dTLRdh(:,:,:,:) = dTLRdh(:,:,:,:)
>        end if
>     end do
>
>     call mpi_get_address(local_tlr_lookup(1)%p, offsets(1), ierr)
>     call mpi_get_address(local_tlr_lookup(1)%q, offsets(2), ierr)
>     call mpi_get_address(local_tlr_lookup(1)%wtlr, offsets(3), ierr)
>     call mpi_get_address(local_tlr_lookup(1)%wdtlrdr, offsets(4), ierr)
>     call mpi_get_address(local_tlr_lookup(1)%wdtlrdh, offsets(5), ierr)
>     call mpi_get_address(local_tlr_lookup(1)%unique_ind, offsets(6), ierr)
>
>     do i = 2, size(offsets)
>        offsets(i) = offsets(i) - offsets(1)
>     end do
>     offsets(1) = 0
>
>     oldtypes = (/mpi_integer, mpi_integer, mpi_real, mpi_real, mpi_real, &
>                  mpi_integer/)
>     blockcounts = (/1, 1, 3*3, 3*3*nat*3, 3*3*3*3, 1/)
>
> But it didn't seem to work and I'm sorta at a loss. Any suggestions?
>
> Best,
> Thomas
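For completeness, here is the fuller, self-contained version of the resized-struct sketch referred to above. It is only an illustrative program under assumed values: NMAX, NB_ELEMENTS, and the elem() helper are placeholders (the real code would use 3*3*nat*3 and the actual element count); only the shape of the MPI calls is meant to carry over. Run with at least two ranks.

    #include <mpi.h>
    #include <stdlib.h>
    #include <string.h>

    #define NMAX        (3*3*8*3)   /* placeholder for 3*3*nat*3, here with nat = 8 */
    #define NB_ELEMENTS 100         /* placeholder element count */

    typedef struct my_t {
        int    a;      /* fixed-size part first          */
        double d[1];   /* variable-size part last (tail) */
    } my_t;

    /* Each element occupies elem_size bytes, so element i starts at i*elem_size. */
    static size_t elem_size;

    static my_t *elem(my_t *base, int i)
    {
        return (my_t *)((char *)base + (size_t)i * elem_size);
    }

    int main(int argc, char **argv)
    {
        int rank;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        elem_size = sizeof(my_t) + NMAX * sizeof(double);
        my_t *elements = malloc(elem_size * NB_ELEMENTS);

        /* Describe one element: one int at the offset of a, NMAX doubles at the offset of d. */
        int          bl[2]    = { 1, NMAX };
        MPI_Datatype types[2] = { MPI_INT, MPI_DOUBLE };
        MPI_Aint     base, disp[2];
        MPI_Get_address(elements, &base);
        MPI_Get_address(&elements->a, &disp[0]);
        MPI_Get_address(&elements->d[0], &disp[1]);
        disp[0] -= base;
        disp[1] -= base;

        MPI_Datatype bstr, my_mpi_dt;
        MPI_Type_create_struct(2, bl, disp, types, &bstr);
        /* Resize so that consecutive elements are elem_size bytes apart;
         * a single send with a count then covers the whole table. */
        MPI_Type_create_resized(bstr, 0, (MPI_Aint)elem_size, &my_mpi_dt);
        MPI_Type_commit(&my_mpi_dt);

        if (rank == 0) {
            for (int i = 0; i < NB_ELEMENTS; i++) {
                elem(elements, i)->a = i;
                memset(elem(elements, i)->d, 0, NMAX * sizeof(double));
            }
            MPI_Send(elements, NB_ELEMENTS, my_mpi_dt, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            MPI_Recv(elements, NB_ELEMENTS, my_mpi_dt, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        }

        MPI_Type_free(&my_mpi_dt);
        MPI_Type_free(&bstr);
        free(elements);
        MPI_Finalize();
        return 0;
    }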