I would argue this is a typical user-level bug. The major difference between MPI_Dist_graph_create and MPI_Dist_graph_create_adjacent is that in the latter each process provides its neighbors in the order it expects (and that matches the arrays passed to the MPI_Neighbor_alltoallw call). When the topology is created with MPI_Dist_graph_create, every process still ends up with the correct partial topology, but possibly in an order that doesn't match what the user expected (not necessarily the rank order of the neighbors). However, I can't find anything in the standard that would require the MPI library to sort the neighbors. I would assume it is the user's responsibility to make sure they use the topology in the right order, where the right order is what the communicator really contains and not what the user expects based on prior knowledge.
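Something along these lines (a rough sketch, not taken from the attached test case; comm_graph and the helper name are mine) lets the application discover the order the communicator really contains before building the arguments for MPI_Neighbor_alltoallw:

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

/* Rough sketch: query the neighbor order a distributed graph communicator
 * actually contains.  comm_graph is assumed to have been created with
 * MPI_Dist_graph_create. */
static void show_neighbor_order(MPI_Comm comm_graph)
{
    int indegree, outdegree, weighted, rank, i;
    MPI_Comm_rank(comm_graph, &rank);
    MPI_Dist_graph_neighbors_count(comm_graph, &indegree, &outdegree, &weighted);

    int *sources    = malloc(indegree  * sizeof(int));
    int *srcweights = malloc(indegree  * sizeof(int));
    int *dests      = malloc(outdegree * sizeof(int));
    int *dstweights = malloc(outdegree * sizeof(int));

    MPI_Dist_graph_neighbors(comm_graph, indegree, sources, srcweights,
                             outdegree, dests, dstweights);

    /* The i-th entry of the count/displacement/type arrays given to
     * MPI_Neighbor_alltoallw must correspond to sources[i] on the receive
     * side and dests[i] on the send side, whatever order the library chose. */
    for (i = 0; i < outdegree; i++)
        printf("rank %d sends to neighbor %d (weight %d) at position %d\n",
               rank, dests[i], dstweights[i], i);

    free(sources); free(srcweights); free(dests); free(dstweights);
}

With MPI_Dist_graph_create_adjacent this step is unnecessary, since the adjacency lists are kept in the order the caller provided them.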
George.

On Fri, Nov 21, 2014 at 3:48 AM, Gilles Gouaillardet <gilles.gouaillar...@iferc.org> wrote:

> Ghislain,
>
> i can confirm there is a bug in mca_topo_base_dist_graph_distribute
>
> FYI a proof of concept is available at
> https://github.com/open-mpi/ompi/pull/283
> and i recommend you use MPI_Dist_graph_create_adjacent if this meets your needs.
>
> as a side note, the right way to set the info is
> MPI_Info info = MPI_INFO_NULL;
> /* mpich is more picky and crashes with info = NULL */
>
> Cheers,
>
> Gilles
>
> On 2014/11/21 18:21, Ghislain Viguier wrote:
> Hi Gilles and Howard,
>
> The use of MPI_Dist_graph_create_adjacent solves the issue :)
>
> Thanks for your help!
>
> Best regards,
> Ghislain
>
> 2014-11-21 7:23 GMT+01:00 Gilles Gouaillardet <gilles.gouaillar...@iferc.org>:
> Hi Ghislain,
>
> that sounds like a bug in MPI_Dist_graph_create :-(
>
> you can use MPI_Dist_graph_create_adjacent instead:
>
> MPI_Dist_graph_create_adjacent(MPI_COMM_WORLD, degrees, &targets[0], &weights[0],
>                                degrees, &targets[0], &weights[0], info,
>                                rankReordering, &commGraph);
>
> it does not crash and, as far as i understand, it produces correct results.
>
> according to the MPI standard (example 7.3) that should do the same thing,
> that's why i think there is a bug in MPI_Dist_graph_create
>
> Cheers,
>
> Gilles
>
> On 2014/11/21 2:21, Howard Pritchard wrote:
> Hi Ghislain,
>
> I tried to run your test with mvapich 1.9 and get a "message truncated"
> failure at three ranks.
>
> Howard
>
> 2014-11-20 8:51 GMT-07:00 Ghislain Viguier <ghislain.vigu...@gmail.com>:
> Dear support,
>
> I'm encountering an issue with the MPI_Neighbor_alltoallw request of mpi-1.8.3.
> I have enclosed a test case with information about my workstation.
>
> In this test, I define a weighted topology for 5 processes, where the
> weights represent the number of buffers to send/receive:
>
> rank
> 0 : | x |
> 1 : | 2 | x |
> 2 : | 1 | 1 | x |
> 3 : | 3 | 2 | 3 | x |
> 4 : | 5 | 2 | 2 | 2 | x |
>
> In this topology, rank 1 will send/receive:
> 2 buffers to/from rank 0,
> 1 buffer to/from rank 2,
> 2 buffers to/from rank 3,
> 2 buffers to/from rank 4.
>
> The send buffers are defined with MPI_Type_create_hindexed_block. This makes
> it possible to use the same buffer for several communications without
> duplicating it (read only).
> Here rank 1 will have 2 send buffers (the max of 2, 1, 2, 2).
> The receive buffer is a contiguous buffer defined with MPI_Type_contiguous.
> Here, the receive buffer of rank 1 is of size 7 (2+1+2+2).
>
> This test case is successful for 2 or 3 processes. For 4 processes, the test
> fails about once for every 3 successes. For 5 processes, the test fails all
> the time.
>
> The error code is: *** MPI_ERR_IN_STATUS: error code in status
>
> I don't understand what I am doing wrong.
>
> Could you please have a look at it?
>
> Thank you very much.
>
> Best regards,
> Ghislain Viguier
>
> --
> Ghislain Viguier
> Tél. 06 31 95 03 17
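For reference, here is a minimal sketch of the datatype setup described in the quoted report, written from rank 1's point of view in the 5-process case. The exchange() helper, the use of MPI_DOUBLE, and the buffer contents are assumptions, not taken from the attached test case; the important point is that neigh_weights must follow the neighbor order the communicator itself reports.

#include <mpi.h>

/* Hypothetical setup for rank 1: neighbors 0, 2, 3, 4 with weights 2, 1, 2, 2,
 * listed in the order reported by MPI_Dist_graph_neighbors. */
enum { NNEIGH = 4, MAXW = 2, RECVLEN = 7 };
static const int neigh_weights[NNEIGH] = { 2, 1, 2, 2 };

static void exchange(MPI_Comm comm_graph)
{
    double sendbuf[MAXW] = { 1.0, 2.0 };   /* one buffer reused for all sends */
    double recvbuf[RECVLEN];               /* 2 + 1 + 2 + 2 contiguous slots  */

    int          sendcounts[NNEIGH], recvcounts[NNEIGH];
    MPI_Aint     sdispls[NNEIGH], rdispls[NNEIGH];
    MPI_Datatype sendtypes[NNEIGH], recvtypes[NNEIGH];

    MPI_Aint roffset = 0;
    for (int i = 0; i < NNEIGH; i++) {
        /* Send type: the first neigh_weights[i] entries of sendbuf,
         * described with byte displacements so the buffer is shared. */
        MPI_Aint displs[MAXW] = { 0, (MPI_Aint)sizeof(double) };
        MPI_Type_create_hindexed_block(neigh_weights[i], 1, displs,
                                       MPI_DOUBLE, &sendtypes[i]);
        MPI_Type_commit(&sendtypes[i]);
        sendcounts[i] = 1;
        sdispls[i]    = 0;                 /* every send starts at sendbuf */

        /* Receive type: neigh_weights[i] contiguous doubles in recvbuf. */
        MPI_Type_contiguous(neigh_weights[i], MPI_DOUBLE, &recvtypes[i]);
        MPI_Type_commit(&recvtypes[i]);
        recvcounts[i] = 1;
        rdispls[i]    = roffset;           /* byte offset into recvbuf */
        roffset += neigh_weights[i] * (MPI_Aint)sizeof(double);
    }

    MPI_Neighbor_alltoallw(sendbuf, sendcounts, sdispls, sendtypes,
                           recvbuf, recvcounts, rdispls, recvtypes, comm_graph);

    for (int i = 0; i < NNEIGH; i++) {
        MPI_Type_free(&sendtypes[i]);
        MPI_Type_free(&recvtypes[i]);
    }
}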