Hi Gilles,

Thank you for the follow-up. I appreciate the discussion and am glad that you will put this on your agenda.
Jim

On Wed, Jan 13, 2016 at 5:28 PM, Gilles Gouaillardet <gil...@rist.or.jp> wrote:
> Jim,
>
> your initial question was
>
> "i think that this is a bug in open-mpi - would you agree?"
>
> and so far, the answer is
>
> "we disagree; this is not an Open MPI bug, this is the MPI 3.1 standard."
>
> and your last question was
>
> "Can you make any argument in support of not allowing it (other than that's
> the way you've interpreted the standard)?"
>
> one point was made against supporting MPI_DATATYPE_NULL with a zero count
> on the MPI Forum mailing list:
> changing this is not backward compatible, since it has the potential to
> break existing tools that correctly assume (at least for now) that a
> datatype *cannot* be MPI_DATATYPE_NULL.
>
> btw, Intel MPI (impi) is mpich based, so it is very likely this kind of
> fairly high-level behavior is handled the same way.
>
> Jeff,
>
> Open MPI does not allow MPI_DATATYPE_NULL, and from a performance point of
> view, that check is a pointer comparison. at first glance, allowing
> MPI_DATATYPE_NULL *if and only if* the count is zero looks more CPU
> intensive.
>
> George and all,
>
> back to Open MPI, now the question is:
>
> "Is Open MPI going to be updated (and when) in order to support an
> intuitive and user-friendly feature that is currently explicitly
> prohibited by the MPI 3.1 standard, but that might be part of the MPI-4
> standard, and that we already know is not backward compatible (*)?
> (*) fwiw, mpich already "implements" this, so backward incompatibility
> would only affect tools currently working with Open MPI but not with
> mpich."
>
> i am a pragmatic guy, so i'd rather go for it, but here is what i am going
> to do:
>
> unless George vetoes that, i will add this topic to the weekly call
> agenda and wait for the community to make a decision
> (e.g. go / no go, and a milestone if needed: 1.10 series? 2.0? 2.1?
> master only?)
>
> Cheers,
>
> Gilles
>
> On 1/14/2016 12:23 AM, Jeff Hammond wrote:
> Bill Gropp's statement on the Forum list is clear: null handles cannot be
> used unless explicitly permitted. Unfortunately, there is no exception for
> MPI_DATATYPE_NULL when count=0. Hopefully, we will add one in MPI-4.
>
> While your usage model is perfectly reasonable to me and something that I
> would do in the same position, you need to use e.g. MPI_BYTE instead to
> comply with the current MPI standard.
>
> As to why Open MPI wastes CPU cycles testing for datatype validity when
> count=0, that is a question for someone else to answer. Implementations
> have no obligation to enforce every letter of the MPI standard.
>
> Jeff
>
> On Wed, Jan 13, 2016 at 6:11 AM, Jim Edwards <jedwa...@ucar.edu> wrote:
>> It seems to me that when there is a question of interpretation of the
>> standard, one should ask about the consequences of each potential
>> interpretation. It just makes sense that MPI_DATATYPE_NULL should be
>> allowed when the count is 0; otherwise you need to insert some random
>> datatype just to fill the array.
>>
>> Can you make any argument in support of not allowing it (other than
>> that's the way you've interpreted the standard)?
>>
>> On Tue, Jan 12, 2016 at 10:44 PM, Gilles Gouaillardet
>> <gilles.gouaillar...@gmail.com> wrote:
>>
>>> Thanks Jeff,
>>>
>>> i found it at http://lists.mpi-forum.org/mpi-forum/2016/01/3152.php
>>>
>>> i'd like to reiterate what i wrote earlier about example 4.23:
>>> MPI_DATATYPE_NULL is used as a recv type on non-root tasks,
>>> and per the MPI 3.1 standard, the recv type is "significant only at
>>> root".
>>>
>>> in the case of MPI_Gatherv, MPI_DATATYPE_NULL is *not* significant,
>>> but in the case of MPI_Alltoallw, it *is* significant.
>>>
>>> as far as i am concerned, and to say the least, these are two distinct
>>> shades of grey.
>>>
>>> IMHO, it would be more intuitive if the use of MPI_DATATYPE_NULL were
>>> allowed with a zero count, in both MPI_Alltoallw *and* MPI_Sendrecv.
>>>
>>> i still believe George's interpretation is the correct one, and Bill
>>> Gropp agreed with him.
>>>
>>> and btw, is example 4.23 correct?
>>> /* fwiw, i did copy/paste it and found several missing local variables:
>>> myrank, i, and comm;
>>> and i'd rather have MPI_COMM_WORLD than comm */
>>>
>>> and what if recvcount is negative on a non-root task?
>>> should it be an error (negative int) or not (not significant value)?
>>>
>>> Cheers,
>>>
>>> Gilles
>>>
>>> On Wed, Jan 13, 2016 at 2:15 PM, Jeff Hammond <jeff.scie...@gmail.com>
>>> wrote:
>>> > There's a thread about this on the MPI Forum mailing list already ;-)
>>> >
>>> > Jeff
>>> >
>>> > On Tuesday, January 12, 2016, Gilles Gouaillardet <gil...@rist.or.jp>
>>> wrote:
>>> >>
>>> >> Jim,
>>> >>
>>> >> if i understand correctly, George's point is that Open MPI is
>>> >> currently correct with respect to the MPI standard:
>>> >> MPI_DATATYPE_NULL is *not* a predefined datatype, hence it is not
>>> >> (expected to be) a committed datatype,
>>> >> and hence it cannot be used in MPI_Alltoallw (regardless of whether
>>> >> the corresponding count is zero).
>>> >>
>>> >> another way to put this is that mpich could/should have failed,
>>> >> and/or you were lucky it worked.
>>> >>
>>> >> George and Jeff,
>>> >>
>>> >> do you feel any need to ask the MPI Forum to clarify this point?
>>> >>
>>> >> Cheers,
>>> >>
>>> >> Gilles
>>> >>
>>> >> On 1/13/2016 12:14 PM, Jim Edwards wrote:
>>> >>
>>> >> Sorry, there was a mistake in that code:
>>> >> stypes and rtypes should be of type MPI_Datatype, not integer;
>>> >> however, the result is the same.
>>> >>
>>> >> *** An error occurred in MPI_Alltoallw
>>> >> *** reported by process [204406785,1]
>>> >> *** on communicator MPI_COMM_WORLD
>>> >> *** MPI_ERR_TYPE: invalid datatype
>>> >>
>>> >> On Tue, Jan 12, 2016 at 7:55 PM, Jim Edwards <jedwa...@ucar.edu>
>>> >> wrote:
>>> >>>
>>> >>> Maybe the example is too simple. Here is another one which,
>>> >>> when run on two tasks, sends two integers from each task to
>>> >>> task 0. Task 1 receives nothing. This works with mpich and impi
>>> >>> but fails with openmpi.
>>> >>>
>>> >>> #include <stdio.h>
>>> >>> #include <mpi.h>
>>> >>>
>>> >>> void my_mpi_test(int rank, int ntasks)
>>> >>> {
>>> >>>   int sbuf[2];
>>> >>>   int rbuf[4];
>>> >>>   int slen[ntasks], sdisp[ntasks], rlen[ntasks], rdisp[ntasks];
>>> >>>   MPI_Datatype stypes[ntasks], rtypes[ntasks];
>>> >>>
>>> >>>   sbuf[0] = rank;
>>> >>>   sbuf[1] = ntasks + rank;
>>> >>>   slen[0] = 2;
>>> >>>   slen[1] = 0;
>>> >>>   stypes[0] = MPI_INT;
>>> >>>   stypes[1] = MPI_DATATYPE_NULL;
>>> >>>   sdisp[0] = 0;
>>> >>>   sdisp[1] = 4;
>>> >>>   if (rank == 0) {
>>> >>>     rlen[0] = 2;
>>> >>>     rlen[1] = 2;
>>> >>>     rtypes[0] = MPI_INT;
>>> >>>     rtypes[1] = MPI_INT;
>>> >>>     rdisp[0] = 0;
>>> >>>     rdisp[1] = 8;
>>> >>>   } else {
>>> >>>     rlen[0] = 0;
>>> >>>     rlen[1] = 0;
>>> >>>     rtypes[0] = MPI_DATATYPE_NULL;
>>> >>>     rtypes[1] = MPI_DATATYPE_NULL;
>>> >>>     rdisp[0] = 0;
>>> >>>     rdisp[1] = 0;
>>> >>>   }
>>> >>>
>>> >>>   MPI_Alltoallw(sbuf, slen, sdisp, stypes, rbuf, rlen, rdisp,
>>> >>>                 rtypes, MPI_COMM_WORLD);
>>> >>>   if (rank == 0) {
>>> >>>     printf("%d %d %d %d\n", rbuf[0], rbuf[1], rbuf[2], rbuf[3]);
>>> >>>   }
>>> >>> }
>>> >>>
>>> >>> int main(int argc, char **argv)
>>> >>> {
>>> >>>   int rank, ntasks;
>>> >>>
>>> >>>   MPI_Init(&argc, &argv);
>>> >>>   MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>>> >>>   MPI_Comm_size(MPI_COMM_WORLD, &ntasks);
>>> >>>
>>> >>>   printf("rank %d ntasks %d\n", rank, ntasks);
>>> >>>
>>> >>>   my_mpi_test(rank, ntasks);
>>> >>>
>>> >>>   MPI_Finalize();
>>> >>>   return 0;
>>> >>> }
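[For reference, Jeff's suggested workaround above - use MPI_BYTE instead of MPI_DATATYPE_NULL wherever the count is zero - can be sketched against this example. This is a sketch, not a tested program; it assumes a working MPI installation and, like the original, exactly two tasks:]

```c
#include <stdio.h>
#include <mpi.h>

/* Same exchange as the example above, but every zero-count slot uses
 * MPI_BYTE (a valid predefined datatype) rather than MPI_DATATYPE_NULL,
 * as Jeff suggests for compliance with the MPI 3.1 standard.
 * Run on two tasks: each task sends two ints to task 0; task 1
 * receives nothing. */
int main(int argc, char **argv)
{
    int rank, ntasks;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &ntasks);

    int sbuf[2] = { rank, ntasks + rank };
    int rbuf[4] = { 0, 0, 0, 0 };

    int slen[2]  = { 2, 0 };
    int sdisp[2] = { 0, 4 };                         /* displacements in bytes */
    MPI_Datatype stypes[2] = { MPI_INT, MPI_BYTE };  /* MPI_BYTE with count 0   */

    int rlen[2], rdisp[2];
    MPI_Datatype rtypes[2];
    if (rank == 0) {
        rlen[0] = 2;   rlen[1] = 2;
        rdisp[0] = 0;  rdisp[1] = 8;
        rtypes[0] = MPI_INT;  rtypes[1] = MPI_INT;
    } else {
        rlen[0] = 0;   rlen[1] = 0;
        rdisp[0] = 0;  rdisp[1] = 0;
        rtypes[0] = MPI_BYTE; rtypes[1] = MPI_BYTE;  /* valid even with count 0 */
    }

    MPI_Alltoallw(sbuf, slen, sdisp, stypes, rbuf, rlen, rdisp, rtypes,
                  MPI_COMM_WORLD);

    if (rank == 0)
        printf("%d %d %d %d\n", rbuf[0], rbuf[1], rbuf[2], rbuf[3]);

    MPI_Finalize();
    return 0;
}
```

[Compiled with mpicc and run as `mpirun -n 2 ./a.out`, this variant should avoid the MPI_ERR_TYPE abort under Open MPI, since MPI_BYTE passes the datatype-validity check regardless of count.]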
>>> >>
>>> >> --
>>> >> Jim Edwards
>>> >> CESM Software Engineer
>>> >> National Center for Atmospheric Research
>>> >> Boulder, CO
>>> >>
>>> >> _______________________________________________
>>> >> users mailing list
>>> >> us...@open-mpi.org
>>> >> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>>> >> Link to this post: http://www.open-mpi.org/community/lists/users/2016/01/28258.php
>>> >
>>> > --
>>> > Jeff Hammond
>>> > jeff.scie...@gmail.com
>>> > http://jeffhammond.github.io/
--
Jim Edwards
CESM Software Engineer
National Center for Atmospheric Research
Boulder, CO