To follow up for the web archives: we fixed this bug off-list. The fix will be included in 1.6.5 and (likely) 1.7.2.
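
For archive readers wondering why only the debug build aborts, here is a minimal, self-contained C++ sketch of the check quoted further down in this thread. It is not Open MPI code: `ModelDatatype`, `Check`, `classify` and the flag values are hypothetical stand-ins. The idea that a zero-element indexed type ends up with size 0 and lb equal to ub, while flagged contiguous but not gap-free, is an assumption that this thread does not confirm.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for the Open MPI flags; the values are illustrative only.
constexpr std::uint32_t FLAG_CONTIGUOUS = 0x1;
constexpr std::uint32_t FLAG_NO_GAPS    = 0x2;

// Reduced model of the fields that the quoted function reads.
struct ModelDatatype {
    std::uint32_t  flags;
    std::size_t    size;  // number of bytes of actual data
    std::ptrdiff_t lb;    // lower bound
    std::ptrdiff_t ub;    // upper bound
};

// Which branch of the quoted function a (datatype, count) pair reaches.
enum class Check { NotContiguous, Contiguous, AssertHolds, AssertFails };

Check classify(const ModelDatatype& dt, std::int32_t count) {
    // First early return of the original: not flagged contiguous.
    if (!(dt.flags & FLAG_CONTIGUOUS)) return Check::NotContiguous;
    // Second early return: a single element, or no gaps between elements.
    if (count == 1 || (dt.flags & FLAG_NO_GAPS)) return Check::Contiguous;
    // The original asserts size != (ub - lb) here; this model reports the
    // outcome of that condition instead of aborting.
    const bool holds = static_cast<std::ptrdiff_t>(dt.size) != (dt.ub - dt.lb);
    return holds ? Check::AssertHolds : Check::AssertFails;
}
```

Under these assumptions, `classify({FLAG_CONTIGUOUS, 0, 0, 0}, 0)` yields `Check::AssertFails`, because size and extent are both zero. Since `assert` expands to nothing when `NDEBUG` is defined, a build without debugging would presumably just fall through to `return 0`, which would match the report that the non-debug build runs fine.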

On Apr 5, 2013, at 3:18 PM, Eric Chamberland <eric.chamberl...@giref.ulaval.ca> wrote:

> Hi again,
>
> I have attached a very small example which raises the assertion.
>
> The problem arises from a process which does not have any element to
> write in the file (and then in the MPI_File_set_view)...
>
> You can see this "bug" with openmpi 1.6.3, 1.6.4 and 1.7.0 configured with:
>
> ./configure --enable-mem-debug --enable-mem-profile --enable-memchecker
> --with-mpi-param-check --enable-debug
>
> Just compile the given example (idx_null.cc) as-is with
>
> mpicxx -o idx_null idx_null.cc
>
> and run with 3 processes:
>
> mpirun -n 3 idx_null
>
> You can modify the example by commenting out "#define WITH_ZERO_ELEMNT_BUG" to
> see that everything goes well when all processes have something to write.
>
> There is no "bug" if you use openmpi 1.6.3 (and higher) without the debugging
> options.
>
> Also, all is working well with mpich-3.0.3 configured with:
>
> ./configure --enable-g=yes
>
> So, is this a wrong "assert" in openmpi?
>
> Is there a real problem with using this code in "release" mode?
>
> Thanks,
>
> Eric
>
> On 04/05/2013 12:57 PM, Eric Chamberland wrote:
>> Hi all,
>>
>> I have a well-working (large) code which is using openmpi 1.6.3 (see
>> config.log here:
>> http://www.giref.ulaval.ca/~ericc/bug_openmpi/config.log_nodebug)
>>
>> (I have used it successfully for reading with MPI I/O on over 1500 procs
>> with very large files)
>>
>> However, when I use openmpi compiled with "debug" options:
>>
>> ./configure --enable-mem-debug --enable-mem-profile --enable-memchecker
>> --with-mpi-param-check --enable-debug --prefix=/opt/openmpi-1.6.3_debug
>>
>> (see the other config.log here:
>> http://www.giref.ulaval.ca/~ericc/bug_openmpi/config.log_debug) the code
>> aborts with an assertion on a very small example on 2 processors.
>> (the same very small example works well without the debug mode)
>>
>> Here is the assertion causing the abort:
>>
>> ===================================
>>
>> openmpi-1.6.3/opal/datatype/opal_datatype.h:
>>
>> static inline int32_t
>> opal_datatype_is_contiguous_memory_layout( const opal_datatype_t* datatype, int32_t count )
>> {
>>     if( !(datatype->flags & OPAL_DATATYPE_FLAG_CONTIGUOUS) ) return 0;
>>     if( (count == 1) || (datatype->flags & OPAL_DATATYPE_FLAG_NO_GAPS) ) return 1;
>>
>>     /* This is the assertion: */
>>
>>     assert( (OPAL_PTRDIFF_TYPE)datatype->size != (datatype->ub - datatype->lb) );
>>
>>     return 0;
>> }
>>
>> ===================================
>>
>> Can anyone tell me what this means?
>>
>> It happens while writing a file with MPI I/O when I am calling
>> "MPI_File_set_view" for the fourth time... with different types of
>> MPI_Datatype created with "MPI_Type_indexed".
>>
>> I am trying to reproduce the bug with a very small example to be sent
>> here, but if anyone has a hint to give me...
>> (I would like: this assert is not good! just ignore it ;-) )
>>
>> Thanks,
>>
>> Eric
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
> <idx_null.cc>

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/