To follow up for the web archives...

We fixed this bug off-list.  It will be included in 1.6.5 and (likely) 1.7.2.
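
For readers landing here from the archives, here is a minimal sketch of the condition checked by the assertion quoted below. It is not the Open MPI code and not the fix (which was made off-list): `ToyDatatype`, the flag constants, and `assertion_would_fire` are hypothetical stand-ins, and the idea that a zero-element indexed type ends up flagged contiguous with size 0 and extent 0 is only an assumption about how the reported case could trip the check.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for the Open MPI internals quoted in this thread.
// The real opal_datatype_t has many more fields; these flag values are arbitrary.
constexpr std::uint16_t FLAG_CONTIGUOUS = 0x1;
constexpr std::uint16_t FLAG_NO_GAPS    = 0x2;

struct ToyDatatype {
    std::uint16_t  flags;  // combination of the FLAG_* bits above
    std::size_t    size;   // number of bytes of actual data
    std::ptrdiff_t lb;     // lower bound of the type
    std::ptrdiff_t ub;     // upper bound of the type
};

// Returns true when the debug-only assert in the quoted function would fail,
// i.e. the type is flagged contiguous, is not caught by the early return,
// and its size equals its extent (ub - lb).
bool assertion_would_fire(const ToyDatatype& dt, std::int32_t count) {
    if (!(dt.flags & FLAG_CONTIGUOUS)) return false;            // quoted code returns 0 here
    if (count == 1 || (dt.flags & FLAG_NO_GAPS)) return false;  // quoted code returns 1 here
    return static_cast<std::ptrdiff_t>(dt.size) == (dt.ub - dt.lb);
}

// ASSUMPTION: a type built from zero elements has size 0, lb 0, ub 0 and is
// marked contiguous without the no-gaps flag. This is a guess, not a fact
// taken from the Open MPI source.
ToyDatatype make_assumed_empty_type() {
    return ToyDatatype{FLAG_CONTIGUOUS, 0, 0, 0};
}
```

Under that assumption, a rank with nothing to write would hit the assert whenever count is not 1, because 0 equals 0 - 0. In a build where the assert is compiled out the function would simply return 0, which is consistent with the report below that non-debug builds run fine.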


On Apr 5, 2013, at 3:18 PM, Eric Chamberland <eric.chamberl...@giref.ulaval.ca> 
wrote:

> Hi again,
> 
> I have attached a very small example which raises the assertion.
> 
> The problem arises from a process which does not have any element to 
> write to the file (and therefore none to describe in MPI_File_set_view)...
> 
> You can see this "bug" with openmpi 1.6.3, 1.6.4 and 1.7.0 configured with:
> 
> ./configure --enable-mem-debug --enable-mem-profile --enable-memchecker --with-mpi-param-check --enable-debug
> 
> Just compile the given example (idx_null.cc) as-is with
> 
> mpicxx -o idx_null idx_null.cc
> 
> and run with 3 processes:
> 
> mpirun -n 3 idx_null
> 
> You can modify the example by commenting out "#define WITH_ZERO_ELEMNT_BUG" to 
> see that everything works when all processes have something to write.
> 
> There is no "bug" if you use openmpi 1.6.3 (and higher) without the debugging 
> options.
> 
> Also, everything works with mpich-3.0.3 configured with:
> 
> ./configure --enable-g=yes
> 
> 
> So, is this an incorrect "assert" in openmpi?
> 
> Is there a real problem with using this code in "release" mode?
> 
> Thanks,
> 
> Eric
> 
> On 04/05/2013 12:57 PM, Eric Chamberland wrote:
>> Hi all,
>> 
>> I have a (large) code that works well with openmpi 1.6.3 (see
>> config.log here:
>> http://www.giref.ulaval.ca/~ericc/bug_openmpi/config.log_nodebug)
>> 
>> (I have used it for reading with MPI I/O with success over 1500 procs
>> with very large files)
>> 
>> However, when I use openmpi compiled with "debug" options:
>> 
>> ./configure --enable-mem-debug --enable-mem-profile --enable-memchecker --with-mpi-param-check --enable-debug --prefix=/opt/openmpi-1.6.3_debug
>> (see the other config.log here:
>> http://www.giref.ulaval.ca/~ericc/bug_openmpi/config.log_debug) the code
>> is aborting with an assertion on a very small example on 2 processors.
>> (the same very small example is working well without the debug mode)
>> 
>> Here is the assertion causing an abort:
>> 
>> ===================================
>> 
>> openmpi-1.6.3/opal/datatype/opal_datatype.h:
>> 
>> static inline int32_t
>> opal_datatype_is_contiguous_memory_layout( const opal_datatype_t* datatype, int32_t count )
>> {
>>     if( !(datatype->flags & OPAL_DATATYPE_FLAG_CONTIGUOUS) ) return 0;
>>     if( (count == 1) || (datatype->flags & OPAL_DATATYPE_FLAG_NO_GAPS) ) return 1;
>> 
>> /* This is the assertion:  */
>> 
>>     assert( (OPAL_PTRDIFF_TYPE)datatype->size != (datatype->ub - datatype->lb) );
>> 
>>     return 0;
>> }
>> 
>> ===================================
>> 
>> Can anyone tell me what this means?
>> 
>> It happens while writing a file with MPI I/O, when I call
>> "MPI_File_set_view" for the fourth time, each time with a different
>> MPI_Datatype created with "MPI_Type_indexed".
>> 
>> I am trying to reproduce the bug with a very small example to send
>> here, but if anyone has a hint to give me...
>> (I would like: this assert is not good! just ignore it ;-) )
>> 
>> Thanks,
>> 
>> Eric
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
> <idx_null.cc>


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/

