Please send all the information listed here: http://www.open-mpi.org/community/help/
I am able to run your test program with no problem, so I'm not quite sure what the issue is...?

If op->o_func.intrinsic.fns[27] initially points to a valid value and then later it points to 0, that could imply that there is memory corruption occurring in your application somewhere. Have you tried running through a memory-checking debugger?

On May 6, 2011, at 9:56 AM, hi wrote:

> I am observing crash in MPI_Allreduce() call from my actual application.
> After debugging I found that MPI_Allreduce() with MPI_DOUBLE_PRECISION
> returns NULL for following code in op.h
>
> if (0 != (op->o_flags & OMPI_OP_FLAGS_INTRINSIC)) {
>     op->o_func.intrinsic.fns[ompi_op_ddt_map[dtype->id]](source, target,
>         &count, &dtype,
>         op->o_func.intrinsic.modules[ompi_op_ddt_map[dtype->id]]);
>
> where, o_func.intrinsic.fns[27] points to 0.
> On further debugging, I found that it is making call to
> mca_coll_basic_reduce_lin_intra(); see below trace...
>
>> libmpid.dll!ompi_op_reduce(ompi_op_t * op, void * source, void * target, int count, ompi_datatype_t * dtype) Line 500 C++
> libmpid.dll!mca_coll_basic_reduce_lin_intra(void * sbuf, void * rbuf, int count, ompi_datatype_t * dtype, ompi_op_t * op, int root, ompi_communicator_t * comm, mca_coll_base_module_2_0_0_t * module) Line 249 C++
> libmpid.dll!mca_coll_sync_reduce(void * sbuf, void * rbuf, int count, ompi_datatype_t * dtype, ompi_op_t * op, int root, ompi_communicator_t * comm, mca_coll_base_module_2_0_0_t * module) Line 45 + 0xd4 bytes C++
> libmpid.dll!mca_coll_basic_allreduce_intra(void * sbuf, void * rbuf, int count, ompi_datatype_t * dtype, ompi_op_t * op, ompi_communicator_t * comm, mca_coll_base_module_2_0_0_t * module) Line 57 + 0x58 bytes C++
> libmpid.dll!MPI_Allreduce(void * sendbuf, void * recvbuf, int count, ompi_datatype_t * datatype, ompi_op_t * op, ompi_communicator_t * comm) Line 107 + 0x5c bytes C++
> libmpi_f77d.dll!mpi_allreduce_f(char * sendbuf, char * recvbuf, int * count, int * datatype, int * op, int * comm, int * ierr) Line 79 + 0x34 bytes C++
> libmpi_f77d.dll!MPI_ALLREDUCE(char * sendbuf, char * recvbuf, int * count, int * datatype, int * op, int * comm, int * ierr) Line 53 + 0x67 bytes C++
>
> Now to simulate this problem, the attached test program works fine but
> I observed completely different callstack see attached images...
>
> Just for information: I am executing my application using following command:
> c:/openmpi/bin/orterun -mca mca_component_show_load_errors 0 --prefix ... -x ... -x ... --machinefile ... -np 2 myApplication
>
> And test program using following command:
> c:/openmpi/bin/mpirun mar_f_dp.exe
>
> Please let me know based on what criteria "coll_reduce" is pointing to
> "mca_coll_basic_allreduce_intra() or mca_coll_self_allreduce_intra();
> this would help me to debug my application further.
>
> Thank you in advance.
> -Hiral
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
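
P.S. To illustrate the memory-corruption suggestion above: the crash comes from calling through a function-pointer table slot that has become 0. The following is only a minimal sketch of that dispatch pattern with a NULL guard, so that a zeroed slot is reported instead of called. It is not Open MPI's code; demo_op_t, demo_reduce_fn, demo_sum_double, demo_op_reduce and DEMO_TYPE_MAX are all hypothetical names invented for this example.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for illustration; NOT Open MPI's real types. */
#define DEMO_TYPE_MAX 32

typedef void (*demo_reduce_fn)(const void *source, void *target, int count);

typedef struct {
    /* One reduction function per datatype slot, as in a dispatch table. */
    demo_reduce_fn fns[DEMO_TYPE_MAX];
} demo_op_t;

/* Element-wise sum for doubles: target[i] += source[i]. */
static void demo_sum_double(const void *source, void *target, int count)
{
    const double *src = source;
    double *dst = target;
    for (int i = 0; i < count; ++i) {
        dst[i] += src[i];
    }
}

/*
 * Dispatch through the table. Returns 0 on success, or -1 if the slot
 * index is out of range or the slot holds a NULL pointer (for example,
 * because something overwrote the table after it was initialized).
 */
static int demo_op_reduce(const demo_op_t *op, int slot,
                          const void *source, void *target, int count)
{
    if (slot < 0 || slot >= DEMO_TYPE_MAX) {
        fprintf(stderr, "demo_op_reduce: slot %d out of range\n", slot);
        return -1;
    }
    if (NULL == op->fns[slot]) {
        fprintf(stderr, "demo_op_reduce: fns[%d] is NULL\n", slot);
        return -1;
    }
    op->fns[slot](source, target, count);
    return 0;
}
```

A guard like this only turns the crash into a diagnosable error. It does not explain why the slot changed; for that, a memory-checking debugger or a data breakpoint on the table entry is still the tool to reach for.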