Hi Jeff,

The test program works fine, but note the difference between the call stack images of the test program and of my actual application.
In the test program the call goes to mca_coll_self_allreduce_intra, while in my application it goes to mca_coll_basic_allreduce_intra. Which parameter or setting causes mca_coll_basic_allreduce_intra to be called instead of mca_coll_self_allreduce_intra? Any comment on this would be helpful.

For more information: op->o_func.intrinsic.fns[27] is 0 (NULL) when calling MPI_Allreduce(...,...,...,MPI_DOUBLE_PRECISION, MPI_SUM,...,...).

Thank you.
-Hiral