Hi,

Could you please download the latest MXM from
http://www.mellanox.com/products/mxm/ and retry? The MXM version that
comes with OFED 1.5.3 was tested against OMPI 1.6.0.
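Before retrying, it may also be worth double-checking which MXM your Open MPI
build actually picked up. A rough sketch of what to look at (the
/opt/mellanox/mxm path is taken from your configure line; the "mxm" RPM
package name is an assumption about how MXM was installed, so adjust to your
setup):

    # Confirm the mxm MTL component is present in this Open MPI install
    ompi_info | grep -i mxm

    # Installed MXM package version (assumes the Mellanox "mxm" RPM)
    rpm -qi mxm

    # Or inspect the library files directly
    ls -l /opt/mellanox/mxm/lib/libmxm*

If ompi_info does not list an mxm MTL component, the build did not pick up
MXM at all.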
Regards,
M

On Wed, Aug 22, 2012 at 2:22 PM, Pavel Mezentsev <pavel.mezent...@gmail.com> wrote:

> I've tried to launch the application on nodes with QDR InfiniBand. The
> first attempt, with 2 processes, worked, but the following was printed to
> the output:
>
>   [1345633953.436676] [b01:2523 :0]  mpool.c:99   MXM ERROR Invalid mempool parameter(s)
>   [1345633953.436676] [b01:2522 :0]  mpool.c:99   MXM ERROR Invalid mempool parameter(s)
>   --------------------------------------------------------------------------
>   MXM was unable to create an endpoint. Please make sure that the network
>   link is active on the node and the hardware is functioning.
>
>     Error: Invalid parameter
>   --------------------------------------------------------------------------
>
> The results from this launch didn't differ from the results of the launch
> without MXM.
>
> Then I tried to launch it with 256 processes, but I got the same message
> from each process and the application crashed. Since then I've been
> observing the same behavior as with FDR: the application hangs at startup.
>
> Best regards,
> Pavel Mezentsev
>
> 2012/8/22 Pavel Mezentsev <pavel.mezent...@gmail.com>
>
>> Hello!
>>
>> I've built Open MPI 1.6.1rc3 with support for MXM, but when I try to
>> launch an application using this MTL, it hangs, and I can't figure out
>> why.
>>
>> If I launch it with np below 128, everything works fine, since MXM isn't
>> used then. I've tried setting the threshold to 0 and launching 2
>> processes, with the same result: it hangs on startup. What could be
>> causing this problem?
>>
>> Here is the command I execute:
>>
>>   /opt/openmpi/1.6.1/mxm-test/bin/mpirun \
>>       -np $NP \
>>       -hostfile hosts_fdr2 \
>>       --mca mtl mxm \
>>       --mca btl ^tcp \
>>       --mca mtl_mxm_np 0 \
>>       -x OMP_NUM_THREADS=$NT \
>>       -x LD_LIBRARY_PATH \
>>       --bind-to-core \
>>       -npernode 16 \
>>       --mca coll_fca_np 0 -mca coll_fca_enable 0 \
>>       ./IMB-MPI1 -npmin $NP Allreduce Reduce Barrier Bcast Allgather Allgatherv
>>
>> I'm performing the tests on nodes with Intel SB processors and FDR.
>> Open MPI was configured with the following parameters:
>>
>>   CC=icc CXX=icpc F77=ifort FC=ifort ./configure \
>>       --prefix=/opt/openmpi/1.6.1rc3/mxm-test \
>>       --with-mxm=/opt/mellanox/mxm \
>>       --with-fca=/opt/mellanox/fca \
>>       --with-knem=/usr/share/knem
>>
>> I'm using the latest OFED from Mellanox, 1.5.3-3.1.0, on CentOS 6.1 with
>> the default kernel, 2.6.32-131.0.15. The compilation with the default
>> MXM (1.0.601) failed, so I installed the latest version from Mellanox:
>> 1.1.1227.
>>
>> Best regards,
>> Pavel Mezentsev
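Also, since the error above explicitly asks you to make sure the network link
is active, it may help to verify the IB port state on the failing nodes
before retrying. A minimal check, assuming the standard InfiniBand diagnostic
tools (infiniband-diags / libibverbs-utils) are installed on the nodes:

    # Each port in use should show "State: Active" and "Physical state: LinkUp"
    ibstat

    # Alternative view from the verbs layer; look for "state: PORT_ACTIVE (4)"
    ibv_devinfo

If a port shows Down or Polling there, the endpoint failure would be expected
regardless of the MXM version.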