Have you followed the installation steps from the README? (It is also here for reference: http://bgate.mellanox.com/products/hpcx/README.txt)
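Also, before re-running, it is worth confirming that your own 1.8.4 build really contains the mxm MTL and that libmxm.so.2 can be resolved through LD_LIBRARY_PATH at run time, rather than via LD_PRELOAD. A rough sketch, using the paths from your listing below; the exact component file name/location may differ in your install:

% ./openmpi-1.8.4/openmpinstall/bin/ompi_info | grep -i mxm
  # expect something like "MCA mtl: mxm"; if nothing is listed, the build has no mxm support
% ldd ./openmpi-1.8.4/openmpinstall/lib/openmpi/mca_mtl_mxm.so | grep mxm
  # libmxm.so.2 must not show up as "not found"
% export LD_LIBRARY_PATH=$PWD/hpcx-v1.2.0-325-gcc-MLNX_OFED_LINUX-2.4-1.0.0-redhat6.5/mxm/lib:$LD_LIBRARY_PATH
  # or /opt/mellanox/mxm/lib if you built against the MOFED-installed mxm

The relevant steps from the README: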
...
* Load OpenMPI/OpenSHMEM v1.8 based package:
% source $HPCX_HOME/hpcx-init.sh
% hpcx_load
% env | grep HPCX
% mpirun -np 2 $HPCX_MPI_TESTS_DIR/examples/hello_usempi
% oshrun -np 2 $HPCX_MPI_TESTS_DIR/examples/hello_oshmem
% hpcx_unload

3. Load HPCX environment from modules
* Load OpenMPI/OpenSHMEM based package:
% module use $HPCX_HOME/modulefiles
% module load hpcx
% mpirun -np 2 $HPCX_MPI_TESTS_DIR/examples/hello_c
% oshrun -np 2 $HPCX_MPI_TESTS_DIR/examples/hello_oshmem
% module unload hpcx
...

On Tue, Apr 14, 2015 at 5:42 AM, Subhra Mazumdar <subhramazumd...@gmail.com> wrote:

> I am using 2.4-1.0.0 mellanox ofed.
>
> I downloaded mofed tarball hpcx-v1.2.0-325-gcc-MLNX_OFED_LINUX-2.4-1.0.0-redhat6.5.tar and extracted it. It has mxm directory.
>
> hpcx-v1.2.0-325-[root@JARVICE ~]# ls hpcx-v1.2.0-325-gcc-MLNX_OFED_LINUX-2.4-1.0.0-redhat6.5
> archive  fca  hpcx-init-ompi-mellanox-v1.8.sh  ibprof  modulefiles  ompi-mellanox-v1.8  sources  VERSION
> bupc-master  hcoll  hpcx-init.sh  knem  mxm  README.txt  utils
>
> I tried using LD_PRELOAD for libmxm, but getting a different error stack now as following
>
> [root@JARVICE ~]# ./openmpi-1.8.4/openmpinstall/bin/mpirun --allow-run-as-root --mca mtl mxm -x LD_PRELOAD="./openmpi-1.8.4/openmpinstall/lib/libmpi.so.1 ./hpcx-v1.2.0-325-gcc-MLNX_OFED_LINUX-2.4-1.0.0-redhat6.5/mxm/lib/libmxm.so.2" -n 1 ./backend localhost : -x LD_PRELOAD="./openmpi-1.8.4/openmpinstall/lib/libmpi.so.1 ./hpcx-v1.2.0-325-gcc-MLNX_OFED_LINUX-2.4-1.0.0-redhat6.5/mxm/lib/libmxm.so.2 ./libci.so" -n 1 ./app2
> i am backend
> [JARVICE:00564] mca: base: components_open: component pml / cm open function failed
> [JARVICE:564 :0] Caught signal 11 (Segmentation fault)
> [JARVICE:00565] mca: base: components_open: component pml / cm open function failed
> [JARVICE:565 :0] Caught signal 11 (Segmentation fault)
> ==== backtrace ====
>  2 0x000000000005640c mxm_handle_error()  /scrap/jenkins/workspace/hpc-power-pack/label/r-vmb-rhel6-u5-x86-64-MOFED-CHECKER/hpcx_root/src/hpcx-v1.2.0-325-gcc-MLNX_OFED_LINUX-2.4-1.0.0-redhat6.5/mxm-v3.2/src/mxm/util/debug/debug.c:641
>  3 0x000000000005657c mxm_error_signal_handler()  /scrap/jenkins/workspace/hpc-power-pack/label/r-vmb-rhel6-u5-x86-64-MOFED-CHECKER/hpcx_root/src/hpcx-v1.2.0-325-gcc-MLNX_OFED_LINUX-2.4-1.0.0-redhat6.5/mxm-v3.2/src/mxm/util/debug/debug.c:616
>  4 0x00000000000329a0 killpg()  ??:0
>  5 0x0000000000045491 mca_base_components_close()  ??:0
>  6 0x000000000004e99a mca_base_framework_close()  ??:0
>  7 0x0000000000045431 mca_base_component_close()  ??:0
>  8 0x000000000004515c mca_base_framework_components_open()  ??:0
>  9 0x00000000000a0de9 mca_pml_base_open()  pml_base_frame.c:0
> 10 0x000000000004eb1c mca_base_framework_open()  ??:0
> 11 0x0000000000043eb3 ompi_mpi_init()  ??:0
> 12 0x0000000000067cb0 PMPI_Init_thread()  ??:0
> 13 0x0000000000404fdf main()  /root/rain_ib/backend/backend.c:1237
> 14 0x000000000001ed1d __libc_start_main()  ??:0
> 15 0x0000000000402db9 _start()  ??:0
> ===================
> --------------------------------------------------------------------------
> A requested component was not found, or was unable to be opened. This means that this component is either not installed or is unable to be used on your system (e.g., sometimes this means that shared libraries that the component requires are unable to be found/loaded). Note that Open MPI stopped checking at the first component that it did not find.
>
> Host: JARVICE
> Framework: mtl
> Component: mxm
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> mpirun noticed that process rank 0 with PID 564 on node JARVICE exited on signal 11 (Segmentation fault).
> --------------------------------------------------------------------------
> [JARVICE:00562] 1 more process has sent help message help-mca-base.txt / find-available:not-valid
> [JARVICE:00562] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
>
> Subhra
>
> On Sun, Apr 12, 2015 at 10:48 PM, Mike Dubman <mi...@dev.mellanox.co.il> wrote:
>
>> seems like mxm was not found in your ld_library_path.
>>
>> what mofed version do you use?
>> does it have /opt/mellanox/mxm in it?
>> You could just run mpirun from HPCX package which looks for mxm internally and recompile ompi as mentioned in README.
>>
>> On Mon, Apr 13, 2015 at 3:24 AM, Subhra Mazumdar <subhramazumd...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I used mxm mtl as follows but getting segfault. It says mxm component not found but I have compiled openmpi with mxm. Any idea what I might be missing?
>>>
>>> [root@JARVICE ~]# ./openmpi-1.8.4/openmpinstall/bin/mpirun --allow-run-as-root --mca pml cm --mca mtl mxm -n 1 -x LD_PRELOAD=./openmpi-1.8.4/openmpinstall/lib/libmpi.so.1 ./backend localhosst : -n 1 -x LD_PRELOAD="./libci.so ./openmpi-1.8.4/openmpinstall/lib/libmpi.so.1" ./app2
>>> i am backend
>>> [JARVICE:08398] *** Process received signal ***
>>> [JARVICE:08398] Signal: Segmentation fault (11)
>>> [JARVICE:08398] Signal code: Address not mapped (1)
>>> [JARVICE:08398] Failing at address: 0x10
>>> [JARVICE:08398] [ 0] /lib64/libpthread.so.0(+0xf710)[0x7ff8d0ddb710]
>>> [JARVICE:08398] [ 1] /root/openmpi-1.8.4/openmpinstall/lib/libopen-pal.so.6(mca_base_components_close+0x21)[0x7ff8cf9ae491]
>>> [JARVICE:08398] [ 2] /root/openmpi-1.8.4/openmpinstall/lib/libopen-pal.so.6(mca_base_framework_close+0x6a)[0x7ff8cf9b799a]
>>> [JARVICE:08398] [ 3] /root/openmpi-1.8.4/openmpinstall/lib/libopen-pal.so.6(mca_base_component_close+0x21)[0x7ff8cf9ae431]
>>> [JARVICE:08398] [ 4] /root/openmpi-1.8.4/openmpinstall/lib/libopen-pal.so.6(mca_base_framework_components_open+0x11c)[0x7ff8cf9ae15c]
>>> [JARVICE:08398] [ 5] ./openmpi-1.8.4/openmpinstall/lib/libmpi.so.1(+0xa0de9)[0x7ff8d1089de9]
>>> [JARVICE:08398] [ 6] /root/openmpi-1.8.4/openmpinstall/lib/libopen-pal.so.6(mca_base_framework_open+0x7c)[0x7ff8cf9b7b1c]
>>> [JARVICE:08398] [ 7] [JARVICE:08398] mca: base: components_open: component pml / cm open function failed
>>> ./openmpi-1.8.4/openmpinstall/lib/libmpi.so.1(ompi_mpi_init+0x4b3)[0x7ff8d102ceb3]
>>> [JARVICE:08398] [ 8] ./openmpi-1.8.4/openmpinstall/lib/libmpi.so.1(PMPI_Init_thread+0x100)[0x7ff8d1050cb0]
>>> [JARVICE:08398] [ 9] ./backend[0x404fdf]
>>> [JARVICE:08398] [10] /lib64/libc.so.6(__libc_start_main+0xfd)[0x7ff8cfeded1d]
>>> [JARVICE:08398] [11] ./backend[0x402db9]
>>> [JARVICE:08398] *** End of error message ***
>>> --------------------------------------------------------------------------
>>> A requested component was not found, or was unable to be opened. This means that this component is either not installed or is unable to be used on your system (e.g., sometimes this means that shared libraries that the component requires are unable to be found/loaded). Note that Open MPI stopped checking at the first component that it did not find.
>>>
>>> Host: JARVICE
>>> Framework: mtl
>>> Component: mxm
>>> --------------------------------------------------------------------------
>>> --------------------------------------------------------------------------
>>> mpirun noticed that process rank 0 with PID 8398 on node JARVICE exited on signal 11 (Segmentation fault).
>>> --------------------------------------------------------------------------
>>>
>>> Subhra.
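A note on the run just above: rather than pointing LD_PRELOAD at libmpi/libmxm, the cleaner options are to use the mpirun that ships inside the HPCX tarball (its Open MPI is already built against the bundled mxm), or to rebuild your 1.8.4 tree against that mxm and only extend LD_LIBRARY_PATH. A rough sketch, assuming the tarball is extracted under your home directory as in the listing above; the authoritative configure flags are in the README:

% export HPCX_HOME=$HOME/hpcx-v1.2.0-325-gcc-MLNX_OFED_LINUX-2.4-1.0.0-redhat6.5
% source $HPCX_HOME/hpcx-init.sh && hpcx_load
% mpirun -np 2 $HPCX_MPI_TESTS_DIR/examples/hello_c
  # uses HPCX's own ompi + mxm, no LD_PRELOAD needed
% cd ~/openmpi-1.8.4 && ./configure --prefix=$HOME/openmpi-1.8.4/openmpinstall --with-mxm=$HPCX_HOME/mxm && make install
  # alternatively, rebuild your own tree against the bundled mxm ...
% export LD_LIBRARY_PATH=$HPCX_HOME/mxm/lib:$LD_LIBRARY_PATH
  # ... and make sure libmxm is on the library path at run time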
>>>
>>> On Fri, Apr 10, 2015 at 12:12 AM, Mike Dubman <mi...@dev.mellanox.co.il> wrote:
>>>
>>>> no need IPoIB, mxm uses native IB.
>>>>
>>>> Please see the HPCX (pre-compiled ompi, integrated with MXM and FCA) README file for details on how to compile/select.
>>>>
>>>> The default transport is UD for internode communication and shared-memory for intra-node.
>>>>
>>>> http://bgate.mellanox.com/products/hpcx/
>>>>
>>>> Also, mxm is included in the Mellanox OFED.
>>>>
>>>> On Fri, Apr 10, 2015 at 5:26 AM, Subhra Mazumdar <subhramazumd...@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> Does ipoib need to be configured on the ib cards for mxm (I have a separate ethernet connection too)? Also, are there special flags in mpirun to select from UD/RC/DC? What is the default?
>>>>>
>>>>> Thanks,
>>>>> Subhra.
>>>>>
>>>>> On Tue, Mar 31, 2015 at 9:46 AM, Mike Dubman <mi...@dev.mellanox.co.il> wrote:
>>>>>
>>>>>> Hi,
>>>>>> mxm uses IB rdma/roce technologies. One can select UD/RC/DC transports to be used in mxm.
>>>>>>
>>>>>> By selecting mxm, all MPI p2p routines will be mapped to appropriate mxm functions.
>>>>>>
>>>>>> M
>>>>>>
>>>>>> On Mon, Mar 30, 2015 at 7:32 PM, Subhra Mazumdar <subhramazumd...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi Mike,
>>>>>>>
>>>>>>> Does the mxm mtl use infiniband rdma? Also, from a programming perspective, do I need to use anything else other than MPI_Send/MPI_Recv?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Subhra.
>>>>>>>
>>>>>>> On Sun, Mar 29, 2015 at 11:14 PM, Mike Dubman <mi...@dev.mellanox.co.il> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>> openib btl does not support this thread model.
>>>>>>>> You can use OMPI w/ mxm (-mca mtl mxm) and the multiple-thread model in the 1.8.x series, or (-mca pml yalla) in the master branch.
>>>>>>>>
>>>>>>>> M
>>>>>>>>
>>>>>>>> On Mon, Mar 30, 2015 at 9:09 AM, Subhra Mazumdar <subhramazumd...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> Can MPI_THREAD_MULTIPLE and openib btl work together in open mpi 1.8.4? If so, are there any command line options needed during run time?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Subhra.
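One more note on the transport question from earlier in the thread: with mxm you normally do not need extra mpirun flags, since UD between nodes plus shared memory within a node is the default. If you want to force RC or DC, the usual way is to pass mxm environment variables through -x. A sketch below (./a.out stands in for your application; the MXM_TLS values are taken from the MXM/HPCX documentation, so please double-check them against the README shipped in your tarball):

% mpirun -np 2 --mca pml cm --mca mtl mxm ./a.out
  # defaults: UD inter-node, shared memory intra-node
% mpirun -np 2 --mca pml cm --mca mtl mxm -x MXM_TLS=self,shm,rc ./a.out
  # force RC instead of UD
% mpirun -np 2 --mca pml cm --mca mtl mxm -x MXM_TLS=self,shm,dc ./a.out
  # force DC, on hardware that supports it

As for MPI_THREAD_MULTIPLE with the mxm MTL on 1.8.x: request the level via MPI_Init_thread() and check the returned "provided" value; note that the 1.8 series has to be configured with --enable-mpi-thread-multiple for that level to actually be granted.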
--

Kind Regards,

M.