Have you followed the installation steps from the README? (Also here for reference:
http://bgate.mellanox.com/products/hpcx/README.txt)

...

* Load the OpenMPI/OpenSHMEM v1.8-based package:

    % source $HPCX_HOME/hpcx-init.sh
    % hpcx_load
    % env | grep HPCX
    % mpirun -np 2 $HPCX_MPI_TESTS_DIR/examples/hello_usempi
    % oshrun -np 2 $HPCX_MPI_TESTS_DIR/examples/hello_oshmem
    % hpcx_unload

3. Load the HPCX environment from modules

* Load the OpenMPI/OpenSHMEM-based package:

    % module use $HPCX_HOME/modulefiles
    % module load hpcx
    % mpirun -np 2 $HPCX_MPI_TESTS_DIR/examples/hello_c
    % oshrun -np 2 $HPCX_MPI_TESTS_DIR/examples/hello_oshmem
    % module unload hpcx
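
Either way, a quick sanity check that the loaded mpirun and the mxm MTL really
come from the HPC-X tree (just a sketch; the exact ompi_info wording can differ
between versions):

    % which mpirun                # should resolve under $HPCX_HOME
    % ompi_info | grep -i mxm     # should list the mxm mtl component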

...

On Tue, Apr 14, 2015 at 5:42 AM, Subhra Mazumdar <subhramazumd...@gmail.com>
wrote:

> I am using Mellanox OFED 2.4-1.0.0.
>
> I downloaded the hpcx tarball
> hpcx-v1.2.0-325-gcc-MLNX_OFED_LINUX-2.4-1.0.0-redhat6.5.tar and extracted
> it. It has an mxm directory.
>
> [root@JARVICE ~]# ls hpcx-v1.2.0-325-gcc-MLNX_OFED_LINUX-2.4-1.0.0-redhat6.5
> archive      fca    hpcx-init-ompi-mellanox-v1.8.sh  ibprof  modulefiles
> ompi-mellanox-v1.8  sources  VERSION
> bupc-master  hcoll  hpcx-init.sh                     knem    mxm
> README.txt          utils
>
> I tried using LD_PRELOAD for libmxm, but I am now getting a different error
> stack, as follows:
>
> [root@JARVICE ~]# ./openmpi-1.8.4/openmpinstall/bin/mpirun
> --allow-run-as-root --mca mtl mxm -x
> LD_PRELOAD="./openmpi-1.8.4/openmpinstall/lib/libmpi.so.1
> ./hpcx-v1.2.0-325-gcc-MLNX_OFED_LINUX-2.4-1.0.0-redhat6.5/mxm/lib/libmxm.so.2"
> -n 1 ./backend  localhost : -x
> LD_PRELOAD="./openmpi-1.8.4/openmpinstall/lib/libmpi.so.1
> ./hpcx-v1.2.0-325-gcc-MLNX_OFED_LINUX-2.4-1.0.0-redhat6.5/mxm/lib/libmxm.so.2
> ./libci.so" -n 1 ./app2
>  i am backend
> [JARVICE:00564] mca: base: components_open: component pml / cm open
> function failed
> [JARVICE:564  :0] Caught signal 11 (Segmentation fault)
> [JARVICE:00565] mca: base: components_open: component pml / cm open
> function failed
> [JARVICE:565  :0] Caught signal 11 (Segmentation fault)
> ==== backtrace ====
>  2 0x000000000005640c mxm_handle_error()
> /scrap/jenkins/workspace/hpc-power-pack/label/r-vmb-rhel6-u5-x86-64-MOFED-CHECKER/hpcx_root/src/hpcx-v1.2.0-325-gcc-MLNX_OFED_LINUX-2.4-1.0.0-redhat6.5/mxm-v3.2/src/mxm/util/debug/debug.c:641
>  3 0x000000000005657c mxm_error_signal_handler()
> /scrap/jenkins/workspace/hpc-power-pack/label/r-vmb-rhel6-u5-x86-64-MOFED-CHECKER/hpcx_root/src/hpcx-v1.2.0-325-gcc-MLNX_OFED_LINUX-2.4-1.0.0-redhat6.5/mxm-v3.2/src/mxm/util/debug/debug.c:616
>  4 0x00000000000329a0 killpg()  ??:0
>  5 0x0000000000045491 mca_base_components_close()  ??:0
>  6 0x000000000004e99a mca_base_framework_close()  ??:0
>  7 0x0000000000045431 mca_base_component_close()  ??:0
>  8 0x000000000004515c mca_base_framework_components_open()  ??:0
>  9 0x00000000000a0de9 mca_pml_base_open()  pml_base_frame.c:0
> 10 0x000000000004eb1c mca_base_framework_open()  ??:0
> 11 0x0000000000043eb3 ompi_mpi_init()  ??:0
> 12 0x0000000000067cb0 PMPI_Init_thread()  ??:0
> 13 0x0000000000404fdf main()  /root/rain_ib/backend/backend.c:1237
> 14 0x000000000001ed1d __libc_start_main()  ??:0
> 15 0x0000000000402db9 _start()  ??:0
> ===================
> --------------------------------------------------------------------------
> A requested component was not found, or was unable to be opened.  This
> means that this component is either not installed or is unable to be
> used on your system (e.g., sometimes this means that shared libraries
> that the component requires are unable to be found/loaded).  Note that
> Open MPI stopped checking at the first component that it did not find.
>
> Host:      JARVICE
> Framework: mtl
> Component: mxm
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> mpirun noticed that process rank 0 with PID 564 on node JARVICE exited on
> signal 11 (Segmentation fault).
> --------------------------------------------------------------------------
> [JARVICE:00562] 1 more process has sent help message help-mca-base.txt /
> find-available:not-valid
> [JARVICE:00562] Set MCA parameter "orte_base_help_aggregate" to 0 to see
> all help / error messages
>
>
> Subhra
>
>
> On Sun, Apr 12, 2015 at 10:48 PM, Mike Dubman <mi...@dev.mellanox.co.il>
> wrote:
>
>> Seems like mxm was not found in your LD_LIBRARY_PATH.
>>
>> What MOFED version do you use?
>> Does it have /opt/mellanox/mxm in it?
>> You could just run mpirun from the HPCX package, which looks for mxm
>> internally, and recompile ompi as mentioned in the README.
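>>
>> A rough sketch of that rebuild (the mxm prefix below is an assumption; point
>> it at /opt/mellanox/mxm from MOFED, or at the mxm dir inside the HPCX
>> tarball):
>>
>>     % # mxm prefix is an assumption -- adjust to your install
>>     % export LD_LIBRARY_PATH=/opt/mellanox/mxm/lib:$LD_LIBRARY_PATH
>>     % cd openmpi-1.8.4
>>     % ./configure --prefix=$PWD/openmpinstall --with-mxm=/opt/mellanox/mxm
>>     % make install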
>>
>> On Mon, Apr 13, 2015 at 3:24 AM, Subhra Mazumdar <
>> subhramazumd...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I used the mxm MTL as follows but am getting a segfault. It says the mxm
>>> component was not found, but I have compiled Open MPI with mxm. Any idea
>>> what I might be missing?
>>>
>>> [root@JARVICE ~]# ./openmpi-1.8.4/openmpinstall/bin/mpirun
>>> --allow-run-as-root --mca pml cm --mca mtl mxm -n 1 -x
>>> LD_PRELOAD=./openmpi-1.8.4/openmpinstall/lib/libmpi.so.1 ./backend
>>> localhosst : -n 1 -x LD_PRELOAD="./libci.so
>>> ./openmpi-1.8.4/openmpinstall/lib/libmpi.so.1" ./app2
>>>  i am backend
>>> [JARVICE:08398] *** Process received signal ***
>>> [JARVICE:08398] Signal: Segmentation fault (11)
>>> [JARVICE:08398] Signal code: Address not mapped (1)
>>> [JARVICE:08398] Failing at address: 0x10
>>> [JARVICE:08398] [ 0] /lib64/libpthread.so.0(+0xf710)[0x7ff8d0ddb710]
>>> [JARVICE:08398] [ 1]
>>> /root/openmpi-1.8.4/openmpinstall/lib/libopen-pal.so.6(mca_base_components_close+0x21)[0x7ff8cf9ae491]
>>> [JARVICE:08398] [ 2]
>>> /root/openmpi-1.8.4/openmpinstall/lib/libopen-pal.so.6(mca_base_framework_close+0x6a)[0x7ff8cf9b799a]
>>> [JARVICE:08398] [ 3]
>>> /root/openmpi-1.8.4/openmpinstall/lib/libopen-pal.so.6(mca_base_component_close+0x21)[0x7ff8cf9ae431]
>>> [JARVICE:08398] [ 4]
>>> /root/openmpi-1.8.4/openmpinstall/lib/libopen-pal.so.6(mca_base_framework_components_open+0x11c)[0x7ff8cf9ae15c]
>>> [JARVICE:08398] [ 5]
>>> ./openmpi-1.8.4/openmpinstall/lib/libmpi.so.1(+0xa0de9)[0x7ff8d1089de9]
>>> [JARVICE:08398] [ 6]
>>> /root/openmpi-1.8.4/openmpinstall/lib/libopen-pal.so.6(mca_base_framework_open+0x7c)[0x7ff8cf9b7b1c]
>>> [JARVICE:08398] [ 7] [JARVICE:08398] mca: base: components_open:
>>> component pml / cm open function failed
>>>
>>> ./openmpi-1.8.4/openmpinstall/lib/libmpi.so.1(ompi_mpi_init+0x4b3)[0x7ff8d102ceb3]
>>> [JARVICE:08398] [ 8]
>>> ./openmpi-1.8.4/openmpinstall/lib/libmpi.so.1(PMPI_Init_thread+0x100)[0x7ff8d1050cb0]
>>> [JARVICE:08398] [ 9] ./backend[0x404fdf]
>>> [JARVICE:08398] [10]
>>> /lib64/libc.so.6(__libc_start_main+0xfd)[0x7ff8cfeded1d]
>>> [JARVICE:08398] [11] ./backend[0x402db9]
>>> [JARVICE:08398] *** End of error message ***
>>>
>>> --------------------------------------------------------------------------
>>> A requested component was not found, or was unable to be opened.  This
>>> means that this component is either not installed or is unable to be
>>> used on your system (e.g., sometimes this means that shared libraries
>>> that the component requires are unable to be found/loaded).  Note that
>>> Open MPI stopped checking at the first component that it did not find.
>>>
>>> Host:      JARVICE
>>> Framework: mtl
>>> Component: mxm
>>>
>>> --------------------------------------------------------------------------
>>>
>>> --------------------------------------------------------------------------
>>> mpirun noticed that process rank 0 with PID 8398 on node JARVICE exited
>>> on signal 11 (Segmentation fault).
>>>
>>> --------------------------------------------------------------------------
>>>
>>>
>>> Subhra.
>>>
>>>
>>> On Fri, Apr 10, 2015 at 12:12 AM, Mike Dubman <mi...@dev.mellanox.co.il>
>>> wrote:
>>>
>>>> No need for IPoIB; mxm uses native IB.
>>>>
>>>> Please see the HPCX (pre-compiled ompi, integrated with MXM and FCA) README
>>>> file for details on how to compile/select.
>>>>
>>>> The default transport is UD for inter-node communication and shared
>>>> memory for intra-node.
>>>>
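>>>> If you want to switch transports, the usual knob is mxm's MXM_TLS
>>>> environment variable, passed through mpirun with -x; for example (the exact
>>>> MXM_TLS values here are from memory, so treat this as a sketch and check
>>>> the mxm README):
>>>>
>>>>     % # force RC instead of the UD default (value list is an assumption)
>>>>     % mpirun -np 2 -mca pml cm -mca mtl mxm -x MXM_TLS=self,shm,rc ./a.out
>>>>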
>>>> http://bgate.mellanox.com/products/hpcx/
>>>>
>>>> Also, mxm is included in the Mellanox OFED.
>>>>
>>>> On Fri, Apr 10, 2015 at 5:26 AM, Subhra Mazumdar <
>>>> subhramazumd...@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> Does IPoIB need to be configured on the IB cards for mxm (I have a
>>>>> separate Ethernet connection too)? Also, are there special flags in mpirun
>>>>> to select between UD/RC/DC? What is the default?
>>>>>
>>>>> Thanks,
>>>>> Subhra.
>>>>>
>>>>>
>>>>> On Tue, Mar 31, 2015 at 9:46 AM, Mike Dubman <mi...@dev.mellanox.co.il
>>>>> > wrote:
>>>>>
>>>>>> Hi,
>>>>>> mxm uses IB RDMA/RoCE technologies. One can select the UD/RC/DC
>>>>>> transports to be used in mxm.
>>>>>>
>>>>>> By selecting mxm, all MPI p2p routines will be mapped to the appropriate
>>>>>> mxm functions.
>>>>>>
>>>>>> M
>>>>>>
>>>>>> On Mon, Mar 30, 2015 at 7:32 PM, Subhra Mazumdar <
>>>>>> subhramazumd...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi MIke,
>>>>>>>
>>>>>>> Does the mxm MTL use InfiniBand RDMA? Also, from a programming
>>>>>>> perspective, do I need to use anything other than
>>>>>>> MPI_Send/MPI_Recv?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Subhra.
>>>>>>>
>>>>>>>
>>>>>>> On Sun, Mar 29, 2015 at 11:14 PM, Mike Dubman <
>>>>>>> mi...@dev.mellanox.co.il> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>> The openib btl does not support this thread model.
>>>>>>>> You can use OMPI with mxm (-mca mtl mxm) and the multiple-thread mode in
>>>>>>>> the 1.8.x series, or (-mca pml yalla) in the master branch.
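>>>>>>>>
>>>>>>>> For example (a sketch; ./app and the rank count are placeholders):
>>>>>>>>
>>>>>>>>     % # 1.8.x: mxm mtl through the cm pml
>>>>>>>>     % mpirun -np 2 -mca pml cm -mca mtl mxm ./app
>>>>>>>>     % # master: yalla pml
>>>>>>>>     % mpirun -np 2 -mca pml yalla ./app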
>>>>>>>>
>>>>>>>> M
>>>>>>>>
>>>>>>>> On Mon, Mar 30, 2015 at 9:09 AM, Subhra Mazumdar <
>>>>>>>> subhramazumd...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> Can MPI_THREAD_MULTIPLE and the openib btl work together in Open MPI
>>>>>>>>> 1.8.4? If so, are there any command-line options needed at run time?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Subhra.
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> users mailing list
>>>>>>>>> us...@open-mpi.org
>>>>>>>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>>>>>> Link to this post:
>>>>>>>>> http://www.open-mpi.org/community/lists/users/2015/03/26574.php
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>>
>>>>>>>> Kind Regards,
>>>>>>>>
>>>>>>>> M.
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> users mailing list
>>>>>>>> us...@open-mpi.org
>>>>>>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>>>>> Link to this post:
>>>>>>>> http://www.open-mpi.org/community/lists/users/2015/03/26575.php
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> users mailing list
>>>>>>> us...@open-mpi.org
>>>>>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>>>> Link to this post:
>>>>>>> http://www.open-mpi.org/community/lists/users/2015/03/26580.php
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>>
>>>>>> Kind Regards,
>>>>>>
>>>>>> M.
>>>>>>
>>>>>> _______________________________________________
>>>>>> users mailing list
>>>>>> us...@open-mpi.org
>>>>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>>> Link to this post:
>>>>>> http://www.open-mpi.org/community/lists/users/2015/03/26584.php
>>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> users mailing list
>>>>> us...@open-mpi.org
>>>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>> Link to this post:
>>>>> http://www.open-mpi.org/community/lists/users/2015/04/26663.php
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>>
>>>> Kind Regards,
>>>>
>>>> M.
>>>>
>>>> _______________________________________________
>>>> users mailing list
>>>> us...@open-mpi.org
>>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>> Link to this post:
>>>> http://www.open-mpi.org/community/lists/users/2015/04/26665.php
>>>>
>>>
>>>
>>> _______________________________________________
>>> users mailing list
>>> us...@open-mpi.org
>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>>> Link to this post:
>>> http://www.open-mpi.org/community/lists/users/2015/04/26686.php
>>>
>>
>>
>>
>> --
>>
>> Kind Regards,
>>
>> M.
>>
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>> Link to this post:
>> http://www.open-mpi.org/community/lists/users/2015/04/26688.php
>>
>
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2015/04/26711.php
>



-- 

Kind Regards,

M.
