**xpmem kernel module.

On Fri, Jun 1, 2018 at 3:16 PM, Joshua Ladd <jladd.m...@gmail.com> wrote:
> Hi, Marcin
>
> Sorry for the late response (somehow this one got lost in the clutter). We
> added support for shmem_ptr in the UCX SPML in Open MPI 3.0. However, in
> order to use it, you must install the Knem kernel module (
> https://github.com/hjelmn/xpmem).
>
> Best,
>
> Josh
>
> On Wed, Apr 18, 2018 at 4:01 AM, marcin.krotkiewski <
> marcin.krotkiew...@gmail.com> wrote:
>
>> Hi,
>>
>> I'm running the below example from the OpenMPI documentation:
>>
>> #include <mpp/shmem.h>
>> #include <stdio.h>
>>
>> main()
>> {
>>     static int bigd[100];
>>     int *ptr;
>>     int i;
>>     shmem_init();
>>     if (shmem_my_pe() == 0) {
>>         /* initialize PE 1’s bigd array */
>>         ptr = shmem_ptr(bigd, 1);
>>         if(!ptr){
>>             fprintf(stderr, "get external pointer failed!\n");
>>             shmem_global_exit(-1);
>>         }
>>         for (i=0; i<100; i++)
>>             *ptr++ = i+1;
>>     }
>>     shmem_barrier_all();
>>     if (shmem_my_pe() == 1) {
>>         printf("bigd on PE 1 is:\n");
>>         for (i=0; i<100; i++)
>>             printf(" %d\n",bigd[i]);
>>         printf("\n");
>>     }
>> }
>>
>> but shmem_ptr always returns NULL for me. I tried with OpenMPI versions
>> from 2.0.1 up to 3.1.0rc4, compiled with HPCX 2.1, running on a ConnectX-4
>> system. This is the command line:
>>
>> $ shmemrun -mca spml ucx -mca spml_base_verbose 100 -np 2 -map-by node -report-bindings ./a.out
>>
>> [c11-1:36505] MCW rank 0 bound to socket 0[core 0[hwt 0-1]]: [BB/../../../../../../../../../../../../../../..][../../../../../../../../../../../../../../../..]
>> [c11-2:105580] MCW rank 1 bound to socket 0[core 0[hwt 0-1]]: [BB/../../../../../../../../../../../../../../..][../../../../../../../../../../../../../../../..]
>> [c11-1:36522] mca: base: components_register: registering framework spml components
>> [c11-1:36522] mca: base: components_register: found loaded component ucx
>> [c11-1:36522] mca: base: components_register: component ucx register function successful
>> [c11-1:36522] mca: base: components_open: opening spml components
>> [c11-1:36522] mca: base: components_open: found loaded component ucx
>> [c11-2:105590] mca: base: components_register: registering framework spml components
>> [c11-2:105590] mca: base: components_register: found loaded component ucx
>> [c11-2:105590] mca: base: components_register: component ucx register function successful
>> [c11-2:105590] mca: base: components_open: opening spml components
>> [c11-2:105590] mca: base: components_open: found loaded component ucx
>> [c11-1:36522] mca: base: components_open: component ucx open function successful
>> [c11-2:105590] mca: base: components_open: component ucx open function successful
>> [c11-1:36522] base/spml_base_select.c:107 - mca_spml_base_select() select: initializing spml component ucx
>> [c11-1:36522] spml_ucx_component.c:173 - mca_spml_ucx_component_init() in ucx, my priority is 21
>> [c11-2:105590] base/spml_base_select.c:107 - mca_spml_base_select() select: initializing spml component ucx
>> [c11-2:105590] spml_ucx_component.c:173 - mca_spml_ucx_component_init() in ucx, my priority is 21
>> [c11-1:36522] spml_ucx_component.c:184 - mca_spml_ucx_component_init() *** ucx initialized ****
>> [c11-1:36522] base/spml_base_select.c:119 - mca_spml_base_select() select: init returned priority 21
>> [c11-1:36522] base/spml_base_select.c:160 - mca_spml_base_select() selected ucx best priority 21
>> [c11-1:36522] base/spml_base_select.c:194 - mca_spml_base_select() select: component ucx selected
>> [c11-1:36522] spml_ucx.c:82 - mca_spml_ucx_enable() *** ucx ENABLED ****
>> [c11-2:105590] spml_ucx_component.c:184 - mca_spml_ucx_component_init() *** ucx initialized ****
>> [c11-2:105590] base/spml_base_select.c:119 - mca_spml_base_select() select: init returned priority 21
>> [c11-2:105590] base/spml_base_select.c:160 - mca_spml_base_select() selected ucx best priority 21
>> [c11-2:105590] base/spml_base_select.c:194 - mca_spml_base_select() select: component ucx selected
>> [c11-2:105590] spml_ucx.c:82 - mca_spml_ucx_enable() *** ucx ENABLED ****
>> [c11-1:36522] spml_ucx.c:305 - mca_spml_ucx_add_procs() *** ADDED PROCS ***
>> [c11-2:105590] spml_ucx.c:305 - mca_spml_ucx_add_procs() *** ADDED PROCS ***
>> shared_mr flags are not supported
>> shared_mr flags are not supported
>> get external pointer failed!
>>
>>
>> So it looks like everything is fine - maybe except the 'shared_mr flags
>> are not supported' message.
>>
>> Does anyone have ideas why I get NULL? The same happens if I start two
>> ranks on the same compute node, and if I use shmem_malloc'ed pointer
>> instead of a static array.
>>
>> Thank you,
>>
>> Marcin
>>
>> _______________________________________________
>> users mailing list
>> users@lists.open-mpi.org
>> https://lists.open-mpi.org/mailman/listinfo/users