Nope - works that way too (running from rhc001):

$ mpirun -H rhc002:24 --map-by ppr:1:socket:pe=1 date
Wed Dec 20 02:08:18 PST 2017
Wed Dec 20 02:08:18 PST 2017
$

> On Dec 20, 2017, at 1:51 AM, Siegmar Gross
> <siegmar.gr...@informatik.hs-fulda.de> wrote:
>
> Hi Ralph,
>
> the problem occurs when you add --host with a different machine. Without
> --host or with "--host <same machine>" everything works well.
>
> pc03 fd1026 111 which mpiexec
> /usr/local/openmpi-3.0.0_64_gcc/bin/mpiexec
>
> pc03 fd1026 112 mpiexec -np 2 --map-by ppr:1:socket:pe=1 date
> [pc03:09373] SETTING BINDING TO CORE
> Wed Dec 20 10:44:21 CET 2017
> Wed Dec 20 10:44:21 CET 2017
>
> pc03 fd1026 113 mpiexec -np 2 --host pc03:2 --map-by ppr:1:socket:pe=1 date
> [pc03:09385] SETTING BINDING TO CORE
> Wed Dec 20 10:44:43 CET 2017
> Wed Dec 20 10:44:43 CET 2017
>
> pc03 fd1026 114 mpiexec -np 2 --host pc02:2 --map-by ppr:1:socket:pe=1 date
> [pc03:09395] SETTING BINDING TO CORE
> [pc02:08340] SETTING BINDING TO CORE
> --------------------------------------------------------------------------
> The request to bind processes could not be completed due to
> an internal error - the locale of the following process was
> not set by the mapper code:
> ...
>
>
> Kind regards
>
> Siegmar
>
>
> On 12/20/17 09:22, r...@open-mpi.org wrote:
>> I just checked the head of both the master and 3.0.x branches, and they both
>> work fine:
>>
>> $ mpirun --map-by ppr:1:socket:pe=1 date
>> [rhc001:139231] SETTING BINDING TO CORE
>> [rhc002.cluster:203672] SETTING BINDING TO CORE
>> Wed Dec 20 00:20:55 PST 2017
>> Wed Dec 20 00:20:55 PST 2017
>> Tue Dec 19 18:37:03 PST 2017
>> Tue Dec 19 18:37:03 PST 2017
>> $
>>
>> I’ll remove the debug, but it looks like this was already fixed.
>>
>>> On Dec 19, 2017, at 10:49 PM, Siegmar Gross
>>> <siegmar.gr...@informatik.hs-fulda.de> wrote:
>>>
>>> Hi,
>>>
>>> I've installed openmpi-v3.0.0 on my "SUSE Linux Enterprise Server 12.3
>>> (x86_64)" with gcc-6.4.0. Today I discovered that I get an error for
>>> --map-by that I don't get with older versions.
>>>
>>>
>>> loki fd1026 115 which mpiexec
>>> /usr/local/openmpi-2.0.3_64_gcc/bin/mpiexec
>>> loki fd1026 116 mpiexec --host pc02:2,pc03:2 --map-by ppr:1:socket:pe=1 date
>>> Wed Dec 20 07:41:00 CET 2017
>>> ,...
>>>
>>> loki fd1026 107 which mpiexec
>>> /usr/local/openmpi-2.1.2_64_gcc/bin/mpiexec
>>> loki fd1026 108 mpiexec --host pc02:2,pc03:2 --map-by ppr:1:socket:pe=1 date
>>> Wed Dec 20 07:41:27 CET 2017
>>> ...
>>>
>>> loki fd1026 107 which mpiexec
>>> /usr/local/openmpi-3.0.0_64_gcc/bin/mpiexec
>>> loki fd1026 108 mpiexec --host pc02:2,pc03:2 --map-by ppr:1:socket:pe=1 date
>>> [loki:32662] SETTING BINDING TO CORE
>>> [pc02:04420] SETTING BINDING TO CORE
>>> [pc03:04788] SETTING BINDING TO CORE
>>> --------------------------------------------------------------------------
>>> The request to bind processes could not be completed due to
>>> an internal error - the locale of the following process was
>>> not set by the mapper code:
>>>
>>> Process: [[57386,1],3]
>>>
>>> Please contact the OMPI developers for assistance. Meantime,
>>> you will still be able to run your application without binding
>>> by specifying "--bind-to none" on your command line.
>>> --------------------------------------------------------------------------
>>> --------------------------------------------------------------------------
>>> ORTE has lost communication with a remote daemon.
>>>
>>> HNP daemon : [[57386,0],0] on node loki
>>> Remote daemon: [[57386,0],2] on node pc03
>>>
>>> This is usually due to either a failure of the TCP network
>>> connection to the node, or possibly an internal failure of
>>> the daemon itself. We cannot recover from this failure, and
>>> therefore will terminate the job.
>>> --------------------------------------------------------------------------
>>> [loki:32662] 1 more process has sent help message help-orte-rmaps-base.txt / rmaps:no-locale
>>> [loki:32662] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
>>> loki fd1026 109
>>>
>>>
>>>
>>> I would be grateful, if somebody can fix the problem. Do you need anything
>>> else? Thank you very much for any help in advance.
>>>
>>>
>>> Kind regards
>>>
>>> Siegmar
>>> _______________________________________________
>>> users mailing list
>>> users@lists.open-mpi.org
>>> https://lists.open-mpi.org/mailman/listinfo/users