Hmmm...afraid there isn't much I can offer here, Siegmar. For whatever reason, hwloc is indicating it cannot bind processes on that architecture.
On Feb 9, 2014, at 12:08 PM, Siegmar Gross <siegmar.gr...@informatik.hs-fulda.de> wrote:

> Hi Ralph,
>
> thank you very much for your reply. I have changed my rankfile.
>
> rank 0=rs0 slot=0:0-1
> rank 1=rs0 slot=1
> rank 2=rs1 slot=0
> rank 3=rs1 slot=1
>
> Now I get the following output.
>
> rs0 openmpi_1.7.x_or_newer 108 mpiexec --report-bindings \
>   --use-hwthread-cpus -np 4 -rf rf_rs0_rs1 hostname
> --------------------------------------------------------------------------
> Open MPI tried to bind a new process, but something went wrong. The
> process was killed without launching the target application. Your job
> will now abort.
>
>   Local host:        rs0
>   Application name:  /usr/local/bin/hostname
>   Error message:     hwloc indicates cpu binding cannot be enforced
>   Location:          ../../../../../openmpi-1.7.4/orte/mca/odls/default/odls_default_module.c:499
> --------------------------------------------------------------------------
> rs0 openmpi_1.7.x_or_newer 109
>
> Kind regards
>
> Siegmar
>
>>> today I tested rankfiles once more. The good news first: openmpi-1.7.4
>>> now supports my Sun M4000 server with Sparc VII processors on the
>>> command line.
>>>
>>> rs0 openmpi_1.7.x_or_newer 104 mpiexec --report-bindings -np 4 \
>>>   --bind-to hwthread hostname
>>> [rs0.informatik.hs-fulda.de:06051] MCW rank 1 bound to
>>>   socket 0[core 1[hwt 0]]: [../B./../..][../../../..]
>>> [rs0.informatik.hs-fulda.de:06051] MCW rank 2 bound to
>>>   socket 1[core 4[hwt 0]]: [../../../..][B./../../..]
>>> [rs0.informatik.hs-fulda.de:06051] MCW rank 3 bound to
>>>   socket 1[core 5[hwt 0]]: [../../../..][../B./../..]
>>> [rs0.informatik.hs-fulda.de:06051] MCW rank 0 bound to
>>>   socket 0[core 0[hwt 0]]: [B./../../..][../../../..]
>>> rs0.informatik.hs-fulda.de
>>> rs0.informatik.hs-fulda.de
>>> rs0.informatik.hs-fulda.de
>>> rs0.informatik.hs-fulda.de
>>> rs0 openmpi_1.7.x_or_newer 105
>>>
>>> Thank you very much for solving this problem.
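[For reference, not part of the original thread: in Open MPI's rankfile format, `slot=socket:core-range` must stay within the cores the node's topology actually reports. On a machine like rs0, whose topology dumps further down show four cores per socket (each with two hardware threads), a rankfile confined to existing cores might look like the sketch below — hostnames and core counts are taken from this thread.]

```
# Sketch of a rankfile that stays within rs0's reported topology
# (four cores per socket, so per-socket core indices run 0-3):
rank 0=rs0 slot=0:0-3    # socket 0, all four of its cores
rank 1=rs0 slot=1:0-3    # socket 1, all four of its cores
rank 2=rs1 slot=0
rank 3=rs1 slot=1
```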
>>> Unfortunately I still have a problem with a rankfile. Contents of
>>> my rankfile:
>>>
>>> rank 0=rs0 slot=0:0-7
>>> rank 1=rs0 slot=1
>>> rank 2=rs1 slot=0
>>> rank 3=rs1 slot=1
>>
>> Here's your problem - you told us socket 0, cores 0-7. However, if
>> you look at your topology, you only have *4* cores in socket 0.
>>
>>> rs0 openmpi_1.7.x_or_newer 105 mpiexec --report-bindings \
>>>   --use-hwthread-cpus -np 4 -rf rf_rs0_rs1 hostname
>>> [rs0.informatik.hs-fulda.de:06060] [[7659,0],0] ORTE_ERROR_LOG: Not found
>>>   in file .../openmpi-1.7.4/orte/mca/rmaps/rank_file/rmaps_rank_file.c
>>>   at line 283
>>> [rs0.informatik.hs-fulda.de:06060] [[7659,0],0] ORTE_ERROR_LOG: Not found
>>>   in file .../openmpi-1.7.4/orte/mca/rmaps/base/rmaps_base_map_job.c
>>>   at line 284
>>> rs0 openmpi_1.7.x_or_newer 106
>>>
>>> rs0 openmpi_1.7.x_or_newer 110 mpiexec --report-bindings \
>>>   --display-allocation --mca rmaps_base_verbose_100 \
>>>   --use-hwthread-cpus -np 4 -rf rf_rs0_rs1 hostname
>>>
>>> ======================   ALLOCATED NODES   ======================
>>> rs0: slots=2 max_slots=0 slots_inuse=0
>>> rs1: slots=2 max_slots=0 slots_inuse=0
>>> =================================================================
>>> [rs0.informatik.hs-fulda.de:06074] [[7677,0],0] ORTE_ERROR_LOG: Not found
>>>   in file ../../../../../openmpi-1.7.4/orte/mca/rmaps/rank_file/rmaps_rank_file.c
>>>   at line 283
>>> [rs0.informatik.hs-fulda.de:06074] [[7677,0],0] ORTE_ERROR_LOG: Not found
>>>   in file ../../../../openmpi-1.7.4/orte/mca/rmaps/base/rmaps_base_map_job.c
>>>   at line 284
>>> rs0 openmpi_1.7.x_or_newer 111
>>>
>>> rs0 openmpi_1.7.x_or_newer 111 mpiexec --report-bindings \
>>>   --display-allocation --mca ess_base_verbose 5 --use-hwthread-cpus \
>>>   -np 4 -rf rf_rs0_rs1 hostname
>>> [rs0.informatik.hs-fulda.de:06078] mca:base:select:( ess) Querying component [env]
>>> [rs0.informatik.hs-fulda.de:06078] mca:base:select:( ess) Skipping component [env].
Query failed to return a module >>> [rs0.informatik.hs-fulda.de:06078] mca:base:select:( ess) Querying > component [hnp] >>> [rs0.informatik.hs-fulda.de:06078] mca:base:select:( ess) Query of > component [hnp] set priority to 100 >>> [rs0.informatik.hs-fulda.de:06078] mca:base:select:( ess) Querying > component [singleton] >>> [rs0.informatik.hs-fulda.de:06078] mca:base:select:( ess) Skipping > component [singleton]. Query failed to return a module >>> [rs0.informatik.hs-fulda.de:06078] mca:base:select:( ess) Querying > component [tool] >>> [rs0.informatik.hs-fulda.de:06078] mca:base:select:( ess) Skipping > component [tool]. Query failed to return a module >>> [rs0.informatik.hs-fulda.de:06078] mca:base:select:( ess) Selected > component [hnp] >>> [rs0.informatik.hs-fulda.de:06078] [[INVALID],INVALID] Topology Info: >>> [rs0.informatik.hs-fulda.de:06078] Type: Machine Number of child objects: 1 >>> Name=NULL >>> total=33554432KB >>> Backend=Solaris >>> OSName=SunOS >>> OSRelease=5.10 >>> OSVersion=Generic_150400-04 >>> Architecture=sun4u >>> Cpuset: 0x0000ffff >>> Online: 0x0000ffff >>> Allowed: 0x0000ffff >>> Bind CPU proc: TRUE >>> Bind CPU thread: TRUE >>> Bind MEM proc: TRUE >>> Bind MEM thread: TRUE >>> Type: NUMANode Number of child objects: 2 >>> Name=NULL >>> local=33554432KB >>> total=33554432KB >>> Cpuset: 0x0000ffff >>> Online: 0x0000ffff >>> Allowed: 0x0000ffff >>> Type: Socket Number of child objects: 4 >>> Name=NULL >>> CPUType=sparcv9 >>> CPUModel=SPARC64_VII >>> Cpuset: 0x000000ff >>> Online: 0x000000ff >>> Allowed: 0x000000ff >>> Type: Core Number of child objects: 2 >>> Name=NULL >>> Cpuset: 0x00000003 >>> Online: 0x00000003 >>> Allowed: 0x00000003 >>> Type: PU Number of child objects: 0 >>> Name=NULL >>> Cpuset: 0x00000001 >>> Online: 0x00000001 >>> Allowed: 0x00000001 >>> Type: PU Number of child objects: 0 >>> Name=NULL >>> Cpuset: 0x00000002 >>> Online: 0x00000002 >>> Allowed: 0x00000002 >>> Type: Core Number of child objects: 2 >>> Name=NULL 
>>> Cpuset: 0x0000000c >>> Online: 0x0000000c >>> Allowed: 0x0000000c >>> Type: PU Number of child objects: 0 >>> Name=NULL >>> Cpuset: 0x00000004 >>> Online: 0x00000004 >>> Allowed: 0x00000004 >>> Type: PU Number of child objects: 0 >>> Name=NULL >>> Cpuset: 0x00000008 >>> Online: 0x00000008 >>> Allowed: 0x00000008 >>> Type: Core Number of child objects: 2 >>> Name=NULL >>> Cpuset: 0x00000030 >>> Online: 0x00000030 >>> Allowed: 0x00000030 >>> Type: PU Number of child objects: 0 >>> Name=NULL >>> Cpuset: 0x00000010 >>> Online: 0x00000010 >>> Allowed: 0x00000010 >>> Type: PU Number of child objects: 0 >>> Name=NULL >>> Cpuset: 0x00000020 >>> Online: 0x00000020 >>> Allowed: 0x00000020 >>> Type: Core Number of child objects: 2 >>> Name=NULL >>> Cpuset: 0x000000c0 >>> Online: 0x000000c0 >>> Allowed: 0x000000c0 >>> Type: PU Number of child objects: 0 >>> Name=NULL >>> Cpuset: 0x00000040 >>> Online: 0x00000040 >>> Allowed: 0x00000040 >>> Type: PU Number of child objects: 0 >>> Name=NULL >>> Cpuset: 0x00000080 >>> Online: 0x00000080 >>> Allowed: 0x00000080 >>> Type: Socket Number of child objects: 4 >>> Name=NULL >>> CPUType=sparcv9 >>> CPUModel=SPARC64_VII >>> Cpuset: 0x0000ff00 >>> Online: 0x0000ff00 >>> Allowed: 0x0000ff00 >>> Type: Core Number of child objects: 2 >>> Name=NULL >>> Cpuset: 0x00000300 >>> Online: 0x00000300 >>> Allowed: 0x00000300 >>> Type: PU Number of child objects: 0 >>> Name=NULL >>> Cpuset: 0x00000100 >>> Online: 0x00000100 >>> Allowed: 0x00000100 >>> Type: PU Number of child objects: 0 >>> Name=NULL >>> Cpuset: 0x00000200 >>> Online: 0x00000200 >>> Allowed: 0x00000200 >>> Type: Core Number of child objects: 2 >>> Name=NULL >>> Cpuset: 0x00000c00 >>> Online: 0x00000c00 >>> Allowed: 0x00000c00 >>> Type: PU Number of child objects: 0 >>> Name=NULL >>> Cpuset: 0x00000400 >>> Online: 0x00000400 >>> Allowed: 0x00000400 >>> Type: PU Number of child objects: 0 >>> Name=NULL >>> Cpuset: 0x00000800 >>> Online: 0x00000800 >>> Allowed: 0x00000800 >>> Type: 
Core Number of child objects: 2 >>> Name=NULL >>> Cpuset: 0x00003000 >>> Online: 0x00003000 >>> Allowed: 0x00003000 >>> Type: PU Number of child objects: 0 >>> Name=NULL >>> Cpuset: 0x00001000 >>> Online: 0x00001000 >>> Allowed: 0x00001000 >>> Type: PU Number of child objects: 0 >>> Name=NULL >>> Cpuset: 0x00002000 >>> Online: 0x00002000 >>> Allowed: 0x00002000 >>> Type: Core Number of child objects: 2 >>> Name=NULL >>> Cpuset: 0x0000c000 >>> Online: 0x0000c000 >>> Allowed: 0x0000c000 >>> Type: PU Number of child objects: 0 >>> Name=NULL >>> Cpuset: 0x00004000 >>> Online: 0x00004000 >>> Allowed: 0x00004000 >>> Type: PU Number of child objects: 0 >>> Name=NULL >>> Cpuset: 0x00008000 >>> Online: 0x00008000 >>> Allowed: 0x00008000 >>> [rs1.informatik.hs-fulda.de:09657] mca:base:select:( ess) Querying > component [env] >>> [rs1.informatik.hs-fulda.de:09657] mca:base:select:( ess) Query of > component [env] set priority to 20 >>> [rs1.informatik.hs-fulda.de:09657] mca:base:select:( ess) Selected > component [env] >>> [rs1.informatik.hs-fulda.de:09657] ess:env set name to [[7673,0],1] >>> [rs1.informatik.hs-fulda.de:09657] [[7673,0],1] Topology Info: >>> [rs1.informatik.hs-fulda.de:09657] Type: Machine Number of child objects: 1 >>> Name=NULL >>> total=33554432KB >>> Backend=Solaris >>> OSName=SunOS >>> OSRelease=5.10 >>> OSVersion=Generic_150400-04 >>> Architecture=sun4u >>> Cpuset: 0x0000ffff >>> Online: 0x0000ffff >>> Allowed: 0x0000ffff >>> Bind CPU proc: TRUE >>> Bind CPU thread: TRUE >>> Bind MEM proc: TRUE >>> Bind MEM thread: TRUE >>> Type: NUMANode Number of child objects: 2 >>> Name=NULL >>> local=33554432KB >>> total=33554432KB >>> Cpuset: 0x0000ffff >>> Online: 0x0000ffff >>> Allowed: 0x0000ffff >>> Type: Socket Number of child objects: 4 >>> Name=NULL >>> CPUType=sparcv9 >>> CPUModel=SPARC64_VII >>> Cpuset: 0x000000ff >>> Online: 0x000000ff >>> Allowed: 0x000000ff >>> Type: Core Number of child objects: 2 >>> Name=NULL >>> Cpuset: 0x00000003 >>> Online: 
0x00000003 >>> Allowed: 0x00000003 >>> Type: PU Number of child objects: 0 >>> Name=NULL >>> Cpuset: 0x00000001 >>> Online: 0x00000001 >>> Allowed: 0x00000001 >>> Type: PU Number of child objects: 0 >>> Name=NULL >>> Cpuset: 0x00000002 >>> Online: 0x00000002 >>> Allowed: 0x00000002 >>> Type: Core Number of child objects: 2 >>> Name=NULL >>> Cpuset: 0x0000000c >>> Online: 0x0000000c >>> Allowed: 0x0000000c >>> Type: PU Number of child objects: 0 >>> Name=NULL >>> Cpuset: 0x00000004 >>> Online: 0x00000004 >>> Allowed: 0x00000004 >>> Type: PU Number of child objects: 0 >>> Name=NULL >>> Cpuset: 0x00000008 >>> Online: 0x00000008 >>> Allowed: 0x00000008 >>> Type: Core Number of child objects: 2 >>> Name=NULL >>> Cpuset: 0x00000030 >>> Online: 0x00000030 >>> Allowed: 0x00000030 >>> Type: PU Number of child objects: 0 >>> Name=NULL >>> Cpuset: 0x00000010 >>> Online: 0x00000010 >>> Allowed: 0x00000010 >>> Type: PU Number of child objects: 0 >>> Name=NULL >>> Cpuset: 0x00000020 >>> Online: 0x00000020 >>> Allowed: 0x00000020 >>> Type: Core Number of child objects: 2 >>> Name=NULL >>> Cpuset: 0x000000c0 >>> Online: 0x000000c0 >>> Allowed: 0x000000c0 >>> Type: PU Number of child objects: 0 >>> Name=NULL >>> Cpuset: 0x00000040 >>> Online: 0x00000040 >>> Allowed: 0x00000040 >>> Type: PU Number of child objects: 0 >>> Name=NULL >>> Cpuset: 0x00000080 >>> Online: 0x00000080 >>> Allowed: 0x00000080 >>> Type: Socket Number of child objects: 4 >>> Name=NULL >>> CPUType=sparcv9 >>> CPUModel=SPARC64_VII >>> Cpuset: 0x0000ff00 >>> Online: 0x0000ff00 >>> Allowed: 0x0000ff00 >>> Type: Core Number of child objects: 2 >>> Name=NULL >>> Cpuset: 0x00000300 >>> Online: 0x00000300 >>> Allowed: 0x00000300 >>> Type: PU Number of child objects: 0 >>> Name=NULL >>> Cpuset: 0x00000100 >>> Online: 0x00000100 >>> Allowed: 0x00000100 >>> Type: PU Number of child objects: 0 >>> Name=NULL >>> Cpuset: 0x00000200 >>> Online: 0x00000200 >>> Allowed: 0x00000200 >>> Type: Core Number of child objects: 2 >>> 
Name=NULL >>> Cpuset: 0x00000c00 >>> Online: 0x00000c00 >>> Allowed: 0x00000c00 >>> Type: PU Number of child objects: 0 >>> Name=NULL >>> Cpuset: 0x00000400 >>> Online: 0x00000400 >>> Allowed: 0x00000400 >>> Type: PU Number of child objects: 0 >>> Name=NULL >>> Cpuset: 0x00000800 >>> Online: 0x00000800 >>> Allowed: 0x00000800 >>> Type: Core Number of child objects: 2 >>> Name=NULL >>> Cpuset: 0x00003000 >>> Online: 0x00003000 >>> Allowed: 0x00003000 >>> Type: PU Number of child objects: 0 >>> Name=NULL >>> Cpuset: 0x00001000 >>> Online: 0x00001000 >>> Allowed: 0x00001000 >>> Type: PU Number of child objects: 0 >>> Name=NULL >>> Cpuset: 0x00002000 >>> Online: 0x00002000 >>> Allowed: 0x00002000 >>> Type: Core Number of child objects: 2 >>> Name=NULL >>> Cpuset: 0x0000c000 >>> Online: 0x0000c000 >>> Allowed: 0x0000c000 >>> Type: PU Number of child objects: 0 >>> Name=NULL >>> Cpuset: 0x00004000 >>> Online: 0x00004000 >>> Allowed: 0x00004000 >>> Type: PU Number of child objects: 0 >>> Name=NULL >>> Cpuset: 0x00008000 >>> Online: 0x00008000 >>> Allowed: 0x00008000 >>> >>> ====================== ALLOCATED NODES ====================== >>> rs0: slots=2 max_slots=0 slots_inuse=0 >>> rs1: slots=2 max_slots=0 slots_inuse=0 >>> ================================================================= >>> [rs0.informatik.hs-fulda.de:06078] [[7673,0],0] ORTE_ERROR_LOG: Not found >>> in > file >>> ../../../../../openmpi-1.7.4/orte/mca/rmaps/rank_file/rmaps_rank_file.c at > line 283 >>> [rs0.informatik.hs-fulda.de:06078] [[7673,0],0] ORTE_ERROR_LOG: Not found >>> in > file >>> ../../../../openmpi-1.7.4/orte/mca/rmaps/base/rmaps_base_map_job.c at line > 284 >>> [rs1.informatik.hs-fulda.de:09657] [[7673,0],1] setting up session dir with >>> tmpdir: UNDEF >>> host rs1 >>> rs0 openmpi_1.7.x_or_newer 112 >>> >>> >>> >>> >>> rs0 openmpi_1.7.x_or_newer 113 mpiexec --report-bindings > --display-allocation --mca plm_base_verbose 100 --use-hwthread-cpus >>> -np 4 -rf rf_rs0_rs1 hostname >>> 
[rs0.informatik.hs-fulda.de:06088] mca: base: components_register: > registering plm components >>> [rs0.informatik.hs-fulda.de:06088] mca: base: components_register: found > loaded component rsh >>> [rs0.informatik.hs-fulda.de:06088] mca: base: components_register: >>> component > rsh register function successful >>> [rs0.informatik.hs-fulda.de:06088] mca: base: components_open: opening plm > components >>> [rs0.informatik.hs-fulda.de:06088] mca: base: components_open: found loaded > component rsh >>> [rs0.informatik.hs-fulda.de:06088] mca: base: components_open: component >>> rsh > open function successful >>> [rs0.informatik.hs-fulda.de:06088] mca:base:select: Auto-selecting plm > components >>> [rs0.informatik.hs-fulda.de:06088] mca:base:select:( plm) Querying > component [rsh] >>> [rs0.informatik.hs-fulda.de:06088] [[INVALID],INVALID] plm:rsh_lookup on > agent ssh : rsh path NULL >>> [rs0.informatik.hs-fulda.de:06088] mca:base:select:( plm) Query of > component [rsh] set priority to 10 >>> [rs0.informatik.hs-fulda.de:06088] mca:base:select:( plm) Selected > component [rsh] >>> [rs0.informatik.hs-fulda.de:06088] plm:base:set_hnp_name: initial bias 6088 > nodename hash 3909477186 >>> [rs0.informatik.hs-fulda.de:06088] plm:base:set_hnp_name: final jobfam 7567 >>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] plm:rsh_setup on agent ssh >>> : > rsh path NULL >>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] plm:base:receive start comm >>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] plm:base:setup_job >>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] plm:base:setup_vm >>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] plm:base:setup_vm creating > map >>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] setup:vm: working unmanaged > allocation >>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] using rankfile rf_rs0_rs1 >>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] checking node rs0 >>> >>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] ignoring 
myself >>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] checking node rs1 >>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] plm:base:setup_vm add new > daemon [[7567,0],1] >>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] plm:base:setup_vm assigning > new daemon [[7567,0],1] to node rs1 >>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] plm:rsh: launching vm >>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] plm:rsh: local shell: 2 > (tcsh) >>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] plm:rsh: assuming same > remote shell as local shell >>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] plm:rsh: remote shell: 2 > (tcsh) >>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] plm:rsh: final template > argv: >>> /usr/local/bin/ssh <template> orted -mca orte_report_bindings 1 -mca > ess env -mca orte_ess_jobid 495910912 -mca >>> orte_ess_vpid <template> -mca orte_ess_num_procs 2 -mca orte_hnp_uri >>> "495910912.0;tcp://193.174.26.198,192.168.128.1,10.1.1.2:43810" >>> --tree-spawn > --mca plm_base_verbose 100 -mca plm rsh -mca >>> orte_rankfile rf_rs0_rs1 -mca hwloc_base_use_hwthreads_as_cpus 1 -mca > orte_display_alloc 1 -mca hwloc_base_report_bindings 1 >>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] plm:rsh:launch daemon 0 not > a child of mine >>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] plm:rsh: adding node rs1 to > launch list >>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] plm:rsh: activating launch > event >>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] plm:rsh: recording launch >>> of > daemon [[7567,0],1] >>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] plm:rsh: executing: > (/usr/local/bin/ssh) [/usr/local/bin/ssh rs1 orted -mca >>> orte_report_bindings 1 -mca ess env -mca orte_ess_jobid 495910912 -mca > orte_ess_vpid 1 -mca orte_ess_num_procs 2 -mca >>> orte_hnp_uri >>> "495910912.0;tcp://193.174.26.198,192.168.128.1,10.1.1.2:43810" > --tree-spawn --mca plm_base_verbose 100 -mca plm >>> rsh -mca orte_rankfile rf_rs0_rs1 
-mca hwloc_base_use_hwthreads_as_cpus 1 > -mca orte_display_alloc 1 -mca >>> hwloc_base_report_bindings 1] >>> Warning: untrusted X11 forwarding setup failed: xauth key data not generated >>> Warning: No xauth data; using fake authentication data for X11 forwarding. >>> [rs1.informatik.hs-fulda.de:09721] mca: base: components_register: > registering plm components >>> [rs1.informatik.hs-fulda.de:09721] mca: base: components_register: found > loaded component rsh >>> [rs1.informatik.hs-fulda.de:09721] mca: base: components_register: >>> component > rsh register function successful >>> [rs1.informatik.hs-fulda.de:09721] mca: base: components_open: opening plm > components >>> [rs1.informatik.hs-fulda.de:09721] mca: base: components_open: found loaded > component rsh >>> [rs1.informatik.hs-fulda.de:09721] mca: base: components_open: component >>> rsh > open function successful >>> [rs1.informatik.hs-fulda.de:09721] mca:base:select: Auto-selecting plm > components >>> [rs1.informatik.hs-fulda.de:09721] mca:base:select:( plm) Querying > component [rsh] >>> [rs1.informatik.hs-fulda.de:09721] [[7567,0],1] plm:rsh_lookup on agent ssh > : rsh path NULL >>> [rs1.informatik.hs-fulda.de:09721] mca:base:select:( plm) Query of > component [rsh] set priority to 10 >>> [rs1.informatik.hs-fulda.de:09721] mca:base:select:( plm) Selected > component [rsh] >>> [rs1.informatik.hs-fulda.de:09721] [[7567,0],1] plm:rsh_setup on agent ssh >>> : > rsh path NULL >>> [rs1.informatik.hs-fulda.de:09721] [[7567,0],1] plm:base:receive start comm >>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] >>> plm:base:orted_report_launch > from daemon [[7567,0],1] >>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] >>> plm:base:orted_report_launch > from daemon [[7567,0],1] on node rs1 >>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] RECEIVED TOPOLOGY FROM NODE > rs1 >>> [rs0.informatik.hs-fulda.de:06088] Type: Machine Number of child objects: 1 >>> Name=NULL >>> total=33554432KB >>> Backend=Solaris 
>>> OSName=SunOS >>> OSRelease=5.10 >>> OSVersion=Generic_150400-04 >>> Architecture=sun4u >>> Cpuset: 0x0000ffff >>> Online: 0x0000ffff >>> Allowed: 0x0000ffff >>> Bind CPU proc: TRUE >>> Bind CPU thread: TRUE >>> Bind MEM proc: TRUE >>> Bind MEM thread: TRUE >>> Type: NUMANode Number of child objects: 2 >>> Name=NULL >>> local=33554432KB >>> total=33554432KB >>> Cpuset: 0x0000ffff >>> Online: 0x0000ffff >>> Allowed: 0x0000ffff >>> Type: Socket Number of child objects: 4 >>> Name=NULL >>> CPUType=sparcv9 >>> CPUModel=SPARC64_VII >>> Cpuset: 0x000000ff >>> Online: 0x000000ff >>> Allowed: 0x000000ff >>> Type: Core Number of child objects: 2 >>> Name=NULL >>> Cpuset: 0x00000003 >>> Online: 0x00000003 >>> Allowed: 0x00000003 >>> Type: PU Number of child objects: 0 >>> Name=NULL >>> Cpuset: 0x00000001 >>> Online: 0x00000001 >>> Allowed: 0x00000001 >>> Type: PU Number of child objects: 0 >>> Name=NULL >>> Cpuset: 0x00000002 >>> Online: 0x00000002 >>> Allowed: 0x00000002 >>> Type: Core Number of child objects: 2 >>> Name=NULL >>> Cpuset: 0x0000000c >>> Online: 0x0000000c >>> Allowed: 0x0000000c >>> Type: PU Number of child objects: 0 >>> Name=NULL >>> Cpuset: 0x00000004 >>> Online: 0x00000004 >>> Allowed: 0x00000004 >>> Type: PU Number of child objects: 0 >>> Name=NULL >>> Cpuset: 0x00000008 >>> Online: 0x00000008 >>> Allowed: 0x00000008 >>> Type: Core Number of child objects: 2 >>> Name=NULL >>> Cpuset: 0x00000030 >>> Online: 0x00000030 >>> Allowed: 0x00000030 >>> Type: PU Number of child objects: 0 >>> Name=NULL >>> Cpuset: 0x00000010 >>> Online: 0x00000010 >>> Allowed: 0x00000010 >>> Type: PU Number of child objects: 0 >>> Name=NULL >>> Cpuset: 0x00000020 >>> Online: 0x00000020 >>> Allowed: 0x00000020 >>> Type: Core Number of child objects: 2 >>> Name=NULL >>> Cpuset: 0x000000c0 >>> Online: 0x000000c0 >>> Allowed: 0x000000c0 >>> Type: PU Number of child objects: 0 >>> Name=NULL >>> Cpuset: 0x00000040 >>> Online: 0x00000040 >>> Allowed: 0x00000040 >>> Type: PU Number 
of child objects: 0 >>> Name=NULL >>> Cpuset: 0x00000080 >>> Online: 0x00000080 >>> Allowed: 0x00000080 >>> Type: Socket Number of child objects: 4 >>> Name=NULL >>> CPUType=sparcv9 >>> CPUModel=SPARC64_VII >>> Cpuset: 0x0000ff00 >>> Online: 0x0000ff00 >>> Allowed: 0x0000ff00 >>> Type: Core Number of child objects: 2 >>> Name=NULL >>> Cpuset: 0x00000300 >>> Online: 0x00000300 >>> Allowed: 0x00000300 >>> Type: PU Number of child objects: 0 >>> Name=NULL >>> Cpuset: 0x00000100 >>> Online: 0x00000100 >>> Allowed: 0x00000100 >>> Type: PU Number of child objects: 0 >>> Name=NULL >>> Cpuset: 0x00000200 >>> Online: 0x00000200 >>> Allowed: 0x00000200 >>> Type: Core Number of child objects: 2 >>> Name=NULL >>> Cpuset: 0x00000c00 >>> Online: 0x00000c00 >>> Allowed: 0x00000c00 >>> Type: PU Number of child objects: 0 >>> Name=NULL >>> Cpuset: 0x00000400 >>> Online: 0x00000400 >>> Allowed: 0x00000400 >>> Type: PU Number of child objects: 0 >>> Name=NULL >>> Cpuset: 0x00000800 >>> Online: 0x00000800 >>> Allowed: 0x00000800 >>> Type: Core Number of child objects: 2 >>> Name=NULL >>> Cpuset: 0x00003000 >>> Online: 0x00003000 >>> Allowed: 0x00003000 >>> Type: PU Number of child objects: 0 >>> Name=NULL >>> Cpuset: 0x00001000 >>> Online: 0x00001000 >>> Allowed: 0x00001000 >>> Type: PU Number of child objects: 0 >>> Name=NULL >>> Cpuset: 0x00002000 >>> Online: 0x00002000 >>> Allowed: 0x00002000 >>> Type: Core Number of child objects: 2 >>> Name=NULL >>> Cpuset: 0x0000c000 >>> Online: 0x0000c000 >>> Allowed: 0x0000c000 >>> Type: PU Number of child objects: 0 >>> Name=NULL >>> Cpuset: 0x00004000 >>> Online: 0x00004000 >>> Allowed: 0x00004000 >>> Type: PU Number of child objects: 0 >>> Name=NULL >>> Cpuset: 0x00008000 >>> Online: 0x00008000 >>> Allowed: 0x00008000 >>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] TOPOLOGY MATCHES - > DISCARDING >>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] >>> plm:base:orted_report_launch > completed for daemon [[7567,0],1] at contact >>> 
495910912.1;tcp://193.174.26.199,192.168.128.2,10.1.1.2:37231 >>> >>> ====================== ALLOCATED NODES ====================== >>> rs0: slots=2 max_slots=0 slots_inuse=0 >>> rs1: slots=2 max_slots=0 slots_inuse=0 >>> ================================================================= >>> [rs1.informatik.hs-fulda.de:09721] [[7567,0],1] plm:rsh: remote spawn called >>> [rs1.informatik.hs-fulda.de:09721] [[7567,0],1] plm:rsh: remote spawn - >>> have > no children! >>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] ORTE_ERROR_LOG: Not found >>> in > file >>> ../../../../../openmpi-1.7.4/orte/mca/rmaps/rank_file/rmaps_rank_file.c at > line 283 >>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] ORTE_ERROR_LOG: Not found >>> in > file >>> ../../../../openmpi-1.7.4/orte/mca/rmaps/base/rmaps_base_map_job.c at line > 284 >>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] plm:base:orted_cmd sending > orted_exit commands >>> [rs0.informatik.hs-fulda.de:06088] [[7567,0],0] plm:base:receive stop comm >>> [rs0.informatik.hs-fulda.de:06088] mca: base: close: component rsh closed >>> [rs0.informatik.hs-fulda.de:06088] mca: base: close: unloading component rsh >>> [rs1.informatik.hs-fulda.de:09721] [[7567,0],1] plm:base:receive stop comm >>> [rs1.informatik.hs-fulda.de:09721] mca: base: close: component rsh closed >>> [rs1.informatik.hs-fulda.de:09721] mca: base: close: unloading component rsh >>> rs0 openmpi_1.7.x_or_newer 114 >>> >>> >>> >>> >>> I still have the problem that I get no output if I mix little and >>> big endian machines, which works for openmpi-1.6.x. 
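[An aside for reading the topology dumps above, not part of the original thread: each `Cpuset`/`Online`/`Allowed` value hwloc prints is a hexadecimal bitmask over PU (hardware-thread) indices. A few lines of Python — a hypothetical helper, not part of Open MPI or hwloc — decode them:]

```python
def pus(cpuset_hex):
    """Return the PU (hwthread) indices set in an hwloc cpuset mask."""
    mask = int(cpuset_hex, 16)
    return [i for i in range(mask.bit_length()) if (mask >> i) & 1]

# Socket 0 on rs0 is 0x000000ff: PUs 0-7, i.e. 4 cores x 2 hwthreads --
# which is why a rankfile asking for socket 0, cores 0-7 cannot be mapped.
print(pus("0x000000ff"))   # [0, 1, 2, 3, 4, 5, 6, 7]
print(pus("0x00000003"))   # core 0: [0, 1]
print(pus("0x0000ff00"))   # socket 1: [8, 9, 10, 11, 12, 13, 14, 15]
```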
>>> >>> linpc1 openmpi_1.7.x_or_newer 112 mpiexec -report-bindings -np 4 \ >>> -rf rf_linpc_sunpc_tyr hostname >>> linpc1 openmpi_1.7.x_or_newer 113 >>> >>> >>> >>> linpc1 openmpi_1.7.x_or_newer 188 mpiexec -report-bindings > --display-allocation --mca plm_base_verbose 100 -np 1 -rf >>> rf_linpc_sunpc_tyr hostname >>> [linpc1:20650] mca: base: components_register: registering plm components >>> [linpc1:20650] mca: base: components_register: found loaded component rsh >>> [linpc1:20650] mca: base: components_register: component rsh register > function successful >>> [linpc1:20650] mca: base: components_register: found loaded component slurm >>> [linpc1:20650] mca: base: components_register: component slurm register > function successful >>> [linpc1:20650] mca: base: components_open: opening plm components >>> [linpc1:20650] mca: base: components_open: found loaded component rsh >>> [linpc1:20650] mca: base: components_open: component rsh open function > successful >>> [linpc1:20650] mca: base: components_open: found loaded component slurm >>> [linpc1:20650] mca: base: components_open: component slurm open function > successful >>> [linpc1:20650] mca:base:select: Auto-selecting plm components >>> [linpc1:20650] mca:base:select:( plm) Querying component [rsh] >>> [linpc1:20650] [[INVALID],INVALID] plm:rsh_lookup on agent ssh : rsh path > NULL >>> [linpc1:20650] mca:base:select:( plm) Query of component [rsh] set >>> priority > to 10 >>> [linpc1:20650] mca:base:select:( plm) Querying component [slurm] >>> [linpc1:20650] mca:base:select:( plm) Skipping component [slurm]. 
Query > failed to return a module >>> [linpc1:20650] mca:base:select:( plm) Selected component [rsh] >>> [linpc1:20650] mca: base: close: component slurm closed >>> [linpc1:20650] mca: base: close: unloading component slurm >>> [linpc1:20650] plm:base:set_hnp_name: initial bias 20650 nodename hash > 3902177415 >>> [linpc1:20650] plm:base:set_hnp_name: final jobfam 14523 >>> [linpc1:20650] [[14523,0],0] plm:rsh_setup on agent ssh : rsh path NULL >>> [linpc1:20650] [[14523,0],0] plm:base:receive start comm >>> [linpc1:20650] [[14523,0],0] plm:base:setup_job >>> [linpc1:20650] [[14523,0],0] plm:base:setup_vm >>> [linpc1:20650] [[14523,0],0] plm:base:setup_vm creating map >>> [linpc1:20650] [[14523,0],0] setup:vm: working unmanaged allocation >>> [linpc1:20650] [[14523,0],0] using rankfile rf_linpc_sunpc_tyr >>> [linpc1:20650] [[14523,0],0] checking node linpc0 >>> [linpc1:20650] [[14523,0],0] checking node linpc1 >>> [linpc1:20650] [[14523,0],0] ignoring myself >>> [linpc1:20650] [[14523,0],0] checking node sunpc1 >>> [linpc1:20650] [[14523,0],0] checking node tyr >>> [linpc1:20650] [[14523,0],0] plm:base:setup_vm add new daemon [[14523,0],1] >>> [linpc1:20650] [[14523,0],0] plm:base:setup_vm assigning new daemon > [[14523,0],1] to node linpc0 >>> [linpc1:20650] [[14523,0],0] plm:base:setup_vm add new daemon [[14523,0],2] >>> [linpc1:20650] [[14523,0],0] plm:base:setup_vm assigning new daemon > [[14523,0],2] to node sunpc1 >>> [linpc1:20650] [[14523,0],0] plm:base:setup_vm add new daemon [[14523,0],3] >>> [linpc1:20650] [[14523,0],0] plm:base:setup_vm assigning new daemon > [[14523,0],3] to node tyr >>> [linpc1:20650] [[14523,0],0] plm:rsh: launching vm >>> [linpc1:20650] [[14523,0],0] plm:rsh: local shell: 2 (tcsh) >>> [linpc1:20650] [[14523,0],0] plm:rsh: assuming same remote shell as local > shell >>> [linpc1:20650] [[14523,0],0] plm:rsh: remote shell: 2 (tcsh) >>> [linpc1:20650] [[14523,0],0] plm:rsh: final template argv: >>> /usr/local/bin/ssh <template> orted 
-mca orte_report_bindings 1 -mca > ess env -mca orte_ess_jobid 951779328 -mca >>> orte_ess_vpid <template> -mca orte_ess_num_procs 4 -mca orte_hnp_uri > "951779328.0;tcp://193.174.26.208:46876" --tree-spawn >>> --mca plm_base_verbose 100 -mca plm rsh -mca hwloc_base_report_bindings 1 > -mca orte_display_alloc 1 -mca orte_rankfile >>> rf_linpc_sunpc_tyr >>> [linpc1:20650] [[14523,0],0] plm:rsh:launch daemon 0 not a child of mine >>> [linpc1:20650] [[14523,0],0] plm:rsh: adding node linpc0 to launch list >>> [linpc1:20650] [[14523,0],0] plm:rsh: adding node sunpc1 to launch list >>> [linpc1:20650] [[14523,0],0] plm:rsh:launch daemon 3 not a child of mine >>> [linpc1:20650] [[14523,0],0] plm:rsh: activating launch event >>> [linpc1:20650] [[14523,0],0] plm:rsh: recording launch of daemon > [[14523,0],1] >>> [linpc1:20650] [[14523,0],0] plm:rsh: recording launch of daemon > [[14523,0],2] >>> [linpc1:20650] [[14523,0],0] plm:rsh: executing: (/usr/local/bin/ssh) > [/usr/local/bin/ssh sunpc1 orted -mca >>> orte_report_bindings 1 -mca ess env -mca orte_ess_jobid 951779328 -mca > orte_ess_vpid 2 -mca orte_ess_num_procs 4 -mca >>> orte_hnp_uri "951779328.0;tcp://193.174.26.208:46876" --tree-spawn --mca > plm_base_verbose 100 -mca plm rsh -mca >>> hwloc_base_report_bindings 1 -mca orte_display_alloc 1 -mca orte_rankfile > rf_linpc_sunpc_tyr] >>> [linpc1:20650] [[14523,0],0] plm:rsh: executing: (/usr/local/bin/ssh) > [/usr/local/bin/ssh linpc0 orted -mca >>> orte_report_bindings 1 -mca ess env -mca orte_ess_jobid 951779328 -mca > orte_ess_vpid 1 -mca orte_ess_num_procs 4 -mca >>> orte_hnp_uri "951779328.0;tcp://193.174.26.208:46876" --tree-spawn --mca > plm_base_verbose 100 -mca plm rsh -mca >>> hwloc_base_report_bindings 1 -mca orte_display_alloc 1 -mca orte_rankfile > rf_linpc_sunpc_tyr] >>> Warning: untrusted X11 forwarding setup failed: xauth key data not generated >>> Warning: No xauth data; using fake authentication data for X11 forwarding. 
>>> X11 forwarding request failed on channel 0 >>> Warning: untrusted X11 forwarding setup failed: xauth key data not generated >>> Warning: No xauth data; using fake authentication data for X11 forwarding. >>> [sunpc1:09408] mca: base: components_register: registering plm components >>> [sunpc1:09408] mca: base: components_register: found loaded component rsh >>> [sunpc1:09408] mca: base: components_register: component rsh register > function successful >>> [sunpc1:09408] mca: base: components_open: opening plm components >>> [sunpc1:09408] mca: base: components_open: found loaded component rsh >>> [sunpc1:09408] mca: base: components_open: component rsh open function > successful >>> [sunpc1:09408] mca:base:select: Auto-selecting plm components >>> [sunpc1:09408] mca:base:select:( plm) Querying component [rsh] >>> [sunpc1:09408] [[14523,0],2] plm:rsh_lookup on agent ssh : rsh path NULL >>> [sunpc1:09408] mca:base:select:( plm) Query of component [rsh] set >>> priority > to 10 >>> [sunpc1:09408] mca:base:select:( plm) Selected component [rsh] >>> [sunpc1:09408] [[14523,0],2] plm:rsh_setup on agent ssh : rsh path NULL >>> [sunpc1:09408] [[14523,0],2] plm:base:receive start comm >>> [linpc1:20650] [[14523,0],0] plm:base:orted_report_launch from daemon > [[14523,0],2] >>> [linpc1:20650] [[14523,0],0] plm:base:orted_report_launch from daemon > [[14523,0],2] on node sunpc1 >>> [linpc1:20650] [[14523,0],0] plm:base:orted_report_launch completed for > daemon [[14523,0],2] at contact >>> 951779328.2;tcp://193.174.26.210:33215 >>> [sunpc1:09408] [[14523,0],2] plm:rsh: remote spawn called >>> [sunpc1:09408] [[14523,0],2] plm:rsh: remote spawn - have no children! 
>>> [linpc0:32306] mca: base: components_register: registering plm components
>>> [linpc0:32306] mca: base: components_register: found loaded component rsh
>>> [linpc0:32306] mca: base: components_register: component rsh register function successful
>>> [linpc0:32306] mca: base: components_open: opening plm components
>>> [linpc0:32306] mca: base: components_open: found loaded component rsh
>>> [linpc0:32306] mca: base: components_open: component rsh open function successful
>>> [linpc0:32306] mca:base:select: Auto-selecting plm components
>>> [linpc0:32306] mca:base:select:( plm) Querying component [rsh]
>>> [linpc0:32306] [[14523,0],1] plm:rsh_lookup on agent ssh : rsh path NULL
>>> [linpc0:32306] mca:base:select:( plm) Query of component [rsh] set priority to 10
>>> [linpc0:32306] mca:base:select:( plm) Selected component [rsh]
>>> [linpc0:32306] [[14523,0],1] plm:rsh_setup on agent ssh : rsh path NULL
>>> [linpc0:32306] [[14523,0],1] plm:base:receive start comm
>>> [linpc1:20650] [[14523,0],0] plm:base:orted_report_launch from daemon [[14523,0],1]
>>> [linpc1:20650] [[14523,0],0] plm:base:orted_report_launch from daemon [[14523,0],1] on node linpc0
>>> [linpc1:20650] [[14523,0],0] RECEIVED TOPOLOGY FROM NODE linpc0
>>> [linpc1:20650] Type: Machine  Number of child objects: 2
>>>   Name=NULL
>>>   total=8387048KB
>>>   DMIProductName="Sun Ultra 40 Workstation"
>>>   DMIProductVersion=11
>>>   DMIBoardVendor="Sun Microsystems"
>>>   DMIBoardName="Sun Ultra 40 Workstation"
>>>   DMIBoardVersion=50
>>>   DMIBoardAssetTag=
>>>   DMIChassisVendor="Sun Microsystems"
>>>   DMIChassisType=17
>>>   DMIChassisVersion=01
>>>   DMIChassisAssetTag=
>>>   DMIBIOSVendor="Phoenix Technologies Ltd."
>>>   DMIBIOSVersion="1.70 "
>>>   DMIBIOSDate=02/15/2008
>>>   DMISysVendor="Sun Microsystems"
>>>   Backend=Linux
>>>   OSName=Linux
>>>   OSRelease=3.1.10-1.16-desktop
>>>   OSVersion="#1 SMP PREEMPT Wed Jun 27 05:21:40 UTC 2012 (d016078)"
>>>   Architecture=x86_64
>>>   Cpuset:  0x0000000f
>>>   Online:  0x0000000f
>>>   Allowed: 0x0000000f
>>>   Bind CPU proc:   TRUE
>>>   Bind CPU thread: TRUE
>>>   Bind MEM proc:   FALSE
>>>   Bind MEM thread: TRUE
>>>   Type: NUMANode  Number of child objects: 2
>>>     Name=NULL
>>>     local=4192744KB
>>>     total=4192744KB
>>>     Cpuset:  0x00000003
>>>     Online:  0x00000003
>>>     Allowed: 0x00000003
>>>     Type: Socket  Number of child objects: 2
>>>       Name=NULL
>>>       CPUModel="Dual Core AMD Opteron(tm) Processor 280"
>>>       Cpuset:  0x00000003
>>>       Online:  0x00000003
>>>       Allowed: 0x00000003
>>>       Type: L2Cache  Number of child objects: 1
>>>         Name=NULL
>>>         size=1024KB
>>>         linesize=64
>>>         ways=16
>>>         Cpuset:  0x00000001
>>>         Online:  0x00000001
>>>         Allowed: 0x00000001
>>>         Type: L1dCache  Number of child objects: 1
>>>           Name=NULL
>>>           size=64KB
>>>           linesize=64
>>>           ways=2
>>>           Cpuset:  0x00000001
>>>           Online:  0x00000001
>>>           Allowed: 0x00000001
>>>           Type: Core  Number of child objects: 1
>>>             Name=NULL
>>>             Cpuset:  0x00000001
>>>             Online:  0x00000001
>>>             Allowed: 0x00000001
>>>             Type: PU  Number of child objects: 0
>>>               Name=NULL
>>>               Cpuset:  0x00000001
>>>               Online:  0x00000001
>>>               Allowed: 0x00000001
>>>       Type: L2Cache  Number of child objects: 1
>>>         Name=NULL
>>>         size=1024KB
>>>         linesize=64
>>>         ways=16
>>>         Cpuset:  0x00000002
>>>         Online:  0x00000002
>>>         Allowed: 0x00000002
>>>         Type: L1dCache  Number of child objects: 1
>>>           Name=NULL
>>>           size=64KB
>>>           linesize=64
>>>           ways=2
>>>           Cpuset:  0x00000002
>>>           Online:  0x00000002
>>>           Allowed: 0x00000002
>>>           Type: Core  Number of child objects: 1
>>>             Name=NULL
>>>             Cpuset:  0x00000002
>>>             Online:  0x00000002
>>>             Allowed: 0x00000002
>>>             Type: PU  Number of child objects: 0
>>>               Name=NULL
>>>               Cpuset:  0x00000002
>>>               Online:  0x00000002
>>>               Allowed: 0x00000002
>>>     Type: Bridge Host->PCI  Number of child objects: 4
>>>       Name=NULL
>>>       buses=0000:[00-03]
>>>       Type: PCI 10de:0053  Number of child objects: 1
>>>         Name=nVidia Corporation CK804 IDE
>>>         busid=0000:00:06.0
>>>         class=0101(IDE)
>>>         PCIVendor="nVidia Corporation"
>>>         PCIDevice="CK804 IDE"
>>>         Type: Block  Number of child objects: 0
>>>           Name=sr0
>>>       Type: PCI 10de:0055  Number of child objects: 1
>>>         Name=nVidia Corporation CK804 Serial ATA Controller
>>>         busid=0000:00:07.0
>>>         class=0101(IDE)
>>>         PCIVendor="nVidia Corporation"
>>>         PCIDevice="CK804 Serial ATA Controller"
>>>         Type: Block  Number of child objects: 0
>>>           Name=sda
>>>       Type: PCI 10de:0054  Number of child objects: 0
>>>         Name=nVidia Corporation CK804 Serial ATA Controller
>>>         busid=0000:00:08.0
>>>         class=0101(IDE)
>>>         PCIVendor="nVidia Corporation"
>>>         PCIDevice="CK804 Serial ATA Controller"
>>>       Type: PCI 10de:029d  Number of child objects: 2
>>>         Name=nVidia Corporation G71GL [Quadro FX 3500]
>>>         busid=0000:03:00.0
>>>         class=0300(VGA)
>>>         PCIVendor="nVidia Corporation"
>>>         PCIDevice="G71GL [Quadro FX 3500]"
>>>         Type: GPU  Number of child objects: 0
>>>           Name=controlD64
>>>         Type: GPU  Number of child objects: 0
>>>           Name=card0
>>>   Type: NUMANode  Number of child objects: 2
>>>     Name=NULL
>>>     local=4194304KB
>>>     total=4194304KB
>>>     Cpuset:  0x0000000c
>>>     Online:  0x0000000c
>>>     Allowed: 0x0000000c
>>>     Type: Socket  Number of child objects: 2
>>>       Name=NULL
>>>       CPUModel="Dual Core AMD Opteron(tm) Processor 280"
>>>       Cpuset:  0x0000000c
>>>       Online:  0x0000000c
>>>       Allowed: 0x0000000c
>>>       Type: L2Cache  Number of child objects: 1
>>>         Name=NULL
>>>         size=1024KB
>>>         linesize=64
>>>         ways=16
>>>         Cpuset:  0x00000004
>>>         Online:  0x00000004
>>>         Allowed: 0x00000004
>>>         Type: L1dCache  Number of child objects: 1
>>>           Name=NULL
>>>           size=64KB
>>>           linesize=64
>>>           ways=2
>>>           Cpuset:  0x00000004
>>>           Online:  0x00000004
>>>           Allowed: 0x00000004
>>>           Type: Core  Number of child objects: 1
>>>             Name=NULL
>>>             Cpuset:  0x00000004
>>>             Online:  0x00000004
>>>             Allowed: 0x00000004
>>>             Type: PU  Number of child objects: 0
>>>               Name=NULL
>>>               Cpuset:  0x00000004
>>>               Online:  0x00000004
>>>               Allowed: 0x00000004
>>>       Type: L2Cache  Number of child objects: 1
>>>         Name=NULL
>>>         size=1024KB
>>>         linesize=64
>>>         ways=16
>>>         Cpuset:  0x00000008
>>>         Online:  0x00000008
>>>         Allowed: 0x00000008
>>>         Type: L1dCache  Number of child objects: 1
>>>           Name=NULL
>>>           size=64KB
>>>           linesize=64
>>>           ways=2
>>>           Cpuset:  0x00000008
>>>           Online:  0x00000008
>>>           Allowed: 0x00000008
>>>           Type: Core  Number of child objects: 1
>>>             Name=NULL
>>>             Cpuset:  0x00000008
>>>             Online:  0x00000008
>>>             Allowed: 0x00000008
>>>             Type: PU  Number of child objects: 0
>>>               Name=NULL
>>>               Cpuset:  0x00000008
>>>               Online:  0x00000008
>>>               Allowed: 0x00000008
>>>     Type: Bridge Host->PCI  Number of child objects: 2
>>>       Name=NULL
>>>       buses=0000:[80-82]
>>>       Type: PCI 10de:0054  Number of child objects: 0
>>>         Name=nVidia Corporation CK804 Serial ATA Controller
>>>         busid=0000:80:07.0
>>>         class=0101(IDE)
>>>         PCIVendor="nVidia Corporation"
>>>         PCIDevice="CK804 Serial ATA Controller"
>>>       Type: PCI 10de:0055  Number of child objects: 0
>>>         Name=nVidia Corporation CK804 Serial ATA Controller
>>>         busid=0000:80:08.0
>>>         class=0101(IDE)
>>>         PCIVendor="nVidia Corporation"
>>>         PCIDevice="CK804 Serial ATA Controller"
>>> [linpc1:20650] [[14523,0],0] NEW TOPOLOGY - ADDING
>>> [linpc1:20650] [[14523,0],0] plm:base:orted_report_launch completed for daemon [[14523,0],1] at contact 951779328.1;tcp://193.174.26.214,192.168.1.1:57891
>>> [linpc0:32306] [[14523,0],1] plm:rsh: remote spawn called
>>> [linpc0:32306] [[14523,0],1] plm:rsh: local shell: 2 (tcsh)
>>> [linpc0:32306] [[14523,0],1] plm:rsh: assuming same remote shell as local shell
>>> [linpc0:32306] [[14523,0],1] plm:rsh: remote shell: 2 (tcsh)
>>> [linpc0:32306] [[14523,0],1] plm:rsh: final template argv:
>>> /usr/local/bin/ssh <template> orted -mca orte_report_bindings 1 -mca ess env -mca orte_ess_jobid 951779328 -mca orte_ess_vpid <template> -mca orte_ess_num_procs 4 -mca orte_parent_uri "951779328.1;tcp://193.174.26.214,192.168.1.1:57891" -mca orte_hnp_uri "951779328.0;tcp://193.174.26.208:46876" --mca plm_base_verbose 100 -mca hwloc_base_report_bindings 1 -mca orte_display_alloc 1 -mca orte_rankfile rf_linpc_sunpc_tyr -mca plm rsh
>>> [linpc0:32306] [[14523,0],1] plm:rsh: activating launch event
>>> [linpc0:32306] [[14523,0],1] plm:rsh: recording launch of daemon [[14523,0],3]
>>> [linpc0:32306] [[14523,0],1] plm:rsh: executing: (/usr/local/bin/ssh) [/usr/local/bin/ssh tyr orted -mca orte_report_bindings 1 -mca ess env -mca orte_ess_jobid 951779328 -mca orte_ess_vpid 3 -mca orte_ess_num_procs 4 -mca orte_parent_uri "951779328.1;tcp://193.174.26.214,192.168.1.1:57891" -mca orte_hnp_uri "951779328.0;tcp://193.174.26.208:46876" --mca plm_base_verbose 100 -mca hwloc_base_report_bindings 1 -mca orte_display_alloc 1 -mca orte_rankfile rf_linpc_sunpc_tyr -mca plm rsh --tree-spawn]
>>> Warning: untrusted X11 forwarding setup failed: xauth key data not generated
>>> Warning: No xauth data; using fake authentication data for X11 forwarding.
>>> [tyr.informatik.hs-fulda.de:23227] mca: base: components_register: registering plm components
>>> [tyr.informatik.hs-fulda.de:23227] mca: base: components_register: found loaded component rsh
>>> [tyr.informatik.hs-fulda.de:23227] mca: base: components_register: component rsh register function successful
>>> [tyr.informatik.hs-fulda.de:23227] mca: base: components_open: opening plm components
>>> [tyr.informatik.hs-fulda.de:23227] mca: base: components_open: found loaded component rsh
>>> [tyr.informatik.hs-fulda.de:23227] mca: base: components_open: component rsh open function successful
>>> [tyr.informatik.hs-fulda.de:23227] mca:base:select: Auto-selecting plm components
>>> [tyr.informatik.hs-fulda.de:23227] mca:base:select:( plm) Querying component [rsh]
>>> [tyr.informatik.hs-fulda.de:23227] [[14523,0],3] plm:rsh_lookup on agent ssh : rsh path NULL
>>> [tyr.informatik.hs-fulda.de:23227] mca:base:select:( plm) Query of component [rsh] set priority to 10
>>> [tyr.informatik.hs-fulda.de:23227] mca:base:select:( plm) Selected component [rsh]
>>> [tyr.informatik.hs-fulda.de:23227] [[14523,0],3] plm:rsh_setup on agent ssh : rsh path NULL
>>> [tyr.informatik.hs-fulda.de:23227] [[14523,0],3] plm:base:receive start comm
>>> [tyr.informatik.hs-fulda.de:23227] [[14523,0],3] plm:base:receive stop comm
>>> [tyr.informatik.hs-fulda.de:23227] mca: base: close: component rsh closed
>>> [tyr.informatik.hs-fulda.de:23227] mca: base: close: unloading component rsh
>>> [linpc0:32306] [[14523,0],1] daemon 3 failed with status 1
>>> [linpc1:20650] [[14523,0],0] plm:base:orted_cmd sending orted_exit commands
>>> [linpc1:20650] [[14523,0],0] plm:base:receive stop comm
>>> [linpc1:20650] mca: base: close: component rsh closed
>>> [linpc1:20650] mca: base: close: unloading component rsh
>>> linpc1 openmpi_1.7.x_or_newer 189 [sunpc1:09408] [[14523,0],2] plm:base:receive stop comm
>>> [sunpc1:09408] mca: base: close: component rsh closed
>>> [sunpc1:09408] mca: base: close: unloading component rsh
>>> [linpc0:32306] [[14523,0],1] plm:base:receive stop comm
>>> [linpc0:32306] mca: base: close: component rsh closed
>>> [linpc0:32306] mca: base: close: unloading component rsh
>>>
>>> linpc1 openmpi_1.7.x_or_newer 189
>>>
>>>
>>> linpc1 openmpi_1.7.x_or_newer 189 mpiexec -report-bindings --display-allocation --mca rmaps_base_verbose_100 -np 1 -rf rf_linpc_sunpc_tyr hostname
>>>
>>> ======================   ALLOCATED NODES   ======================
>>>   linpc1: slots=1 max_slots=0 slots_inuse=0
>>> =================================================================
>>> --------------------------------------------------------------------------
>>> mpiexec was unable to find the specified executable file, and therefore
>>> did not launch the job. This error was first reported for process
>>> rank 0; it may have occurred for other processes as well.
>>>
>>> NOTE: A common cause for this error is misspelling a mpiexec command
>>> line parameter option (remember that mpiexec interprets the first
>>> unrecognized command line token as the executable).
>>>
>>> Node:       linpc1
>>> Executable: 1
>>> --------------------------------------------------------------------------
>>> linpc1 openmpi_1.7.x_or_newer 190
>>>
>>>
>>> Kind regards
>>>
>>> Siegmar
>>>
>>> _______________________________________________
>>> users mailing list
>>> us...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
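One side note on the last failure in your transcript: `--mca rmaps_base_verbose_100` looks like a typo for `--mca rmaps_base_verbose 100`. Since `--mca` always consumes two tokens (a key and a value), the underscore typo makes it swallow the following `-np` as its value, so the bare `1` becomes the first unrecognized token and mpiexec treats it as the executable. That matches the `Executable: 1` in the error output. Here is a toy sketch of that parsing rule (my own simplified scanner for illustration, not Open MPI's actual parser; it only knows the options from your command line):

```shell
# Toy scanner: mimics the rule "the first unrecognized command line token
# is the executable". Assumption: --mca/-mca consume key AND value.
guess_executable() {
  while [ $# -gt 0 ]; do
    case "$1" in
      --mca|-mca) shift 3 ;;   # option + key + value
      -np|-rf)    shift 2 ;;   # option + value
      -*)         shift 1 ;;   # options without a value
      *)          echo "$1"; return ;;
    esac
  done
}

# With the typo, --mca eats "rmaps_base_verbose_100" and "-np",
# so the bare "1" is taken as the executable:
guess_executable --mca rmaps_base_verbose_100 -np 1 -rf rf_linpc_sunpc_tyr hostname
# With key and value separated, "hostname" is found as expected:
guess_executable --mca rmaps_base_verbose 100 -np 1 -rf rf_linpc_sunpc_tyr hostname
```

With the corrected spelling you should at least get past argument parsing and see the rmaps verbose output.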