On 11-Aug-09, at 6:16 AM, Jeff Squyres wrote:

This means that OMPI is finding an mca_iof_proxy.la file at run time from a prior version of Open MPI. You might want to use "find" or "locate" to search your nodes and find it. I suspect that you somehow have an OMPI 1.3.x install that was laid over an installation of a prior OMPI version.


OK, right you were - the old file was in my new install directory. I didn't erase /usr/local/openmpi before re-running the install...
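
For anyone hitting the same thing, here is a minimal sketch of the search Jeff suggests, written in Python instead of "find" or "locate". The function name find_stale_components is hypothetical, the pattern is simply the file name from Jeff's message, and /usr/local/openmpi is the install prefix used in this thread. It only lists matches on the node where it runs; it does not delete anything.

```python
import fnmatch
import os


def find_stale_components(prefix, pattern="mca_iof_proxy.*"):
    """Return sorted paths under `prefix` whose file name matches `pattern`."""
    matches = []
    # os.walk silently yields nothing if `prefix` does not exist.
    for dirpath, _dirnames, filenames in os.walk(prefix):
        for name in fnmatch.filter(filenames, pattern):
            matches.append(os.path.join(dirpath, name))
    return sorted(matches)


if __name__ == "__main__":
    # Assumption: the install prefix from this thread; adjust for other installs.
    for path in find_stale_components("/usr/local/openmpi"):
        print(path)
```

Any path it prints is a candidate leftover from the earlier install and is worth checking by hand on each node.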

However, after reinstalling on the nodes (but not cleaning out /usr/lib on all the nodes) I still have the following:

Thanks,  Jody


[saturna.cluster:17660] mca:base:select:(  plm) Querying component [rsh]
[saturna.cluster:17660] mca:base:select:(  plm) Query of component [rsh] set priority to 10
[saturna.cluster:17660] mca:base:select:(  plm) Querying component [slurm]
[saturna.cluster:17660] mca:base:select:(  plm) Skipping component [slurm]. Query failed to return a module
[saturna.cluster:17660] mca:base:select:(  plm) Querying component [tm]
[saturna.cluster:17660] mca:base:select:(  plm) Skipping component [tm]. Query failed to return a module
[saturna.cluster:17660] mca:base:select:(  plm) Querying component [xgrid]
[saturna.cluster:17660] mca:base:select:(  plm) Skipping component [xgrid]. Query failed to return a module
[saturna.cluster:17660] mca:base:select:(  plm) Selected component [rsh]
[saturna.cluster:17660] plm:base:set_hnp_name: initial bias 17660 nodename hash 1656374957
[saturna.cluster:17660] plm:base:set_hnp_name: final jobfam 24811
[saturna.cluster:17660] [[24811,0],0] plm:base:receive start comm
[saturna.cluster:17660] mca:base:select:( odls) Querying component [default]
[saturna.cluster:17660] mca:base:select:( odls) Query of component [default] set priority to 1
[saturna.cluster:17660] mca:base:select:( odls) Selected component [default]
[saturna.cluster:17660] [[24811,0],0] plm:rsh: setting up job [24811,1]
[saturna.cluster:17660] [[24811,0],0] plm:base:setup_job for job [24811,1]
[saturna.cluster:17660] [[24811,0],0] plm:rsh: local shell: 0 (bash)
[saturna.cluster:17660] [[24811,0],0] plm:rsh: assuming same remote shell as local shell
[saturna.cluster:17660] [[24811,0],0] plm:rsh: remote shell: 0 (bash)
[saturna.cluster:17660] [[24811,0],0] plm:rsh: final template argv:
/usr/bin/ssh <template> PATH=/usr/local/openmpi/bin:$PATH ; export PATH ; LD_LIBRARY_PATH=/usr/local/openmpi/lib:$LD_LIBRARY_PATH ; export LD_LIBRARY_PATH ; /usr/local/openmpi/bin/orted --debug-daemons -mca ess env -mca orte_ess_jobid 1626013696 -mca orte_ess_vpid <template> -mca orte_ess_num_procs 3 --hnp-uri "1626013696.0;tcp://142.104.154.96:49710;tcp://192.168.2.254:49710" -mca plm_base_verbose 5 -mca odls_base_verbose 5
[saturna.cluster:17660] [[24811,0],0] plm:rsh: launching on node xserve01
[saturna.cluster:17660] [[24811,0],0] plm:rsh: recording launch of daemon [[24811,0],1]
[saturna.cluster:17660] [[24811,0],0] plm:rsh: executing: (//usr/bin/ssh) [/usr/bin/ssh xserve01 PATH=/usr/local/openmpi/bin:$PATH ; export PATH ; LD_LIBRARY_PATH=/usr/local/openmpi/lib:$LD_LIBRARY_PATH ; export LD_LIBRARY_PATH ; /usr/local/openmpi/bin/orted --debug-daemons -mca ess env -mca orte_ess_jobid 1626013696 -mca orte_ess_vpid 1 -mca orte_ess_num_procs 3 --hnp-uri "1626013696.0;tcp://142.104.154.96:49710;tcp://192.168.2.254:49710" -mca plm_base_verbose 5 -mca odls_base_verbose 5]
Daemon was launched on xserve01.cluster - beginning to initialize
[xserve01.cluster:42519] mca:base:select:( odls) Querying component [default]
[xserve01.cluster:42519] mca:base:select:( odls) Query of component [default] set priority to 1
[xserve01.cluster:42519] mca:base:select:( odls) Selected component [default]
Daemon [[24811,0],1] checking in as pid 42519 on host xserve01.cluster
Daemon [[24811,0],1] not using static ports
[saturna.cluster:17660] [[24811,0],0] plm:rsh: launching on node xserve02
[saturna.cluster:17660] [[24811,0],0] plm:rsh: recording launch of daemon [[24811,0],2]
[saturna.cluster:17660] [[24811,0],0] plm:rsh: executing: (//usr/bin/ssh) [/usr/bin/ssh xserve02 PATH=/usr/local/openmpi/bin:$PATH ; export PATH ; LD_LIBRARY_PATH=/usr/local/openmpi/lib:$LD_LIBRARY_PATH ; export LD_LIBRARY_PATH ; /usr/local/openmpi/bin/orted --debug-daemons -mca ess env -mca orte_ess_jobid 1626013696 -mca orte_ess_vpid 2 -mca orte_ess_num_procs 3 --hnp-uri "1626013696.0;tcp://142.104.154.96:49710;tcp://192.168.2.254:49710" -mca plm_base_verbose 5 -mca odls_base_verbose 5]
Daemon was launched on xserve02.local - beginning to initialize
[xserve02.local:42180] mca:base:select:( odls) Querying component [default]
[xserve02.local:42180] mca:base:select:( odls) Query of component [default] set priority to 1
[xserve02.local:42180] mca:base:select:( odls) Selected component [default]
Daemon [[24811,0],2] checking in as pid 42180 on host xserve02.local
Daemon [[24811,0],2] not using static ports
[saturna.cluster:17660] [[24811,0],0] plm:base:daemon_callback
[saturna.cluster:17660] progressed_wait: base/plm_base_launch_support.c 459
[saturna.cluster:17660] defining message event: base/plm_base_launch_support.c 423
[saturna.cluster:17660] [[24811,0],0] plm:base:orted_report_launch from daemon [[24811,0],1]
[saturna.cluster:17660] [[24811,0],0] plm:base:orted_report_launch completed for daemon [[24811,0],1]
[saturna.cluster:17660] defining message event: base/plm_base_launch_support.c 423
[saturna.cluster:17660] [[24811,0],0] plm:base:orted_report_launch from daemon [[24811,0],2]
[xserve01.cluster:42519] [[24811,0],1] orted: up and running - waiting for commands!
[saturna.cluster:17660] [[24811,0],0] plm:base:orted_report_launch completed for daemon [[24811,0],2]
[saturna.cluster:17660] [[24811,0],0] plm:base:daemon_callback completed
[saturna.cluster:17660] [[24811,0],0] plm:base:launch_apps for job [24811,1]
[xserve02.local:42180] [[24811,0],2] orted: up and running - waiting for commands!
[saturna.cluster:17660] defining message event: grpcomm_bad_module.c 183
[saturna.cluster:17660] [[24811,0],0] plm:base:report_launched for job [24811,1]
[saturna.cluster:17660] progressed_wait: base/plm_base_launch_support.c 712
[saturna.cluster:17660] [[24811,0],0] orte:daemon:cmd:processor called by [[24811,0],0] for tag 1
[saturna.cluster:17660] [[24811,0],0] node[0].name saturna daemon 0 arch ffc90200
[saturna.cluster:17660] [[24811,0],0] node[1].name xserve01 daemon 1 arch ffc90200
[saturna.cluster:17660] [[24811,0],0] node[2].name xserve02 daemon 2 arch ffc90200
[saturna.cluster:17660] [[24811,0],0] orted_cmd: received add_local_procs
[saturna.cluster:17660] [[24811,0],0] odls:constructing child list
[saturna.cluster:17660] [[24811,0],0] odls:construct_child_list unpacking data to launch job [24811,1]
[saturna.cluster:17660] [[24811,0],0] odls:construct_child_list adding new jobdat for job [24811,1]
[saturna.cluster:17660] [[24811,0],0] odls:construct_child_list unpacking 1 app_contexts
[saturna.cluster:17660] [[24811,0],0] odls:constructing child list - checking proc 0 on node 1 with daemon 1
[saturna.cluster:17660] [[24811,0],0] odls:constructing child list - checking proc 1 on node 2 with daemon 2
[saturna.cluster:17660] [[24811,0],0] odls:constructing child list - checking proc 2 on node 1 with daemon 1
[saturna.cluster:17660] [[24811,0],0] odls:constructing child list - checking proc 3 on node 2 with daemon 2
[saturna.cluster:17660] [[24811,0],0] odls:constructing child list - checking proc 4 on node 1 with daemon 1
[saturna.cluster:17660] [[24811,0],0] odls:constructing child list - checking proc 5 on node 2 with daemon 2
[saturna.cluster:17660] [[24811,0],0] odls:constructing child list - checking proc 6 on node 1 with daemon 1
[saturna.cluster:17660] [[24811,0],0] odls:constructing child list - checking proc 7 on node 2 with daemon 2
[saturna.cluster:17660] [[24811,0],0] odls:constructing child list - checking proc 8 on node 1 with daemon 1
[saturna.cluster:17660] [[24811,0],0] odls:constructing child list - checking proc 9 on node 2 with daemon 2
[saturna.cluster:17660] [[24811,0],0] odls:constructing child list - checking proc 10 on node 1 with daemon 1
[saturna.cluster:17660] [[24811,0],0] odls:constructing child list - checking proc 11 on node 2 with daemon 2
[saturna.cluster:17660] [[24811,0],0] odls:constructing child list - checking proc 12 on node 1 with daemon 1
[saturna.cluster:17660] [[24811,0],0] odls:constructing child list - checking proc 13 on node 2 with daemon 2
[saturna.cluster:17660] [[24811,0],0] odls:constructing child list - checking proc 14 on node 1 with daemon 1
[saturna.cluster:17660] [[24811,0],0] odls:constructing child list - checking proc 15 on node 2 with daemon 2
[saturna.cluster:17660] [[24811,0],0] odls:construct:child: num_participating 2
[saturna.cluster:17660] [[24811,0],0] odls:launch found 4 processors for 0 children and set oversubscribed to false
[saturna.cluster:17660] [[24811,0],0] odls:launch reporting job [24811,1] launch status
[saturna.cluster:17660] defining message event: base/odls_base_default_fns.c 1219
[saturna.cluster:17660] [[24811,0],0] odls:launch setting waitpids
[saturna.cluster:17660] [[24811,0],0] orte:daemon:send_relay
[saturna.cluster:17660] [[24811,0],0] orte:daemon:send_relay sending relay msg to 1
[saturna.cluster:17660] [[24811,0],0] orte:daemon:send_relay sending relay msg to 2
[saturna.cluster:17660] [[24811,0],0] plm:base:app_report_launch from daemon [[24811,0],0]
[saturna.cluster:17660] [[24811,0],0] plm:base:app_report_launch completed processing
[xserve01.cluster:42519] [[24811,0],1] orted_recv_cmd: received message from [[24811,0],0]
[xserve01.cluster:42519] defining message event: orted/orted_comm.c 159
[xserve01.cluster:42519] [[24811,0],1] orted_recv_cmd: reissued recv
[xserve01.cluster:42519] [[24811,0],1] orte:daemon:cmd:processor called by [[24811,0],0] for tag 1
[xserve01.cluster:42519] [[24811,0],1] node[0].name saturna daemon 0 arch ffc90200
[xserve01.cluster:42519] [[24811,0],1] node[1].name xserve01 daemon 1 arch ffc90200
[xserve01.cluster:42519] [[24811,0],1] node[2].name xserve02 daemon 2 arch ffc90200
[xserve01.cluster:42519] [[24811,0],1] orted_cmd: received add_local_procs
[xserve01.cluster:42519] [[24811,0],1] odls:constructing child list
[xserve01.cluster:42519] [[24811,0],1] odls:construct_child_list unpacking data to launch job [24811,1]
[xserve01.cluster:42519] [[24811,0],1] odls:construct_child_list adding new jobdat for job [24811,1]
[xserve01.cluster:42519] [[24811,0],1] odls:construct_child_list unpacking 1 app_contexts
[xserve02.local:42180] [[24811,0],2] orted_recv_cmd: received message from [[24811,0],0]
[xserve02.local:42180] defining message event: orted/orted_comm.c 159
[xserve02.local:42180] [[24811,0],2] orted_recv_cmd: reissued recv
[xserve02.local:42180] [[24811,0],2] orte:daemon:cmd:processor called by [[24811,0],0] for tag 1
[xserve02.local:42180] [[24811,0],2] node[0].name saturna daemon 0 arch ffc90200
[xserve02.local:42180] [[24811,0],2] node[1].name xserve01 daemon 1 arch ffc90200
[xserve02.local:42180] [[24811,0],2] node[2].name xserve02 daemon 2 arch ffc90200
[xserve02.local:42180] [[24811,0],2] orted_cmd: received add_local_procs
[xserve02.local:42180] [[24811,0],2] odls:constructing child list
[xserve02.local:42180] [[24811,0],2] odls:construct_child_list unpacking data to launch job [24811,1]
[xserve02.local:42180] [[24811,0],2] odls:construct_child_list adding new jobdat for job [24811,1]
[xserve01.cluster:42519] [[24811,0],1] odls:constructing child list - checking proc 0 on node 1 with daemon 1
[xserve01.cluster:42519] [[24811,0],1] odls:constructing child list - found proc 0 for me!
[xserve01.cluster:42519] [[24811,0],1] odls:constructing child list - checking proc 1 on node 2 with daemon 2
[xserve01.cluster:42519] [[24811,0],1] odls:constructing child list - checking proc 2 on node 1 with daemon 1
[xserve01.cluster:42519] [[24811,0],1] odls:constructing child list - found proc 2 for me!
[xserve01.cluster:42519] [[24811,0],1] odls:constructing child list - checking proc 3 on node 2 with daemon 2
[xserve01.cluster:42519] [[24811,0],1] odls:constructing child list - checking proc 4 on node 1 with daemon 1
[xserve01.cluster:42519] [[24811,0],1] odls:constructing child list - found proc 4 for me!
[xserve01.cluster:42519] [[24811,0],1] odls:constructing child list - checking proc 5 on node 2 with daemon 2
[xserve01.cluster:42519] [[24811,0],1] odls:constructing child list - checking proc 6 on node 1 with daemon 1
[xserve01.cluster:42519] [[24811,0],1] odls:constructing child list - found proc 6 for me!
[xserve01.cluster:42519] [[24811,0],1] odls:constructing child list - checking proc 7 on node 2 with daemon 2
[xserve01.cluster:42519] [[24811,0],1] odls:constructing child list - checking proc 8 on node 1 with daemon 1
[xserve01.cluster:42519] [[24811,0],1] odls:constructing child list - found proc 8 for me!
[xserve01.cluster:42519] [[24811,0],1] odls:constructing child list - checking proc 9 on node 2 with daemon 2
[xserve01.cluster:42519] [[24811,0],1] odls:constructing child list - checking proc 10 on node 1 with daemon 1
[xserve01.cluster:42519] [[24811,0],1] odls:constructing child list - found proc 10 for me!
[xserve01.cluster:42519] [[24811,0],1] odls:constructing child list - checking proc 11 on node 2 with daemon 2
[xserve01.cluster:42519] [[24811,0],1] odls:constructing child list - checking proc 12 on node 1 with daemon 1
[xserve01.cluster:42519] [[24811,0],1] odls:constructing child list - found proc 12 for me!
[xserve02.local:42180] [[24811,0],2] odls:construct_child_list unpacking 1 app_contexts
[xserve01.cluster:42519] [[24811,0],1] odls:constructing child list - checking proc 13 on node 2 with daemon 2
[xserve01.cluster:42519] [[24811,0],1] odls:constructing child list - checking proc 14 on node 1 with daemon 1
[xserve01.cluster:42519] [[24811,0],1] odls:constructing child list - found proc 14 for me!
[xserve01.cluster:42519] [[24811,0],1] odls:constructing child list - checking proc 15 on node 2 with daemon 2
[xserve01.cluster:42519] [[24811,0],1] odls:construct:child: num_participating 1
[xserve01.cluster:42519] [[24811,0],1] odls:launch found 16 processors for 8 children and set oversubscribed to false
[xserve02.local:42180] [[24811,0],2] odls:constructing child list - checking proc 0 on node 1 with daemon 1
[xserve02.local:42180] [[24811,0],2] odls:constructing child list - checking proc 1 on node 2 with daemon 2
[xserve02.local:42180] [[24811,0],2] odls:constructing child list - found proc 1 for me!
[xserve02.local:42180] [[24811,0],2] odls:constructing child list - checking proc 2 on node 1 with daemon 1
[xserve02.local:42180] [[24811,0],2] odls:constructing child list - checking proc 3 on node 2 with daemon 2
[xserve02.local:42180] [[24811,0],2] odls:constructing child list - found proc 3 for me!
[xserve02.local:42180] [[24811,0],2] odls:constructing child list - checking proc 4 on node 1 with daemon 1
[xserve02.local:42180] [[24811,0],2] odls:constructing child list - checking proc 5 on node 2 with daemon 2
[xserve02.local:42180] [[24811,0],2] odls:constructing child list - found proc 5 for me!
[xserve02.local:42180] [[24811,0],2] odls:constructing child list - checking proc 6 on node 1 with daemon 1
[xserve02.local:42180] [[24811,0],2] odls:constructing child list - checking proc 7 on node 2 with daemon 2
[xserve02.local:42180] [[24811,0],2] odls:constructing child list - found proc 7 for me!
[xserve02.local:42180] [[24811,0],2] odls:constructing child list - checking proc 8 on node 1 with daemon 1
[xserve02.local:42180] [[24811,0],2] odls:constructing child list - checking proc 9 on node 2 with daemon 2
[xserve02.local:42180] [[24811,0],2] odls:constructing child list - found proc 9 for me!
[xserve02.local:42180] [[24811,0],2] odls:constructing child list - checking proc 10 on node 1 with daemon 1
[xserve02.local:42180] [[24811,0],2] odls:constructing child list - checking proc 11 on node 2 with daemon 2
[xserve02.local:42180] [[24811,0],2] odls:constructing child list - found proc 11 for me!
[xserve02.local:42180] [[24811,0],2] odls:constructing child list - checking proc 12 on node 1 with daemon 1
[xserve02.local:42180] [[24811,0],2] odls:constructing child list - checking proc 13 on node 2 with daemon 2
[xserve02.local:42180] [[24811,0],2] odls:constructing child list - found proc 13 for me!
[xserve02.local:42180] [[24811,0],2] odls:constructing child list - checking proc 14 on node 1 with daemon 1
[xserve02.local:42180] [[24811,0],2] odls:constructing child list - checking proc 15 on node 2 with daemon 2
[xserve02.local:42180] [[24811,0],2] odls:constructing child list - found proc 15 for me!
[xserve02.local:42180] [[24811,0],2] odls:construct:child: num_participating 1
[xserve02.local:42180] [[24811,0],2] odls:launch found 16 processors for 8 children and set oversubscribed to false
[xserve01.cluster:42519] [[24811,0],1] odls:launch reporting job [24811,1] launch status
[saturna.cluster:17660] defining message event: base/plm_base_launch_support.c 668
[saturna.cluster:17660] [[24811,0],0] plm:base:app_report_launch reissuing non-blocking recv
[saturna.cluster:17660] [[24811,0],0] plm:base:app_report_launch from daemon [[24811,0],1]
[saturna.cluster:17660] [[24811,0],0] plm:base:app_report_launched for proc [[24811,1],0] from daemon [[24811,0],1]: pid 42523 state 2 exit 0
[saturna.cluster:17660] [[24811,0],0] plm:base:app_report_launched for proc [[24811,1],2] from daemon [[24811,0],1]: pid 42524 state 2 exit 0
[saturna.cluster:17660] [[24811,0],0] plm:base:app_report_launched for proc [[24811,1],4] from daemon [[24811,0],1]: pid 42525 state 2 exit 0
[saturna.cluster:17660] [[24811,0],0] plm:base:app_report_launched for proc [[24811,1],6] from daemon [[24811,0],1]: pid 42526 state 2 exit 0
[xserve01.cluster:42519] [[24811,0],1] odls:launch setting waitpids
[xserve01.cluster:42519] [[24811,0],1] orte:daemon:send_relay
[xserve01.cluster:42519] [[24811,0],1] orte:daemon:send_relay - recipient list is empty!
[saturna.cluster:17660] [[24811,0],0] plm:base:app_report_launched for proc [[24811,1],8] from daemon [[24811,0],1]: pid 42527 state 2 exit 0
[saturna.cluster:17660] [[24811,0],0] plm:base:app_report_launched for proc [[24811,1],10] from daemon [[24811,0],1]: pid 42528 state 2 exit 0
[saturna.cluster:17660] [[24811,0],0] plm:base:app_report_launched for proc [[24811,1],12] from daemon [[24811,0],1]: pid 42529 state 2 exit 0
[saturna.cluster:17660] [[24811,0],0] plm:base:app_report_launched for proc [[24811,1],14] from daemon [[24811,0],1]: pid 42530 state 2 exit 0
[saturna.cluster:17660] [[24811,0],0] plm:base:app_report_launch completed processing
[xserve02.local:42180] [[24811,0],2] odls:launch reporting job [24811,1] launch status
[saturna.cluster:17660] defining message event: base/plm_base_launch_support.c 668
[saturna.cluster:17660] [[24811,0],0] plm:base:app_report_launch reissuing non-blocking recv
[saturna.cluster:17660] [[24811,0],0] plm:base:app_report_launch from daemon [[24811,0],2]
[saturna.cluster:17660] [[24811,0],0] plm:base:app_report_launched for proc [[24811,1],1] from daemon [[24811,0],2]: pid 42184 state 2 exit 0
[saturna.cluster:17660] [[24811,0],0] plm:base:app_report_launched for proc [[24811,1],3] from daemon [[24811,0],2]: pid 42185 state 2 exit 0
[saturna.cluster:17660] [[24811,0],0] plm:base:app_report_launched for proc [[24811,1],5] from daemon [[24811,0],2]: pid 42186 state 2 exit 0
[saturna.cluster:17660] [[24811,0],0] plm:base:app_report_launched for proc [[24811,1],7] from daemon [[24811,0],2]: pid 42187 state 2 exit 0
[saturna.cluster:17660] [[24811,0],0] plm:base:app_report_launched for proc [[24811,1],9] from daemon [[24811,0],2]: pid 42188 state 2 exit 0
[saturna.cluster:17660] [[24811,0],0] plm:base:app_report_launched for proc [[24811,1],11] from daemon [[24811,0],2]: pid 42189 state 2 exit 0
[saturna.cluster:17660] [[24811,0],0] plm:base:app_report_launched for proc [[24811,1],13] from daemon [[24811,0],2]: pid 42190 state 2 exit 0
[saturna.cluster:17660] [[24811,0],0] plm:base:app_report_launched for proc [[24811,1],15] from daemon [[24811,0],2]: pid 42191 state 2 exit 0
[saturna.cluster:17660] [[24811,0],0] plm:base:app_report_launch completed processing
[saturna.cluster:17660] [[24811,0],0] plm:base:report_launched all apps reported
[saturna.cluster:17660] [[24811,0],0] plm:base:launch wiring up iof
[xserve02.local:42180] [[24811,0],2] odls:launch setting waitpids
[xserve02.local:42180] [[24811,0],2] orte:daemon:send_relay
[xserve02.local:42180] [[24811,0],2] orte:daemon:send_relay - recipient list is empty!
[saturna.cluster:17660] [[24811,0],0] plm:base:launch completed for job [24811,1]
[xserve01.cluster:42519] [[24811,0],1] orted_recv_cmd: received message from [[24811,1],0]
[xserve01.cluster:42519] defining message event: orted/orted_comm.c 159
[xserve01.cluster:42519] [[24811,0],1] orted_recv_cmd: reissued recv
[xserve01.cluster:42519] [[24811,0],1] orte:daemon:cmd:processor called by [[24811,1],0] for tag 1
[xserve01.cluster:42519] [[24811,0],1] orted_recv: received sync+nidmap from local proc [[24811,1],0]
[xserve01.cluster:42519] [[24811,0],1] odls: registering sync on child [[24811,1],0]
[xserve01.cluster:42519] [[24811,0],1] odls:sync nidmap requested for job [24811,1]
[xserve01.cluster:42519] [[24811,0],1] odls: sending sync ack to child [[24811,1],0] with 307 bytes of data
[xserve01.cluster:42519] [[24811,0],1] orte:daemon:cmd:processor: processing commands completed
[xserve01.cluster:42519] [[24811,0],1] orted_recv_cmd: received message from [[24811,1],4]
[xserve01.cluster:42519] defining message event: orted/orted_comm.c 159
[xserve01.cluster:42519] [[24811,0],1] orted_recv_cmd: reissued recv
[xserve01.cluster:42519] [[24811,0],1] orte:daemon:cmd:processor called by [[24811,1],4] for tag 1
[xserve01.cluster:42519] [[24811,0],1] orted_recv: received sync+nidmap from local proc [[24811,1],4]
[xserve01.cluster:42519] [[24811,0],1] odls: registering sync on child [[24811,1],4]
[xserve01.cluster:42519] [[24811,0],1] odls:sync nidmap requested for job [24811,1]
[xserve01.cluster:42519] [[24811,0],1] odls: sending sync ack to child [[24811,1],4] with 307 bytes of data
[xserve01.cluster:42519] [[24811,0],1] orte:daemon:cmd:processor: processing commands completed
[xserve01.cluster:42519] [[24811,0],1] orted_recv_cmd: received message from [[24811,1],2]
[xserve01.cluster:42519] defining message event: orted/orted_comm.c 159
[xserve01.cluster:42519] [[24811,0],1] orted_recv_cmd: reissued recv
[xserve01.cluster:42519] [[24811,0],1] orte:daemon:cmd:processor called by [[24811,1],2] for tag 1
[xserve01.cluster:42519] [[24811,0],1] orted_recv: received sync+nidmap from local proc [[24811,1],2]
[xserve01.cluster:42519] [[24811,0],1] odls: registering sync on child [[24811,1],2]
[xserve01.cluster:42519] [[24811,0],1] odls:sync nidmap requested for job [24811,1]
[xserve01.cluster:42519] [[24811,0],1] odls: sending sync ack to child [[24811,1],2] with 307 bytes of data
[xserve01.cluster:42519] [[24811,0],1] orte:daemon:cmd:processor: processing commands completed
[xserve01.cluster:42519] [[24811,0],1] orted_recv_cmd: received message from [[24811,1],6]
[xserve01.cluster:42519] defining message event: orted/orted_comm.c 159
[xserve01.cluster:42519] [[24811,0],1] orted_recv_cmd: reissued recv
[xserve01.cluster:42519] [[24811,0],1] orte:daemon:cmd:processor called by [[24811,1],6] for tag 1
[xserve01.cluster:42519] [[24811,0],1] orted_recv: received sync+nidmap from local proc [[24811,1],6]
[xserve01.cluster:42519] [[24811,0],1] odls: registering sync on child [[24811,1],6]
[xserve01.cluster:42519] [[24811,0],1] odls:sync nidmap requested for job [24811,1]
[xserve01.cluster:42519] [[24811,0],1] odls: sending sync ack to child [[24811,1],6] with 307 bytes of data
[xserve01.cluster:42519] [[24811,0],1] orte:daemon:cmd:processor: processing commands completed
[xserve01.cluster:42519] [[24811,0],1] orted_recv_cmd: received message from [[24811,1],10]
[xserve01.cluster:42519] defining message event: orted/orted_comm.c 159
[xserve01.cluster:42519] [[24811,0],1] orted_recv_cmd: reissued recv
[xserve01.cluster:42519] [[24811,0],1] orte:daemon:cmd:processor called by [[24811,1],10] for tag 1
[xserve01.cluster:42519] [[24811,0],1] orted_recv: received sync+nidmap from local proc [[24811,1],10]
[xserve01.cluster:42519] [[24811,0],1] odls: registering sync on child [[24811,1],10]
[xserve01.cluster:42519] [[24811,0],1] odls:sync nidmap requested for job [24811,1]
[xserve01.cluster:42519] [[24811,0],1] odls: sending sync ack to child [[24811,1],10] with 307 bytes of data
[xserve01.cluster:42519] [[24811,0],1] orte:daemon:cmd:processor: processing commands completed
[xserve01.cluster:42519] [[24811,0],1] orted_recv_cmd: received message from [[24811,1],8]
[xserve01.cluster:42519] defining message event: orted/orted_comm.c 159
[xserve01.cluster:42519] [[24811,0],1] orted_recv_cmd: reissued recv
[xserve01.cluster:42519] [[24811,0],1] orte:daemon:cmd:processor called by [[24811,1],8] for tag 1
[xserve01.cluster:42519] [[24811,0],1] orted_recv: received sync+nidmap from local proc [[24811,1],8]
[xserve01.cluster:42519] [[24811,0],1] odls: registering sync on child [[24811,1],8]
[xserve01.cluster:42519] [[24811,0],1] odls:sync nidmap requested for job [24811,1]
[xserve01.cluster:42519] [[24811,0],1] odls: sending sync ack to child [[24811,1],8] with 307 bytes of data
[xserve01.cluster:42519] [[24811,0],1] orte:daemon:cmd:processor: processing commands completed
[xserve02.local:42180] [[24811,0],2] orted_recv_cmd: received message from [[24811,1],5]
[xserve02.local:42180] defining message event: orted/orted_comm.c 159
[xserve02.local:42180] [[24811,0],2] orted_recv_cmd: reissued recv
[xserve02.local:42180] [[24811,0],2] orte:daemon:cmd:processor called by [[24811,1],5] for tag 1
[xserve02.local:42180] [[24811,0],2] orted_recv: received sync+nidmap from local proc [[24811,1],5]
[xserve02.local:42180] [[24811,0],2] odls: registering sync on child [[24811,1],5]
[xserve02.local:42180] [[24811,0],2] odls:sync nidmap requested for job [24811,1]
[xserve02.local:42180] [[24811,0],2] odls: sending sync ack to child [[24811,1],5] with 307 bytes of data
[xserve02.local:42180] [[24811,0],2] orte:daemon:cmd:processor: processing commands completed
[xserve02.local:42180] [[24811,0],2] orted_recv_cmd: received message from [[24811,1],1]
[xserve02.local:42180] defining message event: orted/orted_comm.c 159
[xserve02.local:42180] [[24811,0],2] orted_recv_cmd: reissued recv
[xserve02.local:42180] [[24811,0],2] orte:daemon:cmd:processor called by [[24811,1],1] for tag 1
[xserve02.local:42180] [[24811,0],2] orted_recv: received sync+nidmap from local proc [[24811,1],1]
[xserve02.local:42180] [[24811,0],2] odls: registering sync on child [[24811,1],1]
[xserve02.local:42180] [[24811,0],2] odls:sync nidmap requested for job [24811,1]
[xserve02.local:42180] [[24811,0],2] odls: sending sync ack to child [[24811,1],1] with 307 bytes of data
[xserve02.local:42180] [[24811,0],2] orte:daemon:cmd:processor: processing commands completed
[xserve02.local:42180] [[24811,0],2] orted_recv_cmd: received message from [[24811,1],3]
[xserve02.local:42180] defining message event: orted/orted_comm.c 159
[xserve02.local:42180] [[24811,0],2] orted_recv_cmd: reissued recv
[xserve02.local:42180] [[24811,0],2] orte:daemon:cmd:processor called by [[24811,1],3] for tag 1
[xserve02.local:42180] [[24811,0],2] orted_recv: received sync+nidmap from local proc [[24811,1],3]
[xserve02.local:42180] [[24811,0],2] odls: registering sync on child [[24811,1],3]
[xserve02.local:42180] [[24811,0],2] odls:sync nidmap requested for job [24811,1]
[xserve02.local:42180] [[24811,0],2] odls: sending sync ack to child [[24811,1],3] with 307 bytes of data
[xserve02.local:42180] [[24811,0],2] orte:daemon:cmd:processor: processing commands completed
[xserve01.cluster:42519] [[24811,0],1] orted_recv_cmd: received message from [[24811,1],12]
[xserve01.cluster:42519] defining message event: orted/orted_comm.c 159
[xserve01.cluster:42519] [[24811,0],1] orted_recv_cmd: reissued recv
[xserve01.cluster:42519] [[24811,0],1] orte:daemon:cmd:processor called by [[24811,1],12] for tag 1
[xserve01.cluster:42519] [[24811,0],1] orted_recv: received sync+nidmap from local proc [[24811,1],12]
[xserve01.cluster:42519] [[24811,0],1] odls: registering sync on child [[24811,1],12]
[xserve01.cluster:42519] [[24811,0],1] odls:sync nidmap requested for job [24811,1]
[xserve01.cluster:42519] [[24811,0],1] odls: sending sync ack to child [[24811,1],12] with 307 bytes of data
[xserve01.cluster:42519] [[24811,0],1] orte:daemon:cmd:processor: processing commands completed
[xserve01.cluster:42519] [[24811,0],1] orted_recv_cmd: received message from [[24811,1],14]
[xserve01.cluster:42519] defining message event: orted/orted_comm.c 159
[xserve01.cluster:42519] [[24811,0],1] orted_recv_cmd: reissued recv
[xserve01.cluster:42519] [[24811,0],1] orte:daemon:cmd:processor called by [[24811,1],14] for tag 1
[xserve01.cluster:42519] [[24811,0],1] orted_recv: received sync+nidmap from local proc [[24811,1],14]
[xserve01.cluster:42519] [[24811,0],1] odls: registering sync on child [[24811,1],14]
[xserve01.cluster:42519] [[24811,0],1] odls:sync nidmap requested for job [24811,1]
[xserve01.cluster:42519] [[24811,0],1] odls: sending sync ack to child [[24811,1],14] with 307 bytes of data
[xserve01.cluster:42519] [[24811,0],1] odls: sending contact info to HNP
[xserve01.cluster:42519] [[24811,0],1] orte:daemon:cmd:processor: processing commands completed
[saturna.cluster:17660] defining message event: base/routed_base_receive.c 153
[xserve02.local:42180] [[24811,0],2] orted_recv_cmd: received message from [[24811,1],11]
[xserve02.local:42180] defining message event: orted/orted_comm.c 159
[xserve02.local:42180] [[24811,0],2] orted_recv_cmd: reissued recv
[xserve02.local:42180] [[24811,0],2] orte:daemon:cmd:processor called by [[24811,1],11] for tag 1
[xserve02.local:42180] [[24811,0],2] orted_recv: received sync+nidmap from local proc [[24811,1],11]
[xserve02.local:42180] [[24811,0],2] odls: registering sync on child [[24811,1],11]
[xserve02.local:42180] [[24811,0],2] odls:sync nidmap requested for job [24811,1]
[xserve02.local:42180] [[24811,0],2] odls: sending sync ack to child [[24811,1],11] with 307 bytes of data
[xserve02.local:42180] [[24811,0],2] orte:daemon:cmd:processor: processing commands completed
[xserve01.cluster:42519] [[24811,0],1] orted_recv_cmd: received message from [[24811,1],2]
[xserve01.cluster:42519] defining message event: orted/orted_comm.c 159
[xserve01.cluster:42519] [[24811,0],1] orted_recv_cmd: reissued recv
[xserve02.local:42180] [[24811,0],2] orted_recv_cmd: received message from [[24811,1],7]
[xserve02.local:42180] defining message event: orted/orted_comm.c 159
[xserve02.local:42180] [[24811,0],2] orted_recv_cmd: reissued recv
[xserve02.local:42180] [[24811,0],2] orte:daemon:cmd:processor called by [[24811,1],7] for tag 1
[xserve02.local:42180] [[24811,0],2] orted_recv: received sync+nidmap from local proc [[24811,1],7]
[xserve02.local:42180] [[24811,0],2] odls: registering sync on child [[24811,1],7]
[xserve02.local:42180] [[24811,0],2] odls:sync nidmap requested for job [24811,1]
[xserve02.local:42180] [[24811,0],2] odls: sending sync ack to child [[24811,1],7] with 307 bytes of data
[xserve02.local:42180] [[24811,0],2] orte:daemon:cmd:processor: processing commands completed
[xserve01.cluster:42519] [[24811,0],1] orte:daemon:cmd:processor called by [[24811,1],2] for tag 1
[xserve01.cluster:42519] [[24811,0],1] orted_cmd: received collective data cmd
[xserve01.cluster:42519] [[24811,0],1] odls: collecting data from child [[24811,1],2]
[xserve01.cluster:42519] [[24811,0],1] orte:daemon:cmd:processor: processing commands completed
[xserve02.local:42180] [[24811,0],2] orted_recv_cmd: received message from [[24811,1],9]
[xserve02.local:42180] defining message event: orted/orted_comm.c 159
[xserve02.local:42180] [[24811,0],2] orted_recv_cmd: reissued recv
[xserve02.local:42180] [[24811,0],2] orte:daemon:cmd:processor called by [[24811,1],9] for tag 1 [xserve02.local:42180] [[24811,0],2] orted_recv: received sync+nidmap from local proc [[24811,1],9] [xserve02.local:42180] [[24811,0],2] odls: registering sync on child [[24811,1],9] [xserve02.local:42180] [[24811,0],2] odls:sync nidmap requested for job [24811,1] [xserve02.local:42180] [[24811,0],2] odls: sending sync ack to child [[24811,1],9] with 307 bytes of data [xserve02.local:42180] [[24811,0],2] orte:daemon:cmd:processor: processing commands completed [xserve02.local:42180] [[24811,0],2] orted_recv_cmd: received message from [[24811,1],13]
[xserve02.local:42180] defining message event: orted/orted_comm.c 159
[xserve02.local:42180] [[24811,0],2] orted_recv_cmd: reissued recv
[xserve02.local:42180] [[24811,0],2] orte:daemon:cmd:processor called by [[24811,1],13] for tag 1 [xserve02.local:42180] [[24811,0],2] orted_recv: received sync+nidmap from local proc [[24811,1],13] [xserve02.local:42180] [[24811,0],2] odls: registering sync on child [[24811,1],13] [xserve02.local:42180] [[24811,0],2] odls:sync nidmap requested for job [24811,1] [xserve02.local:42180] [[24811,0],2] odls: sending sync ack to child [[24811,1],13] with 307 bytes of data [xserve02.local:42180] [[24811,0],2] orte:daemon:cmd:processor: processing commands completed [xserve01.cluster:42519] [[24811,0],1] orted_recv_cmd: received message from [[24811,1],0]
[xserve01.cluster:42519] defining message event: orted/orted_comm.c 159
[xserve01.cluster:42519] [[24811,0],1] orted_recv_cmd: reissued recv
[xserve01.cluster:42519] [[24811,0],1] orte:daemon:cmd:processor called by [[24811,1],0] for tag 1 [xserve01.cluster:42519] [[24811,0],1] orted_cmd: received collective data cmd [xserve01.cluster:42519] [[24811,0],1] odls: collecting data from child [[24811,1],0] [xserve01.cluster:42519] [[24811,0],1] orte:daemon:cmd:processor: processing commands completed [xserve02.local:42180] [[24811,0],2] orted_recv_cmd: received message from [[24811,1],15]
[xserve02.local:42180] defining message event: orted/orted_comm.c 159
[xserve02.local:42180] [[24811,0],2] orted_recv_cmd: reissued recv
[xserve02.local:42180] [[24811,0],2] orte:daemon:cmd:processor called by [[24811,1],15] for tag 1 [xserve02.local:42180] [[24811,0],2] orted_recv: received sync+nidmap from local proc [[24811,1],15] [xserve02.local:42180] [[24811,0],2] odls: registering sync on child [[24811,1],15] [xserve02.local:42180] [[24811,0],2] odls:sync nidmap requested for job [24811,1] [xserve02.local:42180] [[24811,0],2] odls: sending sync ack to child [[24811,1],15] with 307 bytes of data
[xserve02.local:42180] [[24811,0],2] odls: sending contact info to HNP
[xserve02.local:42180] [[24811,0],2] orte:daemon:cmd:processor: processing commands completed
[saturna.cluster:17660] defining message event: base/routed_base_receive.c 153
[xserve01.cluster:42519] [[24811,0],1] orted_recv_cmd: received message from [[24811,1],4]
[xserve01.cluster:42519] defining message event: orted/orted_comm.c 159
[xserve01.cluster:42519] [[24811,0],1] orted_recv_cmd: reissued recv
[xserve01.cluster:42519] [[24811,0],1] orte:daemon:cmd:processor called by [[24811,1],4] for tag 1 [xserve01.cluster:42519] [[24811,0],1] orted_cmd: received collective data cmd [xserve01.cluster:42519] [[24811,0],1] odls: collecting data from child [[24811,1],4] [xserve01.cluster:42519] [[24811,0],1] orte:daemon:cmd:processor: processing commands completed [xserve01.cluster:42519] [[24811,0],1] orted_recv_cmd: received message from [[24811,1],6]
[xserve01.cluster:42519] defining message event: orted/orted_comm.c 159
[xserve01.cluster:42519] [[24811,0],1] orted_recv_cmd: reissued recv
[xserve01.cluster:42519] [[24811,0],1] orte:daemon:cmd:processor called by [[24811,1],6] for tag 1 [xserve01.cluster:42519] [[24811,0],1] orted_cmd: received collective data cmd [xserve01.cluster:42519] [[24811,0],1] odls: collecting data from child [[24811,1],6] [xserve01.cluster:42519] [[24811,0],1] orte:daemon:cmd:processor: processing commands completed [xserve01.cluster:42519] [[24811,0],1] orted_recv_cmd: received message from [[24811,1],10]
[xserve01.cluster:42519] defining message event: orted/orted_comm.c 159
[xserve01.cluster:42519] [[24811,0],1] orted_recv_cmd: reissued recv
[xserve01.cluster:42519] [[24811,0],1] orte:daemon:cmd:processor called by [[24811,1],10] for tag 1 [xserve01.cluster:42519] [[24811,0],1] orted_cmd: received collective data cmd [xserve01.cluster:42519] [[24811,0],1] odls: collecting data from child [[24811,1],10] [xserve01.cluster:42519] [[24811,0],1] orte:daemon:cmd:processor: processing commands completed [xserve01.cluster:42519] [[24811,0],1] orted_recv_cmd: received message from [[24811,1],8]
[xserve01.cluster:42519] defining message event: orted/orted_comm.c 159
[xserve01.cluster:42519] [[24811,0],1] orted_recv_cmd: reissued recv
[xserve01.cluster:42519] [[24811,0],1] orte:daemon:cmd:processor called by [[24811,1],8] for tag 1 [xserve01.cluster:42519] [[24811,0],1] orted_cmd: received collective data cmd [xserve01.cluster:42519] [[24811,0],1] odls: collecting data from child [[24811,1],8] [xserve01.cluster:42519] [[24811,0],1] orte:daemon:cmd:processor: processing commands completed [xserve02.local:42180] [[24811,0],2] orted_recv_cmd: received message from [[24811,1],5]
[xserve02.local:42180] defining message event: orted/orted_comm.c 159
[xserve02.local:42180] [[24811,0],2] orted_recv_cmd: reissued recv
[xserve02.local:42180] [[24811,0],2] orte:daemon:cmd:processor called by [[24811,1],5] for tag 1 [xserve02.local:42180] [[24811,0],2] orted_cmd: received collective data cmd [xserve02.local:42180] [[24811,0],2] odls: collecting data from child [[24811,1],5] [xserve02.local:42180] [[24811,0],2] orte:daemon:cmd:processor: processing commands completed [xserve02.local:42180] [[24811,0],2] orted_recv_cmd: received message from [[24811,1],3]
[xserve02.local:42180] defining message event: orted/orted_comm.c 159
[xserve02.local:42180] [[24811,0],2] orted_recv_cmd: reissued recv
[xserve02.local:42180] [[24811,0],2] orte:daemon:cmd:processor called by [[24811,1],3] for tag 1 [xserve02.local:42180] [[24811,0],2] orted_cmd: received collective data cmd [xserve02.local:42180] [[24811,0],2] odls: collecting data from child [[24811,1],3] [xserve02.local:42180] [[24811,0],2] orte:daemon:cmd:processor: processing commands completed [xserve01.cluster:42519] [[24811,0],1] orted_recv_cmd: received message from [[24811,1],12]
[xserve01.cluster:42519] defining message event: orted/orted_comm.c 159
[xserve01.cluster:42519] [[24811,0],1] orted_recv_cmd: reissued recv
[xserve01.cluster:42519] [[24811,0],1] orte:daemon:cmd:processor called by [[24811,1],12] for tag 1 [xserve01.cluster:42519] [[24811,0],1] orted_cmd: received collective data cmd [xserve01.cluster:42519] [[24811,0],1] odls: collecting data from child [[24811,1],12] [xserve01.cluster:42519] [[24811,0],1] orte:daemon:cmd:processor: processing commands completed [xserve02.local:42180] [[24811,0],2] orted_recv_cmd: received message from [[24811,1],1]
[xserve02.local:42180] defining message event: orted/orted_comm.c 159
[xserve02.local:42180] [[24811,0],2] orted_recv_cmd: reissued recv
[xserve02.local:42180] [[24811,0],2] orte:daemon:cmd:processor called by [[24811,1],1] for tag 1 [xserve02.local:42180] [[24811,0],2] orted_cmd: received collective data cmd [xserve02.local:42180] [[24811,0],2] odls: collecting data from child [[24811,1],1] [xserve02.local:42180] [[24811,0],2] orte:daemon:cmd:processor: processing commands completed [saturna.cluster:17660] [[24811,0],0] orted_recv_cmd: received message from [[24811,0],1]
[saturna.cluster:17660] defining message event: orted/orted_comm.c 159
[xserve01.cluster:42519] [[24811,0],1] orted_recv_cmd: received message from [[24811,1],14]
[xserve01.cluster:42519] defining message event: orted/orted_comm.c 159
[xserve01.cluster:42519] [[24811,0],1] orted_recv_cmd: reissued recv
[xserve01.cluster:42519] [[24811,0],1] orte:daemon:cmd:processor called by [[24811,1],14] for tag 1 [xserve01.cluster:42519] [[24811,0],1] orted_cmd: received collective data cmd [xserve01.cluster:42519] [[24811,0],1] odls: collecting data from child [[24811,1],14]
[xserve01.cluster:42519] [[24811,0],1] odls: executing collective
[xserve01.cluster:42519] [[24811,0],1] odls: daemon collective called
[xserve01.cluster:42519] [[24811,0],1] odls: daemon collective for job [24811,1] from [[24811,0],1] type 2 num_collected 1 num_participating 1 num_contributors 8
[xserve01.cluster:42519] [[24811,0],1] odls: daemon collective not the HNP - sending to parent [[24811,0],0]
[saturna.cluster:17660] [[24811,0],0] orted_recv_cmd: reissued recv
[saturna.cluster:17660] [[24811,0],0] orte:daemon:cmd:processor called by [[24811,0],1] for tag 1
[xserve01.cluster:42519] [[24811,0],1] odls: collective completed
[xserve01.cluster:42519] [[24811,0],1] orte:daemon:cmd:processor: processing commands completed [saturna.cluster:17660] [[24811,0],0] orted_cmd: received collective data cmd
[saturna.cluster:17660] [[24811,0],0] odls: daemon collective called
[saturna.cluster:17660] [[24811,0],0] odls: daemon collective for job [24811,1] from [[24811,0],1] type 2 num_collected 1 num_participating 2 num_contributors 8 [saturna.cluster:17660] [[24811,0],0] orte:daemon:cmd:processor: processing commands completed [xserve02.local:42180] [[24811,0],2] orted_recv_cmd: received message from [[24811,1],9]
[xserve02.local:42180] defining message event: orted/orted_comm.c 159
[xserve02.local:42180] [[24811,0],2] orted_recv_cmd: reissued recv
[xserve02.local:42180] [[24811,0],2] orte:daemon:cmd:processor called by [[24811,1],9] for tag 1 [xserve02.local:42180] [[24811,0],2] orted_cmd: received collective data cmd [xserve02.local:42180] [[24811,0],2] odls: collecting data from child [[24811,1],9] [xserve02.local:42180] [[24811,0],2] orte:daemon:cmd:processor: processing commands completed [xserve02.local:42180] [[24811,0],2] orted_recv_cmd: received message from [[24811,1],13]
[xserve02.local:42180] defining message event: orted/orted_comm.c 159
[xserve02.local:42180] [[24811,0],2] orted_recv_cmd: reissued recv
[xserve02.local:42180] [[24811,0],2] orte:daemon:cmd:processor called by [[24811,1],13] for tag 1 [xserve02.local:42180] [[24811,0],2] orted_cmd: received collective data cmd [xserve02.local:42180] [[24811,0],2] odls: collecting data from child [[24811,1],13] [xserve02.local:42180] [[24811,0],2] orte:daemon:cmd:processor: processing commands completed [xserve02.local:42180] [[24811,0],2] orted_recv_cmd: received message from [[24811,1],7]
[xserve02.local:42180] defining message event: orted/orted_comm.c 159
[xserve02.local:42180] [[24811,0],2] orted_recv_cmd: reissued recv
[xserve02.local:42180] [[24811,0],2] orte:daemon:cmd:processor called by [[24811,1],7] for tag 1 [xserve02.local:42180] [[24811,0],2] orted_cmd: received collective data cmd [xserve02.local:42180] [[24811,0],2] odls: collecting data from child [[24811,1],7] [xserve02.local:42180] [[24811,0],2] orte:daemon:cmd:processor: processing commands completed [xserve02.local:42180] [[24811,0],2] orted_recv_cmd: received message from [[24811,1],11]
[xserve02.local:42180] defining message event: orted/orted_comm.c 159
[xserve02.local:42180] [[24811,0],2] orted_recv_cmd: reissued recv
[xserve02.local:42180] [[24811,0],2] orte:daemon:cmd:processor called by [[24811,1],11] for tag 1 [xserve02.local:42180] [[24811,0],2] orted_cmd: received collective data cmd [xserve02.local:42180] [[24811,0],2] odls: collecting data from child [[24811,1],11] [xserve02.local:42180] [[24811,0],2] orte:daemon:cmd:processor: processing commands completed [xserve02.local:42180] [[24811,0],2] orted_recv_cmd: received message from [[24811,1],15]
[xserve02.local:42180] defining message event: orted/orted_comm.c 159
[xserve02.local:42180] [[24811,0],2] orted_recv_cmd: reissued recv
[xserve02.local:42180] [[24811,0],2] orte:daemon:cmd:processor called by [[24811,1],15] for tag 1 [xserve02.local:42180] [[24811,0],2] orted_cmd: received collective data cmd [xserve02.local:42180] [[24811,0],2] odls: collecting data from child [[24811,1],15]
[xserve02.local:42180] [[24811,0],2] odls: executing collective
[xserve02.local:42180] [[24811,0],2] odls: daemon collective called
[xserve02.local:42180] [[24811,0],2] odls: daemon collective for job [24811,1] from [[24811,0],2] type 2 num_collected 1 num_participating 1 num_contributors 8 [xserve02.local:42180] [[24811,0],2] odls: daemon collective not the HNP - sending to parent [[24811,0],0]
[xserve02.local:42180] [[24811,0],2] odls: collective completed
[xserve02.local:42180] [[24811,0],2] orte:daemon:cmd:processor: processing commands completed [saturna.cluster:17660] [[24811,0],0] orted_recv_cmd: received message from [[24811,0],2]
[saturna.cluster:17660] defining message event: orted/orted_comm.c 159
[saturna.cluster:17660] [[24811,0],0] orted_recv_cmd: reissued recv
[saturna.cluster:17660] [[24811,0],0] orte:daemon:cmd:processor called by [[24811,0],2] for tag 1 [saturna.cluster:17660] [[24811,0],0] orted_cmd: received collective data cmd
[saturna.cluster:17660] [[24811,0],0] odls: daemon collective called
[saturna.cluster:17660] [[24811,0],0] odls: daemon collective for job [24811,1] from [[24811,0],2] type 2 num_collected 2 num_participating 2 num_contributors 16 [saturna.cluster:17660] [[24811,0],0] odls: daemon collective HNP - xcasting to job [24811,1]
[saturna.cluster:17660] defining message event: grpcomm_bad_module.c 183
[saturna.cluster:17660] [[24811,0],0] orte:daemon:cmd:processor: processing commands completed [saturna.cluster:17660] [[24811,0],0] orte:daemon:cmd:processor called by [[24811,0],0] for tag 1 [saturna.cluster:17660] [[24811,0],0] orted_cmd: received message_local_procs [saturna.cluster:17660] [[24811,0],0] orted:comm:message_local_procs delivering message to job [24811,1] tag 15
[saturna.cluster:17660] [[24811,0],0] orte:daemon:send_relay
[saturna.cluster:17660] [[24811,0],0] orte:daemon:send_relay sending relay msg to 1 [saturna.cluster:17660] [[24811,0],0] orte:daemon:send_relay sending relay msg to 2 [xserve02.local:42180] [[24811,0],2] orted_recv_cmd: received message from [[24811,0],0]
[xserve02.local:42180] defining message event: orted/orted_comm.c 159
[xserve02.local:42180] [[24811,0],2] orted_recv_cmd: reissued recv
[xserve02.local:42180] [[24811,0],2] orte:daemon:cmd:processor called by [[24811,0],0] for tag 1 [xserve02.local:42180] [[24811,0],2] orted_cmd: received message_local_procs [xserve02.local:42180] [[24811,0],2] orted:comm:message_local_procs delivering message to job [24811,1] tag 15 [xserve02.local:42180] [[24811,0],2] odls: sending message to tag 15 on child [[24811,1],1] [xserve02.local:42180] [[24811,0],2] odls: sending message to tag 15 on child [[24811,1],3] [xserve02.local:42180] [[24811,0],2] odls: sending message to tag 15 on child [[24811,1],5] [xserve02.local:42180] [[24811,0],2] odls: sending message to tag 15 on child [[24811,1],7] [xserve02.local:42180] [[24811,0],2] odls: sending message to tag 15 on child [[24811,1],9] [xserve02.local:42180] [[24811,0],2] odls: sending message to tag 15 on child [[24811,1],11] [xserve02.local:42180] [[24811,0],2] odls: sending message to tag 15 on child [[24811,1],13] [xserve01.cluster:42519] [[24811,0],1] orted_recv_cmd: received message from [[24811,0],0]
[xserve01.cluster:42519] defining message event: orted/orted_comm.c 159
[xserve01.cluster:42519] [[24811,0],1] orted_recv_cmd: reissued recv
[xserve01.cluster:42519] [[24811,0],1] orte:daemon:cmd:processor called by [[24811,0],0] for tag 1
[xserve01.cluster:42519] [[24811,0],1] orted_cmd: received message_local_procs
[xserve01.cluster:42519] [[24811,0],1] orted:comm:message_local_procs delivering message to job [24811,1] tag 15
[xserve01.cluster:42519] [[24811,0],1] odls: sending message to tag 15 on child [[24811,1],0]
[xserve01.cluster:42519] [[24811,0],1] odls: sending message to tag 15 on child [[24811,1],2]
[xserve01.cluster:42519] [[24811,0],1] odls: sending message to tag 15 on child [[24811,1],4]
[xserve01.cluster:42519] [[24811,0],1] odls: sending message to tag 15 on child [[24811,1],6]
[xserve01.cluster:42519] [[24811,0],1] odls: sending message to tag 15 on child [[24811,1],8]
[xserve02.local:42180] [[24811,0],2] odls: sending message to tag 15 on child [[24811,1],15]
[xserve02.local:42180] [[24811,0],2] orte:daemon:send_relay
[xserve02.local:42180] [[24811,0],2] orte:daemon:send_relay - recipient list is empty!
[xserve01.cluster:42519] [[24811,0],1] odls: sending message to tag 15 on child [[24811,1],10]
[xserve01.cluster:42519] [[24811,0],1] odls: sending message to tag 15 on child [[24811,1],12]
[xserve01.cluster:42519] [[24811,0],1] odls: sending message to tag 15 on child [[24811,1],14]
[xserve01.cluster:42519] [[24811,0],1] orte:daemon:send_relay
[xserve01.cluster:42519] [[24811,0],1] orte:daemon:send_relay - recipient list is empty! [xserve02.local:42180] [[24811,0],2] orted_recv_cmd: received message from [[24811,1],5]
[xserve02.local:42180] defining message event: orted/orted_comm.c 159
[xserve02.local:42180] [[24811,0],2] orted_recv_cmd: reissued recv
[xserve02.local:42180] [[24811,0],2] orte:daemon:cmd:processor called by [[24811,1],5] for tag 1 [xserve02.local:42180] [[24811,0],2] orted_cmd: received collective data cmd [xserve02.local:42180] [[24811,0],2] odls: collecting data from child [[24811,1],5] [xserve02.local:42180] [[24811,0],2] orte:daemon:cmd:processor: processing commands completed [xserve02.local:42180] [[24811,0],2] orted_recv_cmd: received message from [[24811,1],13]
[xserve02.local:42180] defining message event: orted/orted_comm.c 159
[xserve02.local:42180] [[24811,0],2] orted_recv_cmd: reissued recv
[xserve02.local:42180] [[24811,0],2] orte:daemon:cmd:processor called by [[24811,1],13] for tag 1 [xserve02.local:42180] [[24811,0],2] orted_cmd: received collective data cmd [xserve02.local:42180] [[24811,0],2] odls: collecting data from child [[24811,1],13] [xserve02.local:42180] [[24811,0],2] orte:daemon:cmd:processor: processing commands completed [xserve02.local:42180] [[24811,0],2] orted_recv_cmd: received message from [[24811,1],7]
[xserve02.local:42180] defining message event: orted/orted_comm.c 159
[xserve02.local:42180] [[24811,0],2] orted_recv_cmd: reissued recv
[xserve02.local:42180] [[24811,0],2] orte:daemon:cmd:processor called by [[24811,1],7] for tag 1 [xserve02.local:42180] [[24811,0],2] orted_cmd: received collective data cmd [xserve02.local:42180] [[24811,0],2] odls: collecting data from child [[24811,1],7] [xserve02.local:42180] [[24811,0],2] orte:daemon:cmd:processor: processing commands completed [xserve02.local:42180] [[24811,0],2] orted_recv_cmd: received message from [[24811,1],9]
[xserve02.local:42180] defining message event: orted/orted_comm.c 159
[xserve02.local:42180] [[24811,0],2] orted_recv_cmd: reissued recv
[xserve02.local:42180] [[24811,0],2] orte:daemon:cmd:processor called by [[24811,1],9] for tag 1 [xserve02.local:42180] [[24811,0],2] orted_cmd: received collective data cmd [xserve02.local:42180] [[24811,0],2] odls: collecting data from child [[24811,1],9] [xserve01.cluster:42519] [[24811,0],1] orted_recv_cmd: received message from [[24811,1],12]
[xserve01.cluster:42519] defining message event: orted/orted_comm.c 159
[xserve01.cluster:42519] [[24811,0],1] orted_recv_cmd: reissued recv
[xserve01.cluster:42519] [[24811,0],1] orte:daemon:cmd:processor called by [[24811,1],12] for tag 1 [xserve01.cluster:42519] [[24811,0],1] orted_cmd: received collective data cmd [xserve02.local:42180] [[24811,0],2] orte:daemon:cmd:processor: processing commands completed [xserve01.cluster:42519] [[24811,0],1] odls: collecting data from child [[24811,1],12] [xserve01.cluster:42519] [[24811,0],1] orte:daemon:cmd:processor: processing commands completed [xserve01.cluster:42519] [[24811,0],1] orted_recv_cmd: received message from [[24811,1],10]
[xserve01.cluster:42519] defining message event: orted/orted_comm.c 159
[xserve01.cluster:42519] [[24811,0],1] orted_recv_cmd: reissued recv
[xserve01.cluster:42519] [[24811,0],1] orte:daemon:cmd:processor called by [[24811,1],10] for tag 1 [xserve01.cluster:42519] [[24811,0],1] orted_cmd: received collective data cmd [xserve01.cluster:42519] [[24811,0],1] odls: collecting data from child [[24811,1],10] [xserve01.cluster:42519] [[24811,0],1] orte:daemon:cmd:processor: processing commands completed [xserve02.local:42180] [[24811,0],2] orted_recv_cmd: received message from [[24811,1],11]
[xserve02.local:42180] defining message event: orted/orted_comm.c 159
[xserve02.local:42180] [[24811,0],2] orted_recv_cmd: reissued recv
[xserve02.local:42180] [[24811,0],2] orte:daemon:cmd:processor called by [[24811,1],11] for tag 1 [xserve02.local:42180] [[24811,0],2] orted_cmd: received collective data cmd [xserve02.local:42180] [[24811,0],2] odls: collecting data from child [[24811,1],11] [xserve02.local:42180] [[24811,0],2] orte:daemon:cmd:processor: processing commands completed [xserve02.local:42180] [[24811,0],2] orted_recv_cmd: received message from [[24811,1],15]
[xserve02.local:42180] defining message event: orted/orted_comm.c 159
[xserve02.local:42180] [[24811,0],2] orted_recv_cmd: reissued recv
[xserve02.local:42180] [[24811,0],2] orte:daemon:cmd:processor called by [[24811,1],15] for tag 1 [xserve02.local:42180] [[24811,0],2] orted_cmd: received collective data cmd [xserve02.local:42180] [[24811,0],2] odls: collecting data from child [[24811,1],15] [xserve02.local:42180] [[24811,0],2] orte:daemon:cmd:processor: processing commands completed [xserve01.cluster:42519] [[24811,0],1] orted_recv_cmd: received message from [[24811,1],4]
[xserve01.cluster:42519] defining message event: orted/orted_comm.c 159
[xserve01.cluster:42519] [[24811,0],1] orted_recv_cmd: reissued recv
[xserve01.cluster:42519] [[24811,0],1] orte:daemon:cmd:processor called by [[24811,1],4] for tag 1 [xserve01.cluster:42519] [[24811,0],1] orted_cmd: received collective data cmd [xserve01.cluster:42519] [[24811,0],1] odls: collecting data from child [[24811,1],4] [xserve01.cluster:42519] [[24811,0],1] orte:daemon:cmd:processor: processing commands completed [xserve01.cluster:42519] [[24811,0],1] orted_recv_cmd: received message from [[24811,1],6]
[xserve01.cluster:42519] defining message event: orted/orted_comm.c 159
[xserve01.cluster:42519] [[24811,0],1] orted_recv_cmd: reissued recv
[xserve01.cluster:42519] [[24811,0],1] orte:daemon:cmd:processor called by [[24811,1],6] for tag 1 [xserve01.cluster:42519] [[24811,0],1] orted_cmd: received collective data cmd [xserve01.cluster:42519] [[24811,0],1] odls: collecting data from child [[24811,1],6] [xserve01.cluster:42519] [[24811,0],1] orte:daemon:cmd:processor: processing commands completed [xserve01.cluster:42519] [[24811,0],1] orted_recv_cmd: received message from [[24811,1],8]
[xserve01.cluster:42519] defining message event: orted/orted_comm.c 159
[xserve01.cluster:42519] [[24811,0],1] orted_recv_cmd: reissued recv
[xserve01.cluster:42519] [[24811,0],1] orte:daemon:cmd:processor called by [[24811,1],8] for tag 1 [xserve01.cluster:42519] [[24811,0],1] orted_cmd: received collective data cmd [xserve01.cluster:42519] [[24811,0],1] odls: collecting data from child [[24811,1],8] [xserve01.cluster:42519] [[24811,0],1] orte:daemon:cmd:processor: processing commands completed [xserve01.cluster:42519] [[24811,0],1] orted_recv_cmd: received message from [[24811,1],14]
[xserve01.cluster:42519] defining message event: orted/orted_comm.c 159
[xserve01.cluster:42519] [[24811,0],1] orted_recv_cmd: reissued recv
[xserve01.cluster:42519] [[24811,0],1] orte:daemon:cmd:processor called by [[24811,1],14] for tag 1 [xserve01.cluster:42519] [[24811,0],1] orted_cmd: received collective data cmd [xserve01.cluster:42519] [[24811,0],1] odls: collecting data from child [[24811,1],14] [xserve01.cluster:42519] [[24811,0],1] orte:daemon:cmd:processor: processing commands completed [xserve02.local:42180] [[24811,0],2] orted_recv_cmd: received message from [[24811,1],1]
[xserve02.local:42180] defining message event: orted/orted_comm.c 159
[xserve02.local:42180] [[24811,0],2] orted_recv_cmd: reissued recv
[xserve02.local:42180] [[24811,0],2] orte:daemon:cmd:processor called by [[24811,1],1] for tag 1 [xserve02.local:42180] [[24811,0],2] orted_cmd: received collective data cmd [xserve02.local:42180] [[24811,0],2] odls: collecting data from child [[24811,1],1] [xserve02.local:42180] [[24811,0],2] orte:daemon:cmd:processor: processing commands completed [xserve02.local:42180] [[24811,0],2] orted_recv_cmd: received message from [[24811,1],3]
[xserve02.local:42180] defining message event: orted/orted_comm.c 159
[xserve02.local:42180] [[24811,0],2] orted_recv_cmd: reissued recv
[xserve02.local:42180] [[24811,0],2] orte:daemon:cmd:processor called by [[24811,1],3] for tag 1 [xserve02.local:42180] [[24811,0],2] orted_cmd: received collective data cmd [xserve02.local:42180] [[24811,0],2] odls: collecting data from child [[24811,1],3]
[xserve02.local:42180] [[24811,0],2] odls: executing collective
[xserve02.local:42180] [[24811,0],2] odls: daemon collective called
[saturna.cluster:17660] [[24811,0],0] orted_recv_cmd: received message from [[24811,0],2] [xserve02.local:42180] [[24811,0],2] odls: daemon collective for job [24811,1] from [[24811,0],2] type 1 num_collected 1 num_participating 1 num_contributors 8 [xserve02.local:42180] [[24811,0],2] odls: daemon collective not the HNP - sending to parent [[24811,0],0]
[xserve02.local:42180] [[24811,0],2] odls: collective completed
[xserve02.local:42180] [[24811,0],2] orte:daemon:cmd:processor: processing commands completed
[saturna.cluster:17660] defining message event: orted/orted_comm.c 159
[saturna.cluster:17660] [[24811,0],0] orted_recv_cmd: reissued recv
[saturna.cluster:17660] [[24811,0],0] orted_recv_cmd: received message from [[24811,0],1]
[saturna.cluster:17660] defining message event: orted/orted_comm.c 159
[saturna.cluster:17660] [[24811,0],0] orted_recv_cmd: reissued recv
[xserve01.cluster:42519] [[24811,0],1] orted_recv_cmd: received message from [[24811,1],0]
[xserve01.cluster:42519] defining message event: orted/orted_comm.c 159
[xserve01.cluster:42519] [[24811,0],1] orted_recv_cmd: reissued recv
[xserve01.cluster:42519] [[24811,0],1] orte:daemon:cmd:processor called by [[24811,1],0] for tag 1 [xserve01.cluster:42519] [[24811,0],1] orted_cmd: received collective data cmd [xserve01.cluster:42519] [[24811,0],1] odls: collecting data from child [[24811,1],0] [xserve01.cluster:42519] [[24811,0],1] orte:daemon:cmd:processor: processing commands completed [xserve01.cluster:42519] [[24811,0],1] orted_recv_cmd: received message from [[24811,1],2]
[xserve01.cluster:42519] defining message event: orted/orted_comm.c 159
[xserve01.cluster:42519] [[24811,0],1] orted_recv_cmd: reissued recv
[xserve01.cluster:42519] [[24811,0],1] orte:daemon:cmd:processor called by [[24811,1],2] for tag 1 [xserve01.cluster:42519] [[24811,0],1] orted_cmd: received collective data cmd [xserve01.cluster:42519] [[24811,0],1] odls: collecting data from child [[24811,1],2]
[xserve01.cluster:42519] [[24811,0],1] odls: executing collective
[xserve01.cluster:42519] [[24811,0],1] odls: daemon collective called
[saturna.cluster:17660] [[24811,0],0] orte:daemon:cmd:processor called by [[24811,0],2] for tag 1 [saturna.cluster:17660] [[24811,0],0] orted_cmd: received collective data cmd
[saturna.cluster:17660] [[24811,0],0] odls: daemon collective called
[saturna.cluster:17660] [[24811,0],0] odls: daemon collective for job [24811,1] from [[24811,0],2] type 1 num_collected 1 num_participating 2 num_contributors 8 [saturna.cluster:17660] [[24811,0],0] orte:daemon:cmd:processor: processing commands completed [saturna.cluster:17660] [[24811,0],0] orte:daemon:cmd:processor called by [[24811,0],1] for tag 1 [xserve01.cluster:42519] [[24811,0],1] odls: daemon collective for job [24811,1] from [[24811,0],1] type 1 num_collected 1 num_participating 1 num_contributors 8 [xserve01.cluster:42519] [[24811,0],1] odls: daemon collective not the HNP - sending to parent [[24811,0],0]
[xserve01.cluster:42519] [[24811,0],1] odls: collective completed
[xserve01.cluster:42519] [[24811,0],1] orte:daemon:cmd:processor: processing commands completed [saturna.cluster:17660] [[24811,0],0] orted_cmd: received collective data cmd
[saturna.cluster:17660] [[24811,0],0] odls: daemon collective called
[saturna.cluster:17660] [[24811,0],0] odls: daemon collective for job [24811,1] from [[24811,0],1] type 1 num_collected 2 num_participating 2 num_contributors 16 [saturna.cluster:17660] [[24811,0],0] odls: daemon collective HNP - xcasting to job [24811,1]
[saturna.cluster:17660] defining message event: grpcomm_bad_module.c 183
[saturna.cluster:17660] [[24811,0],0] orte:daemon:cmd:processor: processing commands completed [saturna.cluster:17660] [[24811,0],0] orte:daemon:cmd:processor called by [[24811,0],0] for tag 1 [saturna.cluster:17660] [[24811,0],0] orted_cmd: received message_local_procs [saturna.cluster:17660] [[24811,0],0] orted:comm:message_local_procs delivering message to job [24811,1] tag 17
[saturna.cluster:17660] [[24811,0],0] orte:daemon:send_relay
[saturna.cluster:17660] [[24811,0],0] orte:daemon:send_relay sending relay msg to 1 [saturna.cluster:17660] [[24811,0],0] orte:daemon:send_relay sending relay msg to 2 [xserve01.cluster:42519] [[24811,0],1] orted_recv_cmd: received message from [[24811,0],0]
[xserve01.cluster:42519] defining message event: orted/orted_comm.c 159
[xserve01.cluster:42519] [[24811,0],1] orted_recv_cmd: reissued recv
[xserve01.cluster:42519] [[24811,0],1] orte:daemon:cmd:processor called by [[24811,0],0] for tag 1 [xserve01.cluster:42519] [[24811,0],1] orted_cmd: received message_local_procs [xserve01.cluster:42519] [[24811,0],1] orted:comm:message_local_procs delivering message to job [24811,1] tag 17 [xserve01.cluster:42519] [[24811,0],1] odls: sending message to tag 17 on child [[24811,1],0] [xserve01.cluster:42519] [[24811,0],1] odls: sending message to tag 17 on child [[24811,1],2] [xserve01.cluster:42519] [[24811,0],1] odls: sending message to tag 17 on child [[24811,1],4] [xserve01.cluster:42519] [[24811,0],1] odls: sending message to tag 17 on child [[24811,1],6] [xserve02.local:42180] [[24811,0],2] orted_recv_cmd: received message from [[24811,0],0]
[xserve02.local:42180] defining message event: orted/orted_comm.c 159
[xserve02.local:42180] [[24811,0],2] orted_recv_cmd: reissued recv
[xserve02.local:42180] [[24811,0],2] orte:daemon:cmd:processor called by [[24811,0],0] for tag 1 [xserve02.local:42180] [[24811,0],2] orted_cmd: received message_local_procs [xserve02.local:42180] [[24811,0],2] orted:comm:message_local_procs delivering message to job [24811,1] tag 17 [xserve02.local:42180] [[24811,0],2] odls: sending message to tag 17 on child [[24811,1],1] [xserve02.local:42180] [[24811,0],2] odls: sending message to tag 17 on child [[24811,1],3] [xserve02.local:42180] [[24811,0],2] odls: sending message to tag 17 on child [[24811,1],5] [xserve02.local:42180] [[24811,0],2] odls: sending message to tag 17 on child [[24811,1],7] [xserve02.local:42180] [[24811,0],2] odls: sending message to tag 17 on child [[24811,1],9] [xserve01.cluster:42519] [[24811,0],1] odls: sending message to tag 17 on child [[24811,1],8] [xserve01.cluster:42519] [[24811,0],1] odls: sending message to tag 17 on child [[24811,1],10] [xserve01.cluster:42519] [[24811,0],1] odls: sending message to tag 17 on child [[24811,1],12] [xserve01.cluster:42519] [[24811,0],1] odls: sending message to tag 17 on child [[24811,1],14]
[xserve01.cluster:42519] [[24811,0],1] orte:daemon:send_relay
[xserve01.cluster:42519] [[24811,0],1] orte:daemon:send_relay - recipient list is empty! [xserve02.local:42180] [[24811,0],2] odls: sending message to tag 17 on child [[24811,1],11] [xserve02.local:42180] [[24811,0],2] odls: sending message to tag 17 on child [[24811,1],13] [xserve02.local:42180] [[24811,0],2] odls: sending message to tag 17 on child [[24811,1],15]
[xserve02.local:42180] [[24811,0],2] orte:daemon:send_relay
[xserve02.local:42180] [[24811,0],2] orte:daemon:send_relay - recipient list is empty!
[saturna.cluster:17660] defining message event: iof_hnp_receive.c 227
[xserve02.local][[24811,1],1][btl_tcp_endpoint.c:486:mca_btl_tcp_endpoint_recv_connect_ack] received unexpected process identifier [[24811,1],2]
[saturna.cluster:17660] defining message event: iof_hnp_receive.c 227
[xserve01.cluster][[24811,1],2][btl_tcp_endpoint.c:486:mca_btl_tcp_endpoint_recv_connect_ack] received unexpected process identifier [[24811,1],5]

