I'm attempting to use MPI over TCP; the attached (rather trivial) code gets stuck in MPI_Send. The tcpdump output indicates that the TCP connection is made successfully to the right port, but the actual data doesn't appear to be sent.
I'm beginning to suspect that there's some basic problem with my configuration, or an underlying bug in TCP message passing in MPI. Any suggestions for things to try (or a response indicating that MPI over TCP actually works, and that it's some problem with my setup) would be appreciated.

The relevant portion of the hostfile looks like this:

  archimedes.int.donarmstrong.com slots=2
  krel.int.donarmstrong.com slots=8

and the output of the run and tcpdump is attached.

Thanks in advance.

Don Armstrong

-- 
[T]he question of whether Machines Can Think, [...] is about as relevant as the question of whether Submarines Can Swim.
 -- Edsger W. Dijkstra "The threats to computing science"

http://www.donarmstrong.com
http://rzlab.ucr.edu
/* The Parallel Hello World Program */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank;
    int size;
    MPI_Status status;
    int i;
    int buffer[10];

    printf("Starting\n");
    MPI_Init(&argc, &argv);
    printf("Calling MPI_Init\n");   /* note: printed after MPI_Init has returned */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank > 0) {
        /* Every non-zero rank waits for a message from rank 0, then echoes it back. */
        for (i = 0; i < 10; i++) {
            buffer[i] = i + 1;
        }
        printf("Hello World from Node rank %d\n", rank);
        printf("Rank %d is receiving from 0\n", rank);
        MPI_Recv(buffer, 10, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
        printf("Rank %d is sending to 0\n", rank);
        MPI_Send(buffer, 10, MPI_INT, 0, 0, MPI_COMM_WORLD);
    } else {
        /* Rank 0 sends to every other rank, then collects the replies in rank order. */
        for (i = 0; i < 10; i++) {
            buffer[i] = i;
        }
        for (i = 1; i < size; i++) {
            printf("Sending to rank %d from 0\n", i);
            MPI_Send(buffer, 10, MPI_INT, i, 0, MPI_COMM_WORLD);
        }
        for (i = 1; i < size; i++) {
            printf("Rank 0 is receiving from %d\n", i);
            MPI_Recv(buffer, 10, MPI_INT, i, 0, MPI_COMM_WORLD, &status);
        }
    }

    MPI_Finalize();
    return 0;
}
[archimedes:29730] progress: initialized event flag to: 5 [archimedes:29730] progress: initialized yield_when_idle to: true [archimedes:29730] progress: initialized num users to: 0 [archimedes:29730] progress: initialized poll rate to: 20000000 [archimedes:29730] progress: event_users_increment setting count to 1 [archimedes:29730] mca: base: components_open: Looking for rml components [archimedes:29730] mca: base: components_open: opening rml components [archimedes:29730] mca: base: components_open: found loaded component ftrm [archimedes:29730] mca: base: components_open: component ftrm has no register function [archimedes:29730] mca: base: components_open: component ftrm open function successful [archimedes:29730] mca: base: components_open: found loaded component oob [archimedes:29730] mca: base: components_open: component oob has no register function [archimedes:29730] mca: base: components_open: Looking for oob components [archimedes:29730] mca: base: components_open: opening oob components [archimedes:29730] mca: base: components_open: found loaded component tcp [archimedes:29730] mca: base: components_open: component tcp has no register function [archimedes:29730] mca: base: components_open: component tcp open function successful [archimedes:29730] mca: base: components_open: component oob open function successful [archimedes:29730] orte_rml_base_select: initializing rml component ftrm [archimedes:29730] orte_rml_base_select: initializing rml component oob [archimedes:29730] orte_rml_base_select: module ftrm unloaded [archimedes:29730] mca: base: components_open: Looking for grpcomm components [archimedes:29730] mca: base: components_open: opening grpcomm components [archimedes:29730] mca: base: components_open: found loaded component bad [archimedes:29730] mca: base: components_open: component bad has no register function [archimedes:29730] mca: base: components_open: component bad open function successful [archimedes:29730] mca: base: components_open: 
found loaded component basic [archimedes:29730] mca: base: components_open: component basic has no register function [archimedes:29730] mca: base: components_open: component basic open function successful [archimedes:29730] mca:base:select: Auto-selecting grpcomm components [archimedes:29730] mca:base:select:(grpcomm) Querying component [bad] [archimedes:29730] mca:base:select:(grpcomm) Query of component [bad] set priority to 10 [archimedes:29730] mca:base:select:(grpcomm) Querying component [basic] [archimedes:29730] mca:base:select:(grpcomm) Query of component [basic] set priority to 1 [archimedes:29730] mca:base:select:(grpcomm) Selected component [bad] [archimedes:29730] mca: base: close: component basic closed [archimedes:29730] mca: base: close: unloading component basic [archimedes:29730] [[60576,0],0] setting up session dir with tmpdir: UNDEF host archimedes [archimedes:29730] procdir: /home/don/tmp/openmpi-sessions-don@archimedes_0/60576/0/0 [archimedes:29730] jobdir: /home/don/tmp/openmpi-sessions-don@archimedes_0/60576/0 [archimedes:29730] top: openmpi-sessions-don@archimedes_0 [archimedes:29730] tmp: /home/don/tmp [archimedes:29730] [[60576,0],0] writing contact file /home/don/tmp/openmpi-sessions-don@archimedes_0/60576/contact.txt [archimedes:29730] [[60576,0],0] wrote contact file [archimedes:29730] progress: set_event_flag setting to 1 [archimedes:29730] progress: progress_set_yield_when_idle to false [archimedes:29730] [[60576,0],0] hostfile: checking hostfile /etc/openmpi/openmpi-default-hostfile for nodes [archimedes:29730] [[60576,0],0] hostfile: node archimedes is being included - keep all is FALSE [archimedes:29730] [[60576,0],0] hostfile: node krel.int.donarmstrong.com is being included - keep all is FALSE [archimedes:29730] [[60576,0],0] hostfile: filtering nodes through hostfile /etc/openmpi/openmpi-default-hostfile [archimedes:29730] [[60576,0],0] hostfile: node archimedes is being included - keep all is FALSE [archimedes:29730] 
[[60576,0],0] hostfile: node krel.int.donarmstrong.com is being included - keep all is FALSE [archimedes:29730] progressed_wait: ../../../../../orte/mca/plm/base/plm_base_launch_support.c 459 [krel:05504] progress: initialized event flag to: 5 [krel:05504] progress: initialized yield_when_idle to: true [krel:05504] progress: initialized num users to: 0 [krel:05504] progress: initialized poll rate to: 33910000 [krel:05504] progress: event_users_increment setting count to 1 [krel:05504] mca: base: components_open: Looking for rml components [krel:05504] mca: base: components_open: opening rml components [krel:05504] mca: base: components_open: found loaded component ftrm [krel:05504] mca: base: components_open: component ftrm has no register function [krel:05504] mca: base: components_open: component ftrm open function successful [krel:05504] mca: base: components_open: found loaded component oob [krel:05504] mca: base: components_open: component oob has no register function [krel:05504] mca: base: components_open: Looking for oob components [krel:05504] mca: base: components_open: opening oob components [krel:05504] mca: base: components_open: found loaded component tcp [krel:05504] mca: base: components_open: component tcp has no register function [krel:05504] mca: base: components_open: component tcp open function successful [krel:05504] mca: base: components_open: component oob open function successful [krel:05504] orte_rml_base_select: initializing rml component ftrm [krel:05504] orte_rml_base_select: initializing rml component oob [krel:05504] orte_rml_base_select: module ftrm unloaded [krel:05504] mca: base: components_open: Looking for grpcomm components [krel:05504] mca: base: components_open: opening grpcomm components [krel:05504] mca: base: components_open: found loaded component bad [krel:05504] mca: base: components_open: component bad has no register function [krel:05504] mca: base: components_open: component bad open function successful [krel:05504] 
mca: base: components_open: found loaded component basic [krel:05504] mca: base: components_open: component basic has no register function [krel:05504] mca: base: components_open: component basic open function successful [krel:05504] mca:base:select: Auto-selecting grpcomm components [krel:05504] mca:base:select:(grpcomm) Querying component [bad] [krel:05504] mca:base:select:(grpcomm) Query of component [bad] set priority to 10 [krel:05504] mca:base:select:(grpcomm) Querying component [basic] [krel:05504] mca:base:select:(grpcomm) Query of component [basic] set priority to 1 [krel:05504] mca:base:select:(grpcomm) Selected component [bad] [krel:05504] mca: base: close: component basic closed [krel:05504] mca: base: close: unloading component basic [krel:05504] [[60576,0],1] setting up session dir with tmpdir: UNDEF host krel [krel:05504] procdir: /home/don/tmp/openmpi-sessions-don@krel_0/60576/0/1 [krel:05504] jobdir: /home/don/tmp/openmpi-sessions-don@krel_0/60576/0 [krel:05504] top: openmpi-sessions-don@krel_0 [krel:05504] tmp: /home/don/tmp [krel:05504] progress: progress_set_yield_when_idle to false [krel:05504] progress: set_event_flag setting to 1 [krel:05504] rml_send [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 10, 10) [archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 10) [archimedes:29730] defining message event: ../../../../../orte/mca/plm/base/plm_base_launch_support.c 423 [archimedes:29730] [[60576,0],0] encode:nidmap non_contig_nodes - packing all names [archimedes:29730] [[60576,0],0] grpcomm:bad:xcast sent to job [60576,0] tag 1 [archimedes:29730] defining message event: ../../../../../../orte/mca/grpcomm/bad/grpcomm_bad_module.c 183 [archimedes:29730] progressed_wait: ../../../../../orte/mca/plm/base/plm_base_launch_support.c 754 [archimedes:29730] [[60576,0],0] orte:daemon:cmd:processor called by [[60576,0],0] for tag 1 [archimedes:29730] [[60576,0],0] decode:nidmap decoding nodemap [archimedes:29730] 
[[60576,0],0] decode:nidmap decoding 2 nodes with 0 already loaded [archimedes:29730] [[60576,0],0] node[0].name archimedes daemon 0 arch ffc91200 [archimedes:29730] [[60576,0],0] node[1].name krel daemon 1 arch ffc91200 [archimedes:29730] [[60576,0],0] rml:base:update:contact:info got uri 3969908736.0;tcp://138.23.140.43:40475 [archimedes:29730] [[60576,0],0] rml:base:update:contact:info got uri 3969908736.1;tcp://138.23.141.162:50413 [archimedes:29730] defining message event: ../../../../../orte/mca/odls/base/odls_base_default_fns.c 1350 [archimedes:29730] [[60576,0],0] orte:daemon:send_relay [archimedes:29730] [[60576,0],0] orte:daemon:send_relay sending relay msg to 1 [archimedes:29730] rml_send [[60576,0],0] -> [[60576,0],1] (router [[60576,0],1], tag 1, 1) [krel:05504] [[60576,0],1] recv from [[60576,0],0] for [[60576,0],1] (tag 1) [krel:05504] [[60576,0],1] orted_recv_cmd: received message from [[60576,0],0] [krel:05504] defining message event: ../../../orte/orted/orted_comm.c 159 [krel:05504] [[60576,0],1] orted_recv_cmd: reissued recv [krel:05504] [[60576,0],1] orte:daemon:cmd:processor called by [[60576,0],0] for tag 1 [krel:05504] [[60576,0],1] decode:nidmap decoding nodemap [krel:05504] [[60576,0],1] decode:nidmap decoding 2 nodes with 0 already loaded [krel:05504] [[60576,0],1] node[0].name archimedes daemon 0 arch ffc91200 [krel:05504] [[60576,0],1] node[1].name krel daemon 1 arch ffc91200 [krel:05504] [[60576,0],1] rml:base:update:contact:info got uri 3969908736.0;tcp://138.23.140.43:40475 [krel:05504] [[60576,0],1] rml:base:update:contact:info got uri 3969908736.1;tcp://138.23.141.162:50413 Starting Starting [archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 11) [krel:05504] [[60576,0],1] orte:daemon:send_relay [krel:05504] [[60576,0],1] orte:daemon:send_relay - recipient list is empty! 
[archimedes:29730] Info: Setting up debugger process table for applications MPIR_being_debugged = 0 MPIR_debug_state = 1 MPIR_partial_attach_ok = 1 MPIR_i_am_starter = 0 MPIR_proctable_size = 4 MPIR_proctable: (i, host, exe, pid) = (0, archimedes, /home/don/./mpi_test, 29820) (i, host, exe, pid) = (1, archimedes, /home/don/./mpi_test, 29821) (i, host, exe, pid) = (2, krel.int.donarmstrong.com, /home/don/./mpi_test, 5511) (i, host, exe, pid) = (3, krel.int.donarmstrong.com, /home/don/./mpi_test, 5512) [archimedes:29820] progress: initialized event flag to: 5 [archimedes:29820] progress: initialized yield_when_idle to: true [archimedes:29820] progress: initialized num users to: 0 [archimedes:29820] progress: initialized poll rate to: 20000000 [archimedes:29820] progress: event_users_increment setting count to 1 [archimedes:29821] progress: initialized event flag to: 5 [archimedes:29821] progress: initialized yield_when_idle to: true [archimedes:29821] progress: initialized num users to: 0 [archimedes:29821] progress: initialized poll rate to: 20000000 [archimedes:29821] progress: event_users_increment setting count to 1 [archimedes:29820] mca: base: components_open: Looking for rml components [archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2) [archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227 [krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2) [krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2) [archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2) [archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227 [archimedes:29820] mca: base: components_open: opening rml components [archimedes:29820] mca: base: components_open: found loaded component ftrm [archimedes:29820] mca: base: components_open: component ftrm has no register function 
[archimedes:29820] mca: base: components_open: component ftrm open function successful [archimedes:29820] mca: base: components_open: found loaded component oob [archimedes:29820] mca: base: components_open: component oob has no register function [archimedes:29820] mca: base: components_open: Looking for oob components Starting Starting [archimedes:29821] mca: base: components_open: Looking for rml components [archimedes:29820] mca: base: components_open: opening oob components [archimedes:29820] mca: base: components_open: found loaded component tcp [archimedes:29820] mca: base: components_open: component tcp has no register function [archimedes:29820] mca: base: components_open: component tcp open function successful [archimedes:29820] mca: base: components_open: component oob open function successful [archimedes:29820] orte_rml_base_select: initializing rml component ftrm [archimedes:29820] orte_rml_base_select: initializing rml component oob [archimedes:29821] mca: base: components_open: opening rml components [archimedes:29821] mca: base: components_open: found loaded component ftrm [archimedes:29821] mca: base: components_open: component ftrm has no register function [archimedes:29821] mca: base: components_open: component ftrm open function successful [archimedes:29820] orte_rml_base_select: module ftrm unloaded [archimedes:29821] mca: base: components_open: found loaded component oob [archimedes:29821] mca: base: components_open: component oob has no register function [archimedes:29821] mca: base: components_open: Looking for oob components [archimedes:29821] mca: base: components_open: opening oob components [archimedes:29821] mca: base: components_open: found loaded component tcp [archimedes:29821] mca: base: components_open: component tcp has no register function [archimedes:29821] mca: base: components_open: component tcp open function successful [archimedes:29821] mca: base: components_open: component oob open function successful [archimedes:29821] 
orte_rml_base_select: initializing rml component ftrm [archimedes:29821] orte_rml_base_select: initializing rml component oob [archimedes:29820] mca: base: components_open: Looking for grpcomm components [archimedes:29821] orte_rml_base_select: module ftrm unloaded [archimedes:29820] mca: base: components_open: opening grpcomm components [archimedes:29820] mca: base: components_open: found loaded component bad [archimedes:29820] mca: base: components_open: component bad has no register function [archimedes:29820] mca: base: components_open: component bad open function successful [archimedes:29820] mca: base: components_open: found loaded component basic [archimedes:29820] mca: base: components_open: component basic has no register function [archimedes:29820] mca: base: components_open: component basic open function successful [archimedes:29820] mca:base:select: Auto-selecting grpcomm components [archimedes:29820] mca:base:select:(grpcomm) Querying component [bad] [archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2) [archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227 [archimedes:29820] mca:base:select:(grpcomm) Query of component [bad] set priority to 10 [archimedes:29820] mca:base:select:(grpcomm) Querying component [basic] [archimedes:29820] mca:base:select:(grpcomm) Query of component [basic] set priority to 1 [archimedes:29820] mca:base:select:(grpcomm) Selected component [bad] [archimedes:29820] mca: base: close: component basic closed [archimedes:29820] mca: base: close: unloading component basic [krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2) [krel:05511] progress: initialized event flag to: 5 [krel:05511] progress: initialized yield_when_idle to: true [krel:05511] progress: initialized num users to: 0 [krel:05511] progress: initialized poll rate to: 33910000 [krel:05511] progress: event_users_increment setting count to 1 
[archimedes:29821] mca: base: components_open: Looking for grpcomm components [archimedes:29820] [[60576,1],0] setting up session dir with tmpdir: UNDEF host archimedes [archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2) [archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227 [archimedes:29821] mca: base: components_open: opening grpcomm components [archimedes:29821] mca: base: components_open: found loaded component bad [archimedes:29821] mca: base: components_open: component bad has no register function [archimedes:29821] mca: base: components_open: component bad open function successful [archimedes:29821] mca: base: components_open: found loaded component basic [archimedes:29821] mca: base: components_open: component basic has no register function [archimedes:29821] mca: base: components_open: component basic open function successful [archimedes:29821] mca:base:select: Auto-selecting grpcomm components [archimedes:29821] mca:base:select:(grpcomm) Querying component [bad] [archimedes:29821] mca:base:select:(grpcomm) Query of component [bad] set priority to 10 [archimedes:29821] mca:base:select:(grpcomm) Querying component [basic] [archimedes:29821] mca:base:select:(grpcomm) Query of component [basic] set priority to 1 [archimedes:29821] mca:base:select:(grpcomm) Selected component [bad] [archimedes:29821] mca: base: close: component basic closed [archimedes:29821] mca: base: close: unloading component basic [archimedes:29820] procdir: /home/don/tmp/openmpi-sessions-don@archimedes_0/60576/1/0 [archimedes:29820] jobdir: /home/don/tmp/openmpi-sessions-don@archimedes_0/60576/1 [archimedes:29820] top: openmpi-sessions-don@archimedes_0 [archimedes:29820] tmp: /home/don/tmp [krel:05512] progress: initialized event flag to: 5 [krel:05512] progress: initialized yield_when_idle to: true [krel:05512] progress: initialized num users to: 0 [krel:05512] progress: initialized poll rate to: 33910000 
[krel:05512] progress: event_users_increment setting count to 1 [archimedes:29820] rml_send [[60576,1],0] -> [[60576,0],0] (router [[60576,0],0], tag 1, 1) [archimedes:29730] [[60576,0],0] recv from [[60576,1],0] for [[60576,0],0] (tag 1) [archimedes:29730] [[60576,0],0] orted_recv_cmd: received message from [[60576,1],0] [archimedes:29730] defining message event: ../../../orte/orted/orted_comm.c 159 [archimedes:29730] [[60576,0],0] orted_recv_cmd: reissued recv [archimedes:29730] [[60576,0],0] orte:daemon:cmd:processor called by [[60576,1],0] for tag 1 [archimedes:29730] rml_send [[60576,0],0] -> [[60576,1],0] (router [[60576,1],0], tag 20, 20) [archimedes:29730] [[60576,0],0] orte:daemon:cmd:processor: processing commands completed [archimedes:29820] progressed_wait: ../../../../../orte/mca/routed/base/routed_base_register_sync.c 113 [archimedes:29820] [[60576,1],0] recv from [[60576,0],0] for [[60576,1],0] (tag 20) [archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2) [archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227 [krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2) [krel:05511] mca: base: components_open: Looking for rml components [archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2) [archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227 [archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2) [archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227 [archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2) [archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227 [krel:05511] mca: base: components_open: opening rml components [krel:05511] mca: base: components_open: found loaded component ftrm [krel:05511] mca: base: 
components_open: component ftrm has no register function [krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2) [krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2) [archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2) [archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227 [krel:05511] mca: base: components_open: component ftrm open function successful [krel:05511] mca: base: components_open: found loaded component oob [krel:05511] mca: base: components_open: component oob has no register function [archimedes:29821] [[60576,1],1] setting up session dir with tmpdir: UNDEF host archimedes [krel:05511] mca: base: components_open: Looking for oob components [krel:05512] mca: base: components_open: Looking for rml components [archimedes:29820] [[60576,1],0] decode:nidmap decoding nodemap [archimedes:29820] [[60576,1],0] decode:nidmap decoding 2 nodes with 0 already loaded [archimedes:29820] [[60576,1],0] node[0].name archimedes daemon 0 arch ffc91200 [archimedes:29820] [[60576,1],0] node[1].name krel daemon 1 arch ffc91200 [archimedes:29821] procdir: /home/don/tmp/openmpi-sessions-don@archimedes_0/60576/1/1 [archimedes:29821] jobdir: /home/don/tmp/openmpi-sessions-don@archimedes_0/60576/1 [archimedes:29821] top: openmpi-sessions-don@archimedes_0 [archimedes:29821] tmp: /home/don/tmp [archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2) [archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227 [archimedes:29821] rml_send [[60576,1],1] -> [[60576,0],0] (router [[60576,0],0], tag 1, 1) [krel:05511] mca: base: components_open: opening oob components [krel:05511] mca: base: components_open: found loaded component tcp [krel:05511] mca: base: components_open: component tcp has no register function [archimedes:29730] [[60576,0],0] recv from 
[[60576,0],1] for [[60576,0],0] (tag 2) [archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227 [archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2) [archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227 [krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0[archimedes:29730] [[60576,0],0] orted_recv_cmd: received message from [[60576,1],1] [archimedes:29730] defining message event: ../../../orte/orted/orted_comm.c 159 [archimedes:29730] [[60576,0],0] orted_recv_cmd: reissued recv [archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2) [archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227 [archimedes:29821] progressed_wait: ../../../../../orte/mca/routed/base/routed_base_register_sync.c 113 [krel:05512] mca: base: components_open: opening rml components [krel:05512] mca: base: components_open: found loaded component ftrm [krel:05512] mca: base: components_open: component ftrm has no register function [archimedes:29730] [[60576,0],0] orte:daemon:cmd:processor called by [[60576,1],1] for tag 1 [archimedes:29730] rml_send [[60576,0],0] -> [[60576,1],1] (router [[60576,1],1], tag 20, 20) [archimedes:29730] defining message event: ../../../../../orte/mca/odls/base/odls_base_default_fns.c 1738 [archimedes:29730] [[60576,0],0] orte:daemon:cmd:processor: processing commands completed [krel:05511] mca: base: components_open: component tcp open function successful [krel:05511] mca: base: components_open: component oob open function successful [krel:05511] orte_rml_base_select: initializing rml component ftrm [krel:05511] orte_rml_base_select: initializing rml component oob [archimedes:29821] [[60576,1],1] recv from [[60576,0],0] for [[60576,1],1] (tag 20) [archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2) [archimedes:29730] 
defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227 [krel:05512] mca: base: components_open: component ftrm open function successful [krel:05512] mca: base: components_open: found loaded component oob [krel:05512] mca: base: components_open: component oob has no register function [krel:05512] mca: base: components_open: Looking for oob components [krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2) [krel:05511] orte_rml_base_select: module ftrm unloaded [krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2) [archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2) [archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227 [krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2) [krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2) [archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2) [archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227 [archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2) [archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227 [krel:05512] mca: base: components_open: opening oob components [krel:05512] mca: base: components_open: found loaded component tcp [krel:05512] mca: base: components_open: component tcp has no register function [krel:05512] mca: base: components_open: component tcp open function successful [krel:05512] mca: base: components_open: component oob open function successful [krel:05512] orte_rml_base_select: initializing rml component ftrm [krel:05512] orte_rml_base_select: initializing rml component oob [archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2) [archimedes:29730] defining message event: 
../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227 [krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2) [krel:05512] orte_rml_base_select: module ftrm unloaded [archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2) [archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227 [krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2) [archimedes:29821] [[60576,1],1] decode:nidmap decoding nodemap [archimedes:29821] [[60576,1],1] decode:nidmap decoding 2 nodes with 0 already loaded [archimedes:29821] [[60576,1],1] node[0].name archimedes daemon 0 arch ffc91200 [archimedes:29821] [[60576,1],1] node[1].name krel daemon 1 arch ffc91200 [krel:05511] mca: base: components_open: Looking for grpcomm components [krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2) [archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2) [archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227 [krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2) [archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2) [archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227 [archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2) [archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227 [krel:05511] mca: base: components_open: opening grpcomm components [krel:05511] mca: base: components_open: found loaded component bad [krel:05511] mca: base: components_open: component bad has no register function [krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2) [archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2) 
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227 [krel:05511] mca: base: components_open: component bad open function successful [krel:05511] mca: base: components_open: found loaded component basic [krel:05511] mca: base: components_open: component basic has no register function [krel:05511] mca: base: components_open: component basic open function successful [krel:05511] mca:base:select: Auto-selecting grpcomm components [krel:05511] mca:base:select:(grpcomm) Querying component [bad] [krel:05511] mca:base:select:(grpcomm) Query of component [bad] set priority to 10 [krel:05512] mca: base: components_open: Looking for grpcomm components [krel:05511] mca:base:select:(grpcomm) Querying component [basic] [krel:05511] mca:base:select:(grpcomm) Query of component [basic] set priority to 1 [krel:05511] mca:base:select:(grpcomm) Selected component [bad] [krel:05511] mca: base: close: component basic closed [krel:05511] mca: base: close: unloading component basic [archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2) [archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227 [archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2) [archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227 [krel:05512] mca: base: components_open: opening grpcomm components [krel:05512] mca: base: components_open: found loaded component bad [krel:05512] mca: base: components_open: component bad has no register function [krel:05512] mca: base: components_open: component bad open function successful [krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2) [krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2) [krel:05512] mca: base: components_open: found loaded component basic [krel:05512] mca: base: components_open: component 
basic has no register function [krel:05512] mca: base: components_open: component basic open function successful [krel:05512] mca:base:select: Auto-selecting grpcomm components [krel:05512] mca:base:select:(grpcomm) Querying component [bad] [krel:05512] mca:base:select:(grpcomm) Query of component [bad] set priority to 10 [krel:05512] mca:base:select:(grpcomm) Querying component [basic] [krel:05512] mca:base:select:(grpcomm) Query of component [basic] set priority to 1 [krel:05512] mca:base:select:(grpcomm) Selected component [bad] [krel:05512] mca: base: close: component basic closed [krel:05512] mca: base: close: unloading component basic [archimedes:29820] mca: base: components_open: Looking for btl components [archimedes:29820] mca: base: components_open: opening btl components [archimedes:29820] mca: base: components_open: found loaded component ofud [archimedes:29820] mca: base: components_open: component ofud has no register function [archimedes:29820] mca: base: components_open: component ofud open function successful [archimedes:29820] mca: base: components_open: found loaded component openib [archimedes:29820] mca: base: components_open: component openib has no register function [archimedes:29820] mca: base: components_open: component openib open function successful [archimedes:29820] mca: base: components_open: found loaded component self [archimedes:29820] mca: base: components_open: component self has no register function [archimedes:29820] mca: base: components_open: component self open function successful [archimedes:29820] mca: base: components_open: found loaded component sm [archimedes:29820] mca: base: components_open: component sm has no register function [krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2) [archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2) [archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227 
[archimedes:29820] mca: base: components_open: component sm open function successful
[archimedes:29820] mca: base: components_open: found loaded component tcp
[archimedes:29820] mca: base: components_open: component tcp has no register function
[krel:05511] [[60576,1],2] setting up session dir with tmpdir: UNDEF host krel
[archimedes:29820] mca: base: components_open: component tcp open function successful
[krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2)
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[archimedes:29821] mca: base: components_open: Looking for btl components
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[krel:05511] procdir: /home/don/tmp/openmpi-sessions-don@krel_0/60576/1/2
[krel:05511] jobdir: /home/don/tmp/openmpi-sessions-don@krel_0/60576/1
[krel:05511] top: openmpi-sessions-don@krel_0
[krel:05511] tmp: /home/don/tmp
[krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2)
[krel:05511] rml_send [[60576,1],2] -> [[60576,0],1] (router [[60576,0],1], tag 1, 1)
[krel:05504] [[60576,0],1] recv from [[60576,1],2] for [[60576,0],1] (tag 1)
[krel:05504] [[60576,0],1] orted_recv_cmd: received message from [[60576,1],2]
[krel:05504] defining message event: ../../../orte/orted/orted_comm.c 159
[krel:05504] [[60576,0],1] orted_recv_cmd: reissued recv
[krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2)
[krel:05504] [[60576,0],1] orte:daemon:cmd:processor called by [[60576,1],2] for tag 1
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[krel:05511] progressed_wait: ../../../../../orte/mca/routed/base/routed_base_register_sync.c 113
[krel:05504] rml_send [[60576,0],1] -> [[60576,1],2] (router [[60576,1],2], tag 20, 20)
[krel:05504] [[60576,0],1] orte:daemon:cmd:processor: processing commands completed
[krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2)
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[krel:05511] [[60576,1],2] recv from [[60576,0],1] for [[60576,1],2] (tag 20)
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[archimedes:29821] mca: base: components_open: opening btl components
[archimedes:29821] mca: base: components_open: found loaded component ofud
[archimedes:29821] mca: base: components_open: component ofud has no register function
[krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2)
[archimedes:29821] mca: base: components_open: component ofud open function successful
[archimedes:29821] mca: base: components_open: found loaded component openib
[archimedes:29821] mca: base: components_open: component openib has no register function
[krel:05512] [[60576,1],3] setting up session dir with tmpdir: UNDEF host krel
[krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2)
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[archimedes:29821] mca: base: components_open: component openib open function successful
[archimedes:29821] mca: base: components_open: found loaded component self
[archimedes:29821] mca: base: components_open: component self has no register function
[krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[krel:05512] procdir: /home/don/tmp/openmpi-sessions-don@krel_0/60576/1/3
[krel:05512] jobdir: /home/don/tmp/openmpi-sessions-don@krel_0/60576/1
[krel:05512] top: openmpi-sessions-don@krel_0
[krel:05512] tmp: /home/don/tmp
[archimedes:29821] mca: base: components_open: component self open function successful
[archimedes:29821] mca: base: components_open: found loaded component sm
[krel:05512] rml_send [[60576,1],3] -> [[60576,0],1] (router [[60576,0],1], tag 1, 1)
[archimedes:29821] mca: base: components_open: component sm has no register function
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2)
[archimedes:29821] mca: base: components_open: component sm open function successful
[archimedes:29821] mca: base: components_open: found loaded component tcp
[archimedes:29821] mca: base: components_open: component tcp has no register function
[krel:05504] [[60576,0],1] recv from [[60576,1],3] for [[60576,0],1] (tag 1)
[krel:05504] [[60576,0],1] orted_recv_cmd: received message from [[60576,1],3]
[krel:05504] defining message event: ../../../orte/orted/orted_comm.c 159
[krel:05504] [[60576,0],1] orted_recv_cmd: reissued recv
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2)
[krel:05511] [[60576,1],2] decode:nidmap decoding nodemap
[krel:05511] [[60576,1],2] decode:nidmap decoding 2 nodes with 0 already loaded
[krel:05511] [[60576,1],2] node[0].name archimedes daemon 0 arch ffc91200
[krel:05511] [[60576,1],2] node[1].name krel daemon 1 arch ffc91200
[krel:05512] progressed_wait: ../../../../../orte/mca/routed/base/routed_base_register_sync.c 113
[krel:05504] [[60576,0],1] orte:daemon:cmd:processor called by [[60576,1],3] for tag 1
[krel:05504] rml_send [[60576,0],1] -> [[60576,1],3] (router [[60576,1],3], tag 20, 20)
[krel:05504] rml_send [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 18, 18)
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 18)
[archimedes:29730] defining message event: ../../../../../orte/mca/routed/base/routed_base_receive.c 153
[archimedes:29821] mca: base: components_open: component tcp open function successful
[krel:05504] [[60576,0],1] orte:daemon:cmd:processor: processing commands completed
[krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2)
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[archimedes:29820] select: initializing btl component ofud
[archimedes:29820] select: init of component ofud returned failure
[archimedes:29820] select: module ofud unloaded
[archimedes:29820] select: initializing btl component openib
[archimedes:29820] select: init of component openib returned failure
[archimedes:29820] select: module openib unloaded
[archimedes:29820] select: initializing btl component self
[archimedes:29820] select: init of component self returned success
[archimedes:29820] select: initializing btl component sm
[archimedes:29820] select: init of component sm returned success
[archimedes:29820] select: initializing btl component tcp
[krel:05512] [[60576,1],3] recv from [[60576,0],1] for [[60576,1],3] (tag 20)
[archimedes:29820] [[60576,1],0] grpcomm:set_proc_attr: setting attribute btl.tcp.1.4 data size 48
[archimedes:29820] select: init of component tcp returned success
[archimedes:29820] [[60576,1],0] grpcomm:set_proc_attr: setting attribute pml.base.2.0 data size 4
[archimedes:29820] [[60576,1],0] grpcomm:bad: modex entered
[archimedes:29820] [[60576,1],0] grpcomm:base:pack_modex: reporting 2 entries
[archimedes:29820] [[60576,1],0] grpcomm:bad:modex: executing allgather
[archimedes:29820] [[60576,1],0] grpcomm:bad entering allgather
[archimedes:29820] rml_send [[60576,1],0] -> [[60576,0],0] (router [[60576,0],0], tag 1, 1)
[archimedes:29730] [[60576,0],0] recv from [[60576,1],0] for [[60576,0],0] (tag 1)
[archimedes:29730] [[60576,0],0] orted_recv_cmd: received message from [[60576,1],0]
[archimedes:29730] defining message event: ../../../orte/orted/orted_comm.c 159
[archimedes:29730] [[60576,0],0] orted_recv_cmd: reissued recv
[archimedes:29730] [[60576,0],0] orte:daemon:cmd:processor called by [[60576,1],0] for tag 1
[archimedes:29730] [[60576,0],0] orte:daemon:cmd:processor: processing commands completed
[krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2)
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[archimedes:29820] [[60576,1],0] grpcomm:bad allgather buffer sent
[archimedes:29820] progressed_wait: ../../../../../../orte/mca/grpcomm/bad/grpcomm_bad_module.c 394
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2)
[krel:05512] [[60576,1],3] decode:nidmap decoding nodemap
[krel:05512] [[60576,1],3] decode:nidmap decoding 2 nodes with 0 already loaded
[krel:05512] [[60576,1],3] node[0].name archimedes daemon 0 arch ffc91200
[krel:05512] [[60576,1],3] node[1].name krel daemon 1 arch ffc91200
[archimedes:29821] select: initializing btl component ofud
[archimedes:29821] select: init of component ofud returned failure
[archimedes:29821] select: module ofud unloaded
[archimedes:29821] select: initializing btl component openib
[archimedes:29821] select: init of component openib returned failure
[archimedes:29821] select: module openib unloaded
[archimedes:29821] select: initializing btl component self
[archimedes:29821] select: init of component self returned success
[archimedes:29821] select: initializing btl component sm
[archimedes:29821] select: init of component sm returned success
[archimedes:29821] select: initializing btl component tcp
[archimedes:29821] [[60576,1],1] grpcomm:set_proc_attr: setting attribute btl.tcp.1.4 data size 48
[archimedes:29821] select: init of component tcp returned success
[archimedes:29821] [[60576,1],1] grpcomm:bad: modex entered
[archimedes:29821] [[60576,1],1] grpcomm:base:pack_modex: reporting 1 entries
[archimedes:29821] [[60576,1],1] grpcomm:bad:modex: executing allgather
[archimedes:29821] [[60576,1],1] grpcomm:bad entering allgather
[archimedes:29821] rml_send [[60576,1],1] -> [[60576,0],0] (router [[60576,0],0], tag 1, 1)
[krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[archimedes:29730] [[60576,0],0] recv from [[60576,1],1] for [[60576,0],0] (tag 1)
[archimedes:29730] [[60576,0],0] orted_recv_cmd: received message from [[60576,1],1]
[archimedes:29730] defining message event: ../../../orte/orted/orted_comm.c 159
[archimedes:29730] [[60576,0],0] orted_recv_cmd: reissued recv
[archimedes:29730] [[60576,0],0] orte:daemon:cmd:processor called by [[60576,1],1] for tag 1
[archimedes:29730] [[60576,0],0] orte:daemon:cmd:processor: processing commands completed
[archimedes:29821] [[60576,1],1] grpcomm:bad allgather buffer sent
[archimedes:29821] progressed_wait: ../../../../../../orte/mca/grpcomm/bad/grpcomm_bad_module.c 394
[krel:05511] mca: base: components_open: Looking for btl components
[krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[krel:05511] mca: base: components_open: opening btl components
[krel:05511] mca: base: components_open: found loaded component ofud
[krel:05511] mca: base: components_open: component ofud has no register function
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[krel:05511] mca: base: components_open: component ofud open function successful
[krel:05511] mca: base: components_open: found loaded component openib
[krel:05511] mca: base: components_open: component openib has no register function
[krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2)
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2)
[krel:05512] mca: base: components_open: Looking for btl components
[krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2)
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2)
[krel:05511] mca: base: components_open: component openib open function successful
[krel:05511] mca: base: components_open: found loaded component self
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[krel:05511] mca: base: components_open: component self has no register function
[krel:05511] mca: base: components_open: component self open function successful
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[krel:05511] mca: base: components_open: found loaded component sm
[krel:05511] mca: base: components_open: component sm has no register function
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2)
[krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2)
[krel:05511] mca: base: components_open: component sm open function successful
[krel:05511] mca: base: components_open: found loaded component tcp
[krel:05511] mca: base: components_open: component tcp has no register function
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2)
[krel:05512] mca: base: components_open: opening btl components
[krel:05512] mca: base: components_open: found loaded component ofud
[krel:05512] mca: base: components_open: component ofud has no register function
[krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2)
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[krel:05512] mca: base: components_open: component ofud open function successful
[krel:05512] mca: base: components_open: found loaded component openib
[krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2)
[krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2)
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[krel:05511] mca: base: components_open: component tcp open function successful
[krel:05512] mca: base: components_open: component openib has no register function
[krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2)
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2)
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[krel:05512] mca: base: components_open: component openib open function successful
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[krel:05512] mca: base: components_open: found loaded component self
[krel:05512] mca: base: components_open: component self has no register function
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2)
[krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2)
[krel:05512] mca: base: components_open: found loaded component sm
[krel:05512] mca: base: components_open: component sm has no register function
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2)
[krel:05512] mca: base: components_open: component sm open function successful
[krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2)
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[krel:05512] mca: base: components_open: found loaded component tcp
[krel:05512] mca: base: components_open: component tcp has no register function
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2)
[krel:05512] mca: base: components_open: component tcp open function successful
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[krel:05511] select: initializing btl component ofud
[krel:05511] select: init of component ofud returned failure
[krel:05511] select: module ofud unloaded
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2)
[krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2)
[krel:05511] select: initializing btl component openib
[krel:05511] select: init of component openib returned failure
[krel:05511] select: module openib unloaded
[krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2)
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[krel:05511] select: initializing btl component self
[krel:05511] select: init of component self returned success
[krel:05511] select: initializing btl component sm
[krel:05511] select: init of component sm returned success
[krel:05511] select: initializing btl component tcp
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[krel:05511] [[60576,1],2] grpcomm:set_proc_attr: setting attribute btl.tcp.1.4 data size 48
[krel:05511] select: init of component tcp returned success
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[krel:05511] [[60576,1],2] grpcomm:bad: modex entered
[krel:05504] [[60576,0],1] recv from [[60576,1],2] for [[60576,0],1] (tag 1)
[krel:05504] [[60576,0],1] orted_recv_cmd: received message from [[60576,1],2]
[krel:05504] defining message event: ../../../orte/orted/orted_comm.c 159
[krel:05504] [[60576,0],1] orted_recv_cmd: reissued recv
[krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2)
[krel:05504] [[60576,0],1] orte:daemon:cmd:processor called by [[60576,1],2] for tag 1
[krel:05504] [[60576,0],1] orte:daemon:cmd:processor: processing commands completed
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[krel:05511] [[60576,1],2] grpcomm:base:pack_modex: reporting 1 entries
[krel:05511] [[60576,1],2] grpcomm:bad:modex: executing allgather
[krel:05511] [[60576,1],2] grpcomm:bad entering allgather
[krel:05511] rml_send [[60576,1],2] -> [[60576,0],1] (router [[60576,0],1], tag 1, 1)
[krel:05511] [[60576,1],2] grpcomm:bad allgather buffer sent
[krel:05511] progressed_wait: ../../../../../../orte/mca/grpcomm/bad/grpcomm_bad_module.c 394
[krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2)
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[krel:05512] select: initializing btl component ofud
[krel:05512] select: init of component ofud returned failure
[krel:05512] select: module ofud unloaded
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[krel:05512] select: initializing btl component openib
[krel:05512] select: init of component openib returned failure
[krel:05512] select: module openib unloaded
[krel:05512] select: initializing btl component self
[krel:05512] select: init of component self returned success
[krel:05512] select: initializing btl component sm
[krel:05512] select: init of component sm returned success
[krel:05512] select: initializing btl component tcp
[krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2)
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[krel:05512] [[60576,1],3] grpcomm:set_proc_attr: setting attribute btl.tcp.1.4 data size 48
[krel:05512] select: init of component tcp returned success
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2)
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[krel:05512] [[60576,1],3] grpcomm:bad: modex entered
[krel:05512] [[60576,1],3] grpcomm:base:pack_modex: reporting 1 entries
[krel:05512] [[60576,1],3] grpcomm:bad:modex: executing allgather
[krel:05512] [[60576,1],3] grpcomm:bad entering allgather
[krel:05512] rml_send [[60576,1],3] -> [[60576,0],1] (router [[60576,0],1], tag 1, 1)
[krel:05504] [[60576,0],1] recv from [[60576,1],3] for [[60576,0],1] (tag 1)
[krel:05504] [[60576,0],1] orted_recv_cmd: received message from [[60576,1],3]
[krel:05504] defining message event: ../../../orte/orted/orted_comm.c 159
[krel:05504] [[60576,0],1] orted_recv_cmd: reissued recv
[krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2)
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[krel:05504] [[60576,0],1] orte:daemon:cmd:processor called by [[60576,1],3] for tag 1
[krel:05512] [[60576,1],3] grpcomm:bad allgather buffer sent
[krel:05512] progressed_wait: ../../../../../../orte/mca/grpcomm/bad/grpcomm_bad_module.c 394
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 1)
[archimedes:29730] [[60576,0],0] orted_recv_cmd: received message from [[60576,0],1]
[archimedes:29730] defining message event: ../../../orte/orted/orted_comm.c 159
[archimedes:29730] [[60576,0],0] orted_recv_cmd: reissued recv
[archimedes:29730] [[60576,0],0] orte:daemon:cmd:processor called by [[60576,0],1] for tag 1
[archimedes:29730] [[60576,0],0] grpcomm:bad:xcast sent to job [60576,1] tag 15
[archimedes:29730] defining message event: ../../../../../../orte/mca/grpcomm/bad/grpcomm_bad_module.c 183
[archimedes:29730] [[60576,0],0] orte:daemon:cmd:processor: processing commands completed
[archimedes:29730] [[60576,0],0] orte:daemon:cmd:processor called by [[60576,0],0] for tag 1
[archimedes:29730] [[60576,0],0] orted:comm:message_local_procs delivering message to job [60576,1] tag 15
[archimedes:29730] rml_send [[60576,0],0] -> [[60576,1],0] (router [[60576,1],0], tag 15, 15)
[archimedes:29730] rml_send [[60576,0],0] -> [[60576,1],1] (router [[60576,1],1], tag 15, 15)
[archimedes:29730] [[60576,0],0] orte:daemon:send_relay
[archimedes:29730] [[60576,0],0] orte:daemon:send_relay sending relay msg to 1
[archimedes:29730] rml_send [[60576,0],0] -> [[60576,0],1] (router [[60576,0],1], tag 1, 1)
[archimedes:29821] [[60576,1],1] recv from [[60576,0],0] for [[60576,1],1] (tag 15)
[archimedes:29821] [[60576,1],1] allgather buffer received
[archimedes:29821] [[60576,1],1] grpcomm:bad allgather completed
[archimedes:29821] [[60576,1],1] grpcomm:bad:modex: processing modex info
[archimedes:29821] [[60576,1],1] grpcomm:bad:modex: received 560 data bytes from 4 procs
[archimedes:29821] [[60576,1],1] grpcomm:bad: modex completed
[archimedes:29820] [[60576,1],0] recv from [[60576,0],0] for [[60576,1],0] (tag 15)
[archimedes:29820] [[60576,1],0] allgather buffer received
[archimedes:29820] [[60576,1],0] grpcomm:bad allgather completed
[archimedes:29820] [[60576,1],0] grpcomm:bad:modex: processing modex info
[archimedes:29820] [[60576,1],0] grpcomm:bad:modex: received 560 data bytes from 4 procs
[krel:05504] [[60576,0],1] recv from [[60576,0],0] for [[60576,0],1] (tag 1)
[krel:05504] [[60576,0],1] orted_recv_cmd: received message from [[60576,0],0]
[krel:05504] defining message event: ../../../orte/orted/orted_comm.c 159
[krel:05504] [[60576,0],1] orted_recv_cmd: reissued recv
[krel:05504] [[60576,0],1] orte:daemon:cmd:processor called by [[60576,0],0] for tag 1
[krel:05504] [[60576,0],1] orted:comm:message_local_procs delivering message to job [60576,1] tag 15
[krel:05504] rml_send [[60576,0],1] -> [[60576,1],2] (router [[60576,1],2], tag 15, 15)
[krel:05504] rml_send [[60576,0],1] -> [[60576,1],3] (router [[60576,1],3], tag 15, 15)
[krel:05504] [[60576,0],1] orte:daemon:send_relay
[krel:05504] [[60576,0],1] orte:daemon:send_relay - recipient list is empty!
[krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2)
[krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2)
[krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2)
[krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2)
[krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2)
[krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2)
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[archimedes:29730] [[60576,0],0] routing message from [[60576,1],0] for [[60576,1],1] to [[60576,1],1] (tag: 106)
[krel:05504] [[60576,0],1] routing message from [[60576,1],2] for [[60576,1],3] to [[60576,1],3] (tag: 106)
[krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2)
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[archimedes:29730] [[60576,0],0] routing message from [[60576,1],0] for [[60576,1],1] to [[60576,1],1] (tag: 106)
[archimedes:29821] progress: event_users_increment setting count to 2
[archimedes:29820] [[60576,1],0] grpcomm:bad: modex completed
[archimedes:29820] progress: event_users_increment setting count to 2
[archimedes:29820] rml_send [[60576,1],0] -> [[60576,1],1] (router [[60576,0],0], tag 106, 14)
[archimedes:29820] progress: event_users_decrement setting count to 1
[archimedes:29820] progress: event_users_increment setting count to 2
[archimedes:29820] rml_send [[60576,1],0] -> [[60576,1],1] (router [[60576,0],0], tag 106, 14)
[archimedes:29820] progress: event_users_decrement setting count to 1
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[krel:05511] [[60576,1],2] recv from [[60576,0],1] for [[60576,1],2] (tag 15)
[krel:05511] [[60576,1],2] allgather buffer received
[archimedes:29821] [[60576,1],1] recv from [[60576,1],0] for [[60576,1],1] (tag 106)
[archimedes:29821] progress: event_users_decrement setting count to 1
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[krel:05512] [[60576,1],3] recv from [[60576,0],1] for [[60576,1],3] (tag 15)
[krel:05512] [[60576,1],3] allgather buffer received
[krel:05512] [[60576,1],3] grpcomm:bad allgather completed
[krel:05512] [[60576,1],3] grpcomm:bad:modex: processing modex info
[krel:05512] [[60576,1],3] grpcomm:bad:modex: received 560 data bytes from 4 procs
[krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2)
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[krel:05511] [[60576,1],2] grpcomm:bad allgather completed
[krel:05511] [[60576,1],2] grpcomm:bad:modex: processing modex info
[krel:05511] [[60576,1],2] grpcomm:bad:modex: received 560 data bytes from 4 procs
[krel:05511] [[60576,1],2] grpcomm:bad: modex completed
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[krel:05512] [[60576,1],3] grpcomm:bad: modex completed
[archimedes:29821] progress: event_users_increment setting count to 2
[archimedes:29821] [[60576,1],1] recv from [[60576,1],0] for [[60576,1],1] (tag 106)
[archimedes:29821] progress: event_users_decrement setting count to 1
[krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2)
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[krel:05512] progress: event_users_increment setting count to 2
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[krel:05511] progress: event_users_increment setting count to 2
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[krel:05511] rml_send [[60576,1],2] -> [[60576,1],3] (router [[60576,0],1], tag 106, 14)
[krel:05511] progress: event_users_decrement setting count to 1
[krel:05511] progress: event_users_increment setting count to 2
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[krel:05512] [[60576,1],3] recv from [[60576,1],2] for [[60576,1],3] (tag 106)
[krel:05512] progress: event_users_decrement setting count to 1
[krel:05511] rml_send [[60576,1],2] -> [[60576,1],3] (router [[60576,0],1], tag 106, 14)
[krel:05511] progress: event_users_decrement setting count to 1
[krel:05512] progress: event_users_increment setting count to 2
[krel:05512] [[60576,1],3] recv from [[60576,1],2] for [[60576,1],3] (tag 106)
[krel:05512] progress: event_users_decrement setting count to 1
[archimedes:29821] progress: event_users_increment setting count to 2
[archimedes:29821] progress: event_users_increment setting count to 3
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2)
[krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2)
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[krel:05511] progress: event_users_increment setting count to 2
[archimedes:29821] [[60576,1],1] grpcomm:bad entering barrier
[archimedes:29821] rml_send [[60576,1],1] -> [[60576,0],0] (router [[60576,0],0], tag 1, 1)
[archimedes:29821] [[60576,1],1] grpcomm:bad barrier sent
[krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2)
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[archimedes:29730] [[60576,0],0] recv from [[60576,1],1] for [[60576,0],0] (tag 1)
[archimedes:29730] [[60576,0],0] orted_recv_cmd: received message from [[60576,1],1]
[archimedes:29730] defining message event: ../../../orte/orted/orted_comm.c 159
[archimedes:29730] [[60576,0],0] orted_recv_cmd: reissued recv
[krel:05512] progress: event_users_increment setting count to 2
[archimedes:29821] progressed_wait: ../../../../../../orte/mca/grpcomm/bad/grpcomm_bad_module.c 270
[krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2)
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[krel:05511] progress: event_users_increment setting count to 3
[archimedes:29730] [[60576,0],0] orte:daemon:cmd:processor called by [[60576,1],1] for tag 1
[archimedes:29730] [[60576,0],0] orte:daemon:cmd:processor: processing commands completed
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[krel:05512] progress: event_users_increment setting count to 3
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[krel:05511] progress: event_users_increment setting count to 4
[krel:05511] progress: event_users_decrement setting count to 3
[krel:05512] progress: event_users_increment setting count to 4
[krel:05512] progress: event_users_decrement setting count to 3
[krel:05504] [[60576,0],1] orted_recv_cmd: reissued recv
[krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2)
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[krel:05511] [[60576,1],2] grpcomm:bad entering barrier
[krel:05511] rml_send [[60576,1],2] -> [[60576,0],1] (router [[60576,0],1], tag 1, 1)
[krel:05511] [[60576,1],2] grpcomm:bad barrier sent
[krel:05511] progressed_wait: ../../../../../../orte/mca/grpcomm/bad/grpcomm_bad_module.c 270
[krel:05504] [[60576,0],1] orte:daemon:cmd:processor called by [[60576,1],3] for tag 1
[krel:05504] [[60576,0],1] orte:daemon:cmd:processor: processing commands completed
[krel:05512] [[60576,1],3] grpcomm:bad entering barrier
[krel:05512] rml_send [[60576,1],3] -> [[60576,0],1] (router [[60576,0],1], tag 1, 1)
[krel:05512] [[60576,1],3] grpcomm:bad barrier sent
[krel:05512] progressed_wait: ../../../../../../orte/mca/grpcomm/bad/grpcomm_bad_module.c 270
[krel:05504] [[60576,0],1] recv from [[60576,1],2] for [[60576,0],1] (tag 1)
[krel:05504] [[60576,0],1] orted_recv_cmd: received message from [[60576,1],2]
[krel:05504] defining message event: ../../../orte/orted/orted_comm.c 159
[krel:05504] [[60576,0],1] orted_recv_cmd: reissued recv
[krel:05504] [[60576,0],1] orte:daemon:cmd:processor called by [[60576,1],2] for tag 1
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 1)
[archimedes:29730] [[60576,0],0] orted_recv_cmd: received message from [[60576,0],1]
[archimedes:29730] defining message event: ../../../orte/orted/orted_comm.c 159
[archimedes:29730] [[60576,0],0] orted_recv_cmd: reissued recv
[krel:05504] rml_send [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 1, 1)
[krel:05504] [[60576,0],1] orte:daemon:cmd:processor: processing commands completed
[archimedes:29730] [[60576,0],0] orte:daemon:cmd:processor called by [[60576,0],1] for tag 1
[archimedes:29730] [[60576,0],0] orte:daemon:cmd:processor: processing commands completed
[archimedes:29820] progress: event_users_increment setting count to 2
[archimedes:29820] progress: event_users_increment setting count to 3
[archimedes:29820] progress: event_users_increment setting count to 4
[archimedes:29820] progress: event_users_decrement setting count to 3
[archimedes:29820] [[60576,1],0] grpcomm:bad entering barrier
[archimedes:29820] rml_send [[60576,1],0] -> [[60576,0],0] (router [[60576,0],0], tag 1, 1)
[archimedes:29730] [[60576,0],0] recv from [[60576,1],0] for [[60576,0],0] (tag 1)
[archimedes:29730] [[60576,0],0] orted_recv_cmd: received message from [[60576,1],0]
[archimedes:29730] defining message event: ../../../orte/orted/orted_comm.c 159
[archimedes:29730] [[60576,0],0] orted_recv_cmd: reissued recv
[archimedes:29730] [[60576,0],0] orte:daemon:cmd:processor called by [[60576,1],0] for tag 1
[archimedes:29730] [[60576,0],0] grpcomm:bad:xcast sent to job [60576,1] tag 17
[archimedes:29730] defining message event: ../../../../../../orte/mca/grpcomm/bad/grpcomm_bad_module.c 183
[archimedes:29730] [[60576,0],0] orte:daemon:cmd:processor: processing commands completed
[archimedes:29820] [[60576,1],0] grpcomm:bad barrier sent
[archimedes:29820] progressed_wait: ../../../../../../orte/mca/grpcomm/bad/grpcomm_bad_module.c 270
[archimedes:29730] [[60576,0],0] orte:daemon:cmd:processor called by [[60576,0],0] for tag 1
[archimedes:29730] [[60576,0],0] orted:comm:message_local_procs delivering message to job [60576,1] tag 17
[archimedes:29730] rml_send [[60576,0],0] -> [[60576,1],0] (router [[60576,1],0], tag 17, 17)
[archimedes:29730] rml_send [[60576,0],0] -> [[60576,1],1] (router [[60576,1],1], tag 17, 17)
[archimedes:29730] [[60576,0],0] orte:daemon:send_relay
[archimedes:29730] [[60576,0],0] orte:daemon:send_relay sending relay msg to 1
[archimedes:29730] rml_send [[60576,0],0] -> [[60576,0],1] (router [[60576,0],1], tag 1, 1)
[archimedes:29820] [[60576,1],0] recv from [[60576,0],0] for [[60576,1],0] (tag 17)
[archimedes:29820] [[60576,1],0] grpcomm:bad received barrier release
[archimedes:29820] progress: set_event_flag setting to 2
[archimedes:29821] [[60576,1],1] recv from [[60576,0],0] for [[60576,1],1] (tag 17)
[archimedes:29821] [[60576,1],1] grpcomm:bad received barrier release
[archimedes:29821] progress: set_event_flag setting to 2
[krel:05504] [[60576,0],1] recv from [[60576,0],0] for [[60576,0],1] (tag 1)
[krel:05504] [[60576,0],1] orted_recv_cmd: received message from [[60576,0],0]
[krel:05504] defining message event: ../../../orte/orted/orted_comm.c 159
[krel:05504] [[60576,0],1] orted_recv_cmd: reissued recv
[krel:05504] [[60576,0],1] orte:daemon:cmd:processor called by [[60576,0],0] for tag 1
[krel:05504] [[60576,0],1] orted:comm:message_local_procs delivering message to job [60576,1] tag 17
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[krel:05511] [[60576,1],2] recv from [[60576,0],1] for [[60576,1],2] (tag 17)
[krel:05511] [[60576,1],2] grpcomm:bad received barrier release
[krel:05511] progress: set_event_flag setting to 2
[krel:05512] [[60576,1],3] recv from [[60576,0],1] for [[60576,1],3] (tag 17)
[krel:05512] [[60576,1],3] grpcomm:bad received barrier release
[krel:05512] progress: set_event_flag setting to 2
[archimedes:29820] progress: progress_set_yield_when_idle to false
Calling MPI_Init
Sending to rank 1 from 0
[archimedes:29821] progress: progress_set_yield_when_idle to false
Calling MPI_Init
Hello World from Node rank 1
Rank 1 is receiving from 0
Sending to rank 2 from 0
Rank 1 is sending to 0
[archimedes:29821] progress: set_event_flag setting to 5
[archimedes:29821] progress: event_users_increment setting count to 4
[archimedes:29821] [[60576,1],1] grpcomm:bad entering barrier
[archimedes:29821] rml_send [[60576,1],1] -> [[60576,0],0] (router [[60576,0],0], tag 1, 1)
[archimedes:29730] [[60576,0],0] recv from [[60576,1],1] for [[60576,0],0] (tag 1)
[archimedes:29730] [[60576,0],0] orted_recv_cmd: received message from [[60576,1],1]
[archimedes:29730] defining message event: ../../../orte/orted/orted_comm.c 159
[archimedes:29730] [[60576,0],0] orted_recv_cmd: reissued recv
[archimedes:29730] [[60576,0],0] orte:daemon:cmd:processor called by [[60576,1],1] for tag 1
[archimedes:29730] [[60576,0],0] orte:daemon:cmd:processor: processing commands completed
[archimedes:29821] [[60576,1],1] grpcomm:bad barrier sent
[archimedes:29821] progressed_wait: ../../../../../../orte/mca/grpcomm/bad/grpcomm_bad_module.c 270
[archimedes:29820] btl: tcp: attempting to connect() to [[60576,1],2] address 138.23.141.162 on port 2001
[krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2)
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[krel:05504] rml_send_buffer_nb [[60576,0],1] -> [[60576,0],0] (router [[60576,0],0], tag 2, 2)
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
[krel:05512] progress: progress_set_yield_when_idle to false
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
Calling MPI_Init
Hello World from Node rank 3
Rank 3 is receiving from 0
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 2)
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_receive.c 227
Calling MPI_Init
Hello World from Node rank 2
Rank 2 is receiving from 0
[krel:05511] progress: progress_set_yield_when_idle to false
[archimedes:29730] defining timer event: 0 sec 0 usec at ../../../../../orte/tools/orterun/orterun.c:1226
mpirun: killing job...
[archimedes:29730] [[60576,0],0]:../../../../../orte/tools/orterun/orterun.c(1129) updating exit status to 1
[archimedes:29730] defining message event: ../../../../../orte/mca/plm/base/plm_base_orted_cmds.c 276
[archimedes:29730] rml_send_buffer_nb [[60576,0],0] -> [[60576,0],1] (router [[60576,0],1], tag 1, 1)
[archimedes:29730] defining timeout: 0 sec 1000 usec at ../../../../../orte/mca/plm/base/plm_base_orted_cmds.c:321
[archimedes:29730] progressed_wait: ../../../../../orte/mca/plm/base/plm_base_orted_cmds.c 324
[archimedes:29730] defining timeout: 0 sec 4000 usec at ../../../../../orte/tools/orterun/orterun.c:1164
[archimedes:29730] [[60576,0],0] orte:daemon:cmd:processor called by [[60576,0],0] for tag 1
[archimedes:29730] defining message event: ../../../../../orte/mca/odls/base/odls_base_default_fns.c 2409
[archimedes:29730] [[60576,0],0] orte:daemon:cmd:processor: processing commands completed
[archimedes:29730] [[60576,0],0] recv from [[60576,0],1] for [[60576,0],0] (tag 5)
[archimedes:29730] defining message event: ../../../../../orte/mca/plm/base/plm_base_receive.c 329
[archimedes:29730] [[60576,0],0] calling job_complete trigger
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 29820 on node archimedes exited on signal 0 (Unknown signal 0).
--------------------------------------------------------------------------
[archimedes:29730] defining message event: ../../../../../orte/mca/plm/base/plm_base_orted_cmds.c 142
[archimedes:29730] defining timeout: 0 sec 0 usec at ../../../../../orte/mca/plm/base/plm_base_orted_cmds.c:186
[archimedes:29730] progressed_wait: ../../../../../orte/mca/plm/base/plm_base_orted_cmds.c 189
[archimedes:29730] defining timeout: 0 sec 2000 usec at ../../../../../orte/tools/orterun/orterun.c:847
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_read.c 281
[archimedes:29730] defining message event: ../../../../../../orte/mca/iof/hnp/iof_hnp_read.c 281
[archimedes:29730] [[60576,0],0] orte:daemon:cmd:processor called by [[60576,0],0] for tag 1
[archimedes:29730] [[60576,0],0] calling orted_exit trigger
mpirun: clean termination accomplished
[archimedes:29730] sess_dir_finalize: job session dir not empty - leaving
[archimedes:29730] mca: base: close: component bad closed
[archimedes:29730] mca: base: close: unloading component bad
[archimedes:29730] mca: base: close: component tcp closed
[archimedes:29730] mca: base: close: unloading component tcp
[archimedes:29730] mca: base: close: component oob closed
[archimedes:29730] mca: base: close: unloading component oob
[archimedes:29730] sess_dir_finalize: proc session dir not empty - leaving
orterun: exiting with status 1
4 total processes killed (some possibly by mpirun during cleanup)
11:46:58.981251 IP (tos 0x0, ttl 64, id 27444, offset 0, flags [DF], proto TCP (6), length 60)
    archimedes.ucr.edu.57489 > spieth-lsci-141-162.bulk.ucr.edu.cisco-sccp: Flags [S], cksum 0x6627 (correct), seq 3182294834, win 14600, options [mss 1460,sackOK,TS val 1099996166 ecr 0,nop,wscale 2], length 0
        0x0000:  0017 a44b 7cea 0030 487d 8254 0800 4500  ...K|..0H}.T..E.
        0x0010:  003c 6b34 4000 4006 a18b 8a17 8c2b 8a17  .<k4@.@......+..
        0x0020:  8da2 e091 07d0 bdad f732 0000 0000 a002  .........2......
        0x0030:  3908 6627 0000 0204 05b4 0402 080a 4190  9.f'..........A.
        0x0040:  9c06 0000 0000 0103 0302                 ..........
11:46:58.981276 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60)
    spieth-lsci-141-162.bulk.ucr.edu.cisco-sccp > archimedes.ucr.edu.57489: Flags [S.], cksum 0xa363 (correct), seq 475759913, ack 3182294835, win 5792, options [mss 1460,sackOK,TS val 300429737 ecr 1099996166,nop,wscale 7], length 0
        0x0000:  0030 487d 8254 0017 a44b 7cea 0800 4500  .0H}.T...K|...E.
        0x0010:  003c 0000 4000 4006 0cc0 8a17 8da2 8a17  .<..@.@.........
        0x0020:  8c2b 07d0 e091 1c5b 8529 bdad f733 a012  .+.....[.)...3..
        0x0030:  16a0 a363 0000 0204 05b4 0402 080a 11e8  ...c............
        0x0040:  31a9 4190 9c06 0103 0307                 1.A.......
11:46:58.981491 IP (tos 0x0, ttl 64, id 27445, offset 0, flags [DF], proto TCP (6), length 52)
    archimedes.ucr.edu.57489 > spieth-lsci-141-162.bulk.ucr.edu.cisco-sccp: Flags [.], cksum 0xda8d (correct), seq 1, ack 1, win 3650, options [nop,nop,TS val 1099996166 ecr 300429737], length 0
        0x0000:  0017 a44b 7cea 0030 487d 8254 0800 4500  ...K|..0H}.T..E.
        0x0010:  0034 6b35 4000 4006 a192 8a17 8c2b 8a17  .4k5@.@......+..
        0x0020:  8da2 e091 07d0 bdad f733 1c5b 852a 8010  .........3.[.*..
        0x0030:  0e42 da8d 0000 0101 080a 4190 9c06 11e8  .B........A.....
        0x0040:  31a9                                     1.
11:46:58.981516 IP (tos 0x0, ttl 64, id 27446, offset 0, flags [DF], proto TCP (6), length 60)
    archimedes.ucr.edu.57489 > spieth-lsci-141-162.bulk.ucr.edu.cisco-sccp: Flags [P.], cksum 0xed09 (correct), seq 1:9, ack 1, win 3650, options [nop,nop,TS val 1099996166 ecr 300429737], length 8
        0x0000:  0017 a44b 7cea 0030 487d 8254 0800 4500  ...K|..0H}.T..E.
        0x0010:  003c 6b36 4000 4006 a189 8a17 8c2b 8a17  .<k6@.@......+..
        0x0020:  8da2 e091 07d0 bdad f733 1c5b 852a 8018  .........3.[.*..
        0x0030:  0e42 ed09 0000 0101 080a 4190 9c06 11e8  .B........A.....
        0x0040:  31a9 ed72 0001 0000 0000                 1..r......
11:46:58.981527 IP (tos 0x0, ttl 64, id 46211, offset 0, flags [DF], proto TCP (6), length 52)
    spieth-lsci-141-162.bulk.ucr.edu.cisco-sccp > archimedes.ucr.edu.57489: Flags [.], cksum 0xe899 (correct), seq 1, ack 9, win 46, options [nop,nop,TS val 300429737 ecr 1099996166], length 0
        0x0000:  0030 487d 8254 0017 a44b 7cea 0800 4500  .0H}.T...K|...E.
        0x0010:  0034 b483 4000 4006 5844 8a17 8da2 8a17  .4..@.@.XD......
        0x0020:  8c2b 07d0 e091 1c5b 852a bdad f73b 8010  .+.....[.*...;..
        0x0030:  002e e899 0000 0101 080a 11e8 31a9 4190  ............1.A.
        0x0040:  9c06                                     ..
11:46:58.982079 IP (tos 0x0, ttl 64, id 46212, offset 0, flags [DF], proto TCP (6), length 52)
    spieth-lsci-141-162.bulk.ucr.edu.cisco-sccp > archimedes.ucr.edu.57489: Flags [F.], cksum 0xe898 (correct), seq 1, ack 9, win 46, options [nop,nop,TS val 300429737 ecr 1099996166], length 0
        0x0000:  0030 487d 8254 0017 a44b 7cea 0800 4500  .0H}.T...K|...E.
        0x0010:  0034 b484 4000 4006 5843 8a17 8da2 8a17  .4..@.@.XC......
        0x0020:  8c2b 07d0 e091 1c5b 852a bdad f73b 8011  .+.....[.*...;..
        0x0030:  002e e898 0000 0101 080a 11e8 31a9 4190  ............1.A.
        0x0040:  9c06                                     ..
11:46:58.982239 IP (tos 0x0, ttl 64, id 27447, offset 0, flags [DF], proto TCP (6), length 52)
    archimedes.ucr.edu.57489 > spieth-lsci-141-162.bulk.ucr.edu.cisco-sccp: Flags [F.], cksum 0xda83 (correct), seq 9, ack 2, win 3650, options [nop,nop,TS val 1099996166 ecr 300429737], length 0
        0x0000:  0017 a44b 7cea 0030 487d 8254 0800 4500  ...K|..0H}.T..E.
        0x0010:  0034 6b37 4000 4006 a190 8a17 8c2b 8a17  .4k7@.@......+..
        0x0020:  8da2 e091 07d0 bdad f73b 1c5b 852b 8011  .........;.[.+..
        0x0030:  0e42 da83 0000 0101 080a 4190 9c06 11e8  .B........A.....
        0x0040:  31a9                                     1.
11:46:58.982254 IP (tos 0x0, ttl 64, id 46213, offset 0, flags [DF], proto TCP (6), length 52)
    spieth-lsci-141-162.bulk.ucr.edu.cisco-sccp > archimedes.ucr.edu.57489: Flags [.], cksum 0xe897 (correct), seq 2, ack 10, win 46, options [nop,nop,TS val 300429737 ecr 1099996166], length 0
        0x0000:  0030 487d 8254 0017 a44b 7cea 0800 4500  .0H}.T...K|...E.
        0x0010:  0034 b485 4000 4006 5842 8a17 8da2 8a17  .4..@.@.XB......
        0x0020:  8c2b 07d0 e091 1c5b 852b bdad f73c 8010  .+.....[.+...<..
        0x0030:  002e e897 0000 0101 080a 11e8 31a9 4190  ............1.A.
        0x0040:  9c06                                     ..