Ralph,

Please do not bother about the output containing "src is (null) and orte type is 0" in my previous email. It is just some printf I added to dss_copy.c to make some sense of what is going wrong.

Prakash

>>> prakash.velayut...@cchmc.org 06/05/07 6:16 AM >>>
Hi,

Sorry about that. Two lines got cut out from the program. Here is the full program and error messages again. No Resource Manager involved, just ssh/rsh.

Hostfile contains
bmi-opt2-01
bmi-opt2-02
bmi-opt2-03
bmi-opt2-04

############################
#include<string.h>
#include<stdlib.h>
#include<stdio.h>
#include"mpi.h"

void
main(int argc, char **argv)
{
    int tag = 0;
    int my_rank;
    int num_proc;
    char message_0[] = "hello slave, i'm your master";
    char message_1[50];
    char master_data[] = "slaves to work";
    int array_of_errcodes[10];
    int num;
    MPI_Status status;
    MPI_Comm inter_comm;
    MPI_Info info;
    int arr[1];
    int rc1;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &num_proc);

    printf("MASTER : spawning 3 slaves ... \n");
    rc1 = MPI_Comm_spawn("/bin/hostname", MPI_ARGV_NULL, 1,
                         MPI_INFO_NULL, 0, MPI_COMM_WORLD, &inter_comm, arr);

    printf("MASTER : send a message to master of slaves ...\n");
    MPI_Send(message_0, 50, MPI_CHAR,0 , tag, inter_comm);

    MPI_Recv(message_1, 50, MPI_CHAR, 0, tag, inter_comm, &status);
    printf("MASTER : message received : %s\n", message_1);

    MPI_Send(master_data, 50, MPI_CHAR,0 , tag, inter_comm);

    MPI_Finalize();
    exit(0);
}
#################################

prakash@bmi-opt2-01:~/thesis/CS/Samples/x86_64> mpirun -np 1 --pernode --prefix /usr/local/openmpi-1.2 --hostfile machinefile ./master1
MASTER : spawning 3 slaves ...
src is (null) and orte type is 0
[bmi-opt2-01:03527] [0,0,0] ORTE_ERROR_LOG: Bad parameter in file dss/dss_copy.c at line 43
[bmi-opt2-01:03527] [0,0,0] ORTE_ERROR_LOG: Bad parameter in file gpr_replica_put_get_fn.c at line 410
[bmi-opt2-01:03527] [0,0,0] ORTE_ERROR_LOG: Bad parameter in file base/rmaps_base_registry_fns.c at line 612
[bmi-opt2-01:03527] [0,0,0] ORTE_ERROR_LOG: Bad parameter in file base/rmaps_base_map_job.c at line 93
[bmi-opt2-01:03527] [0,0,0] ORTE_ERROR_LOG: Bad parameter in file base/rmaps_base_receive.c at line 139
mpirun: killing job...

mpirun noticed that job rank 0 with PID 3532 on node bmi-opt2-01 exited on signal 15 (Terminated).

Thanks,
Prakash

>>> r...@lanl.gov 06/03/07 9:31 PM >>>
Hi Prakash

Are you sure the code you provided here is the one generating the output you attached? I don't see this message anywhere in your code:

MASTER : spawning 3 slaves ...

and it certainly isn't anything we generate. Also, your output implies you are in some kind of loop, yet your code contains only a single comm_spawn.

Could you please clarify?

Thanks
Ralph

On 6/3/07 5:50 AM, "Prakash Velayutham" <prakash.velayut...@cchmc.org> wrote:

> Hello,
>
> Version - Open MPI 1.2.1.
>
> I have a simple program as below:
>
> #include<string.h>
> #include<stdlib.h>
> #include<stdio.h>
> #include"mpi.h"
>
> void
> main(int argc, char **argv)
> {
>
>     int tag = 0;
>     int my_rank;
>     int num_proc;
>     char message_0[] = "hello slave, i'm your master";
>     char message_1[50];
>     char master_data[] = "slaves to work";
>     int num;
>     MPI_Status status;
>     MPI_Comm inter_comm;
>     MPI_Info info;
>     int arr[1];
>     int rc1;
>     MPI_Init(&argc, &argv);
>     MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
>     MPI_Comm_size(MPI_COMM_WORLD, &num_proc);
>     rc1 = MPI_Comm_spawn("/bin/hostname", MPI_ARGV_NULL, 1,
>                          MPI_INFO_NULL, 0, MPI_COMM_WORLD, &inter_comm, arr);
>     printf("MASTER : send a message to master of slaves ...\n");
>     MPI_Send(message_0, 50, MPI_CHAR,0 , tag, inter_comm);
>     MPI_Recv(message_1, 50, MPI_CHAR, 0, tag, inter_comm, &status);
>     printf("MASTER : message received : %s\n", message_1);
>     MPI_Send(master_data, 50, MPI_CHAR,0 , tag, inter_comm);
>     MPI_Finalize();
>     exit(0);
> }
>
> When this is run, all I get is
>> ~/thesis/CS/Samples/x86_64> mpirun -np 4 --pernode --hostfile machinefile --prefix /usr/local/openmpi-1.2 ./master1
> MASTER : spawning 3 slaves ...
> MASTER : spawning 3 slaves ...
> MASTER : spawning 3 slaves ...
> MASTER : spawning 3 slaves ...
> src is (null) and orte type is 0
> [bmi-opt2-01:25441] [0,0,0] ORTE_ERROR_LOG: Bad parameter in file dss/dss_copy.c at line 43
> [bmi-opt2-01:25441] [0,0,0] ORTE_ERROR_LOG: Bad parameter in file gpr_replica_put_get_fn.c at line 410
> [bmi-opt2-01:25441] [0,0,0] ORTE_ERROR_LOG: Bad parameter in file base/rmaps_base_registry_fns.c at line 612
> [bmi-opt2-01:25441] [0,0,0] ORTE_ERROR_LOG: Bad parameter in file base/rmaps_base_map_job.c at line 93
> [bmi-opt2-01:25441] [0,0,0] ORTE_ERROR_LOG: Bad parameter in file base/rmaps_base_receive.c at line 139
> mpirun: killing job...
>
> mpirun noticed that job rank 0 with PID 25447 on node bmi-opt2-01 exited on signal 15 (Terminated).
> 3 additional processes aborted (not shown)
>
> Any idea what is wrong with this?
>
> Thanks,
> Prakash
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users