Hmmm...I think I know what may be happening. Could you send me:

1. what Open MPI version you are using?
2. any MCA parameters you might be setting in your environment (remember
   that we may be picking up some system configuration file for those)

This isn't related to the problem, but I also note that you are spawning
"hostname" and then trying to do MPI send/recv with it - I don't think that
is going to work.

Thanks
Ralph

On 6/5/07 4:16 AM, "Prakash Velayutham" <prakash.velayut...@cchmc.org> wrote:

> Hi,
>
> Sorry about that. Two lines got cut out from the program. Here is the
> full program and error messages again. No Resource Manager involved,
> just ssh/rsh.
>
> Hostfile contains
>
> bmi-opt2-01
> bmi-opt2-02
> bmi-opt2-03
> bmi-opt2-04
>
> ############################
> #include<string.h>
> #include<stdlib.h>
> #include<stdio.h>
> #include"mpi.h"
>
> void
> main(int argc, char **argv)
> {
>     int tag = 0;
>     int my_rank;
>     int num_proc;
>     char message_0[] = "hello slave, i'm your master";
>     char message_1[50];
>     char master_data[] = "slaves to work";
>     int array_of_errcodes[10];
>     int num;
>     MPI_Status status;
>     MPI_Comm inter_comm;
>     MPI_Info info;
>     int arr[1];
>     int rc1;
>     MPI_Init(&argc, &argv);
>     MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
>     MPI_Comm_size(MPI_COMM_WORLD, &num_proc);
>     printf("MASTER : spawning 3 slaves ... \n");
>     rc1 = MPI_Comm_spawn("/bin/hostname", MPI_ARGV_NULL, 1,
>         MPI_INFO_NULL, 0, MPI_COMM_WORLD, &inter_comm, arr);
>     printf("MASTER : send a message to master of slaves ...\n");
>     MPI_Send(message_0, 50, MPI_CHAR,0 , tag, inter_comm);
>     MPI_Recv(message_1, 50, MPI_CHAR, 0, tag, inter_comm, &status);
>     printf("MASTER : message received : %s\n", message_1);
>     MPI_Send(master_data, 50, MPI_CHAR,0 , tag, inter_comm);
>     MPI_Finalize();
>     exit(0);
> }
> #################################
>
> prakash@bmi-opt2-01:~/thesis/CS/Samples/x86_64> mpirun -np 1 --pernode
> --prefix /usr/local/openmpi-1.2 --hostfile machinefile ./master1
> MASTER : spawning 3 slaves ...
> src is (null) and orte type is 0
> [bmi-opt2-01:03527] [0,0,0] ORTE_ERROR_LOG: Bad parameter in file
> dss/dss_copy.c at line 43
> [bmi-opt2-01:03527] [0,0,0] ORTE_ERROR_LOG: Bad parameter in file
> gpr_replica_put_get_fn.c at line 410
> [bmi-opt2-01:03527] [0,0,0] ORTE_ERROR_LOG: Bad parameter in file
> base/rmaps_base_registry_fns.c at line 612
> [bmi-opt2-01:03527] [0,0,0] ORTE_ERROR_LOG: Bad parameter in file
> base/rmaps_base_map_job.c at line 93
> [bmi-opt2-01:03527] [0,0,0] ORTE_ERROR_LOG: Bad parameter in file
> base/rmaps_base_receive.c at line 139
> mpirun: killing job...
>
> mpirun noticed that job rank 0 with PID 3532 on node bmi-opt2-01 exited
> on signal 15 (Terminated).
>
> Thanks,
> Prakash
>
>>>> r...@lanl.gov 06/03/07 9:31 PM >>>
> Hi Prakash
>
> Are you sure the code you provided here is the one generating the output
> you attached? I don't see this message anywhere in your code:
>
> MASTER : spawning 3 slaves ...
>
> and it certainly isn't anything we generate. Also, your output implies
> you are in some kind of loop, yet your code contains only a single
> comm_spawn.
>
> Could you please clarify?
>
> Thanks
> Ralph
>
>
> On 6/3/07 5:50 AM, "Prakash Velayutham" <prakash.velayut...@cchmc.org> wrote:
>
>> Hello,
>>
>> Version - Open MPI 1.2.1.
>>
>> I have a simple program as below:
>>
>> #include<string.h>
>> #include<stdlib.h>
>> #include<stdio.h>
>> #include"mpi.h"
>>
>> void
>> main(int argc, char **argv)
>> {
>>
>>     int tag = 0;
>>     int my_rank;
>>     int num_proc;
>>     char message_0[] = "hello slave, i'm your master";
>>     char message_1[50];
>>     char master_data[] = "slaves to work";
>>     int num;
>>     MPI_Status status;
>>     MPI_Comm inter_comm;
>>     MPI_Info info;
>>     int arr[1];
>>     int rc1;
>>     MPI_Init(&argc, &argv);
>>     MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
>>     MPI_Comm_size(MPI_COMM_WORLD, &num_proc);
>>     rc1 = MPI_Comm_spawn("/bin/hostname", MPI_ARGV_NULL, 1,
>>         MPI_INFO_NULL, 0, MPI_COMM_WORLD, &inter_comm, arr);
>>     printf("MASTER : send a message to master of slaves ...\n");
>>     MPI_Send(message_0, 50, MPI_CHAR,0 , tag, inter_comm);
>>     MPI_Recv(message_1, 50, MPI_CHAR, 0, tag, inter_comm, &status);
>>     printf("MASTER : message received : %s\n", message_1);
>>     MPI_Send(master_data, 50, MPI_CHAR,0 , tag, inter_comm);
>>     MPI_Finalize();
>>     exit(0);
>> }
>>
>> When this is run, all I get is
>>> ~/thesis/CS/Samples/x86_64> mpirun -np 4 --pernode --hostfile
>> machinefile --prefix /usr/local/openmpi-1.2 ./master1
>> MASTER : spawning 3 slaves ...
>> MASTER : spawning 3 slaves ...
>> MASTER : spawning 3 slaves ...
>> MASTER : spawning 3 slaves ...
>> src is (null) and orte type is 0
>> [bmi-opt2-01:25441] [0,0,0] ORTE_ERROR_LOG: Bad parameter in file
>> dss/dss_copy.c at line 43
>> [bmi-opt2-01:25441] [0,0,0] ORTE_ERROR_LOG: Bad parameter in file
>> gpr_replica_put_get_fn.c at line 410
>> [bmi-opt2-01:25441] [0,0,0] ORTE_ERROR_LOG: Bad parameter in file
>> base/rmaps_base_registry_fns.c at line 612
>> [bmi-opt2-01:25441] [0,0,0] ORTE_ERROR_LOG: Bad parameter in file
>> base/rmaps_base_map_job.c at line 93
>> [bmi-opt2-01:25441] [0,0,0] ORTE_ERROR_LOG: Bad parameter in file
>> base/rmaps_base_receive.c at line 139
>> mpirun: killing job...
>>
>> mpirun noticed that job rank 0 with PID 25447 on node bmi-opt2-01 exited
>> on signal 15 (Terminated).
>> 3 additional processes aborted (not shown)
>>
>> Any idea what is wrong with this.
>>
>> Thanks,
>> Prakash
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
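
P.S. On the side note above about spawning "hostname": the spawned
executable has to be an MPI program itself, otherwise nothing on the far
end of the intercommunicator ever posts a matching receive or send. Below
is a minimal sketch of what such a child could look like. The file name
slave.c, the reply text, and the message order (chosen to mirror the
master's send/recv/send) are assumptions for illustration only, not code
from this thread and not a fix for the ORTE_ERROR_LOG failure itself.

```c
/* slave.c: hypothetical child program to spawn in place of /bin/hostname.
 * Assumed build command: mpicc slave.c -o slave */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Comm parent;                 /* intercommunicator back to the spawner */
    MPI_Status status;
    char buf[50];
    char reply[50] = "hello master, slave reporting";  /* made-up reply text */

    MPI_Init(&argc, &argv);

    /* Returns MPI_COMM_NULL if this process was not started by MPI_Comm_spawn. */
    MPI_Comm_get_parent(&parent);
    if (parent == MPI_COMM_NULL) {
        fprintf(stderr, "SLAVE : not started via MPI_Comm_spawn\n");
        MPI_Finalize();
        return 1;
    }

    /* On an intercommunicator, rank 0 refers to rank 0 of the remote
     * (master) group. The sequence mirrors the master: recv, send, recv. */
    MPI_Recv(buf, 50, MPI_CHAR, 0, 0, parent, &status);
    printf("SLAVE : message received : %s\n", buf);

    MPI_Send(reply, 50, MPI_CHAR, 0, 0, parent);

    MPI_Recv(buf, 50, MPI_CHAR, 0, 0, parent, &status);
    printf("SLAVE : work received : %s\n", buf);

    MPI_Finalize();
    return 0;
}
```

The master would then pass the path of this binary to MPI_Comm_spawn
instead of "/bin/hostname". Separately, the master sends 50 MPI_CHARs
from message_0 and master_data, which are shorter than 50 bytes, so it
reads past the end of those arrays; sending sizeof(message_0) and
sizeof(master_data) instead would avoid that.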