Ralph,

Please ignore the line "src is (null) and orte type is 0" in the output
from my previous email. It is just a printf I added to dss_copy.c to
make some sense of what is going wrong.

Prakash

>>> prakash.velayut...@cchmc.org 06/05/07 6:16 AM >>>
Hi,

Sorry about that. Two lines got cut out from the program. Here is the
full program and error messages again. No Resource Manager involved,
just ssh/rsh.

Hostfile contains

bmi-opt2-01
bmi-opt2-02
bmi-opt2-03
bmi-opt2-04

############################
#include<string.h>
#include<stdlib.h>
#include<stdio.h>
#include"mpi.h"

int
main(int argc, char **argv)
{
        int             tag = 0;
        int             my_rank;
        int             num_proc;
        char            message_0[] = "hello slave, i'm your master";
        char            message_1[50];
        char            master_data[] = "slaves to work";
        int             array_of_errcodes[10];
        int             num;
        MPI_Status      status;
        MPI_Comm        inter_comm;
        MPI_Info        info;
        int             arr[1];
        int             rc1;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
        MPI_Comm_size(MPI_COMM_WORLD, &num_proc);
        printf("MASTER : spawning 3 slaves ... \n");
        rc1 = MPI_Comm_spawn("/bin/hostname", MPI_ARGV_NULL, 1,
                             MPI_INFO_NULL, 0, MPI_COMM_WORLD, &inter_comm, arr);
        printf("MASTER : send a message to master of slaves ...\n");
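        /* Send only the bytes each array holds (string plus terminator), not 50. */
        MPI_Send(message_0, (int)sizeof(message_0), MPI_CHAR, 0, tag, inter_comm);
        MPI_Recv(message_1, 50, MPI_CHAR, 0, tag, inter_comm, &status);
        printf("MASTER : message received : %s\n", message_1);
        MPI_Send(master_data, (int)sizeof(master_data), MPI_CHAR, 0, tag, inter_comm);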
        MPI_Finalize();
        exit(0);
}
#################################

prakash@bmi-opt2-01:~/thesis/CS/Samples/x86_64> mpirun -np 1 --pernode
--prefix /usr/local/openmpi-1.2 --hostfile machinefile ./master1
MASTER : spawning 3 slaves ... 
src is (null) and orte type is 0
[bmi-opt2-01:03527] [0,0,0] ORTE_ERROR_LOG: Bad parameter in file
dss/dss_copy.c at line 43
[bmi-opt2-01:03527] [0,0,0] ORTE_ERROR_LOG: Bad parameter in file
gpr_replica_put_get_fn.c at line 410
[bmi-opt2-01:03527] [0,0,0] ORTE_ERROR_LOG: Bad parameter in file
base/rmaps_base_registry_fns.c at line 612
[bmi-opt2-01:03527] [0,0,0] ORTE_ERROR_LOG: Bad parameter in file
base/rmaps_base_map_job.c at line 93
[bmi-opt2-01:03527] [0,0,0] ORTE_ERROR_LOG: Bad parameter in file
base/rmaps_base_receive.c at line 139
mpirun: killing job...

mpirun noticed that job rank 0 with PID 3532 on node bmi-opt2-01 exited
on signal 15 (Terminated). 

Thanks,
Prakash

>>> r...@lanl.gov 06/03/07 9:31 PM >>>
Hi Prakash

Are you sure the code you provided here is the one generating the output
you attached? I don't see this message anywhere in your code:

MASTER : spawning 3 slaves ...

and it certainly isn't anything we generate. Also, your output implies
you are in some kind of loop, yet your code contains only a single
comm_spawn.

Could you please clarify?

Thanks
Ralph


On 6/3/07 5:50 AM, "Prakash Velayutham" <prakash.velayut...@cchmc.org>
wrote:

> Hello,
> 
> Version - Open MPI 1.2.1.
> 
> I have a simple program as below:
> 
> #include<string.h>
> #include<stdlib.h>
> #include<stdio.h>
> #include"mpi.h"
> 
> void
> main(int argc, char **argv)
> {
> 
>         int             tag = 0;
>         int             my_rank;
>         int             num_proc;
>         char            message_0[] = "hello slave, i'm your master";
>         char            message_1[50];
>         char            master_data[] = "slaves to work";
>         int             num;
>         MPI_Status      status;
>         MPI_Comm        inter_comm;
>         MPI_Info        info;
>         int             arr[1];
>         int             rc1;
>         MPI_Init(&argc, &argv);
>         MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
>         MPI_Comm_size(MPI_COMM_WORLD, &num_proc);
>         rc1 = MPI_Comm_spawn("/bin/hostname", MPI_ARGV_NULL, 1,
> MPI_INFO_NULL, 0, MPI_COMM_WORLD, &inter_comm, arr);
>         printf("MASTER : send a message to master of slaves ...\n");
>         MPI_Send(message_0, 50, MPI_CHAR,0 , tag, inter_comm);
>         MPI_Recv(message_1, 50, MPI_CHAR, 0, tag, inter_comm, &status);
>         printf("MASTER : message received : %s\n", message_1);
>         MPI_Send(master_data, 50, MPI_CHAR,0 , tag, inter_comm);
>         MPI_Finalize();
>         exit(0);
> }
> 
> When this is run, all I get is
>> ~/thesis/CS/Samples/x86_64> mpirun -np 4 --pernode --hostfile
> machinefile --prefix /usr/local/openmpi-1.2 ./master1
> MASTER : spawning 3 slaves ...
> MASTER : spawning 3 slaves ...
> MASTER : spawning 3 slaves ...
> MASTER : spawning 3 slaves ...
> src is (null) and orte type is 0
> [bmi-opt2-01:25441] [0,0,0] ORTE_ERROR_LOG: Bad parameter in file
> dss/dss_copy.c at line 43
> [bmi-opt2-01:25441] [0,0,0] ORTE_ERROR_LOG: Bad parameter in file
> gpr_replica_put_get_fn.c at line 410
> [bmi-opt2-01:25441] [0,0,0] ORTE_ERROR_LOG: Bad parameter in file
> base/rmaps_base_registry_fns.c at line 612
> [bmi-opt2-01:25441] [0,0,0] ORTE_ERROR_LOG: Bad parameter in file
> base/rmaps_base_map_job.c at line 93
> [bmi-opt2-01:25441] [0,0,0] ORTE_ERROR_LOG: Bad parameter in file
> base/rmaps_base_receive.c at line 139
> mpirun: killing job...
> 
> mpirun noticed that job rank 0 with PID 25447 on node bmi-opt2-01 exited
> on signal 15 (Terminated).
> 3 additional processes aborted (not shown)
> 
> Any idea what is wrong with this?
> 
> Thanks,
> Prakash
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users

