Dear Ralph, thanks for that. I have done much the same (as I indicated in my original post). In this case my C program correctly spawned the slaves and the slaves printed the correctly passed argument lists. On running this and my Fortran slave I get:

 nsize, mytid: iargs 2 0 : 1
 spray: 0 1:1 2 3 4
 nsize, mytid: iargs 2 1 : 1
 spray: 1 1:5 6 7 8

which is what I expect. I still think the error may well be mine rather than ompi's but I am at a loss to see what is going on !!

Thanks for the help so far,

Fred Marquis.

c-program
=========

#include "mpi.h"
#include <stdio.h>
#include <stdlib.h>

int main( int argc, char *argv[] )
{
    int np[2] = { 1, 1 };
    int errcodes[2];
    char *cmds[2] = { "./spray", "./spray" };
    char *args[2] = { "1 2 3 4", "5 6 7 8" };
    char **array_of_argv[2];
    char *argv0[] = {"1 2 3 4", (char *)0};
    char *argv1[] = {"5 6 7 8", (char *)0};

    array_of_argv[0] = argv0;
    array_of_argv[1] = argv1;

    MPI_Comm parentcomm, intercomm;
    MPI_Info infos[2] = { MPI_INFO_NULL, MPI_INFO_NULL };

    MPI_Init( &argc, &argv );

    MPI_Comm_spawn_multiple( 2, cmds, array_of_argv, np, infos, 0,
                             MPI_COMM_WORLD, &intercomm, errcodes );

    MPI_Finalize();
    return 0;
}

On Wed, May 05, 2010 at 07:47:20PM +0100, Ralph Castain wrote:
> I think OMPI is okay - here is a C sample program and the associated output:
>
> $ mpirun -np 3 ./spawn_multiple
> Parent [pid 98895] about to spawn!
> Parent [pid 98896] about to spawn!
> Parent [pid 98897] about to spawn!
> Parent done with spawn
> Parent sending message to children
> Parent done with spawn
> Parent done with spawn
> Hello from the child 0 of 2 on host Ralph pid 98898: argv[1] = foo
> Child 0 received msg: 38
> Hello from the child 1 of 2 on host Ralph pid 98899: argv[1] = bar
> Parent disconnected
> Parent disconnected
> Child 1 disconnected
> Child 0 disconnected
> Parent disconnected
>
>
>
> On May 5, 2010, at 12:08 PM, Fred Marquis wrote:
>
> > Hi,
> >
> > I am using mpi_comm_spawn_multiple to spawn multiple commands with
> > argument lists. I am trying to do this in fortran (77) using version
> > openmpi-1.4.1 and the ifort compiler v9.0. The operating system is SuSE
> > Linux 10.1 (x86-64).
> >
> > I have put together a simple controlling example program (test_pbload.F)
> > and an example slave program (spray.F) to try and explain my problem.
> >
> > In the controlling program mpi_comm_spawn_multiple is used to set 2 copies
> > of the slave running. The first is started with the argument list "1 2 3 4"
> > and the second with "5 6 7 8".
> >
> > The slaves are started OK and the slaves print out the argument lists and
> > exit. In addition the slaves print out their rank numbers so I can see
> > which argument list belongs to which slave.
> >
> > What I am finding is that the argument lists are not being sent to the
> > slaves correctly, indeed both slaves seem to be getting both argument
> > lists !!!
> >
> > To compile and run the programs I follow the steps below.
> >
> > Controlling program "test_pbload.F"
> >
> >   mpif77 -o test_pbload test_pbload.F
> >
> > Slave program "spray.F"
> >
> >   mpif77 -o spray spray.F
> >
> > Run the controller
> >
> >   mpirun -np 1 test_pbload
> >
> > The output of which is from the first slave:
> >
> >   nsize, mytid: iargs 2 0 : 2
> >   spray: 0 1:1 2 3 4 < FIRST ARGUMENT
> >   spray: 0 2:4 5 6 7 < SECOND ARGUMENT
> >
> > and the second slave:
> >
> >   nsize, mytid: iargs 2 1 : 2
> >   spray: 1 1:1 2 3 4 < FIRST ARGUMENT
> >   spray: 1 2:4 5 6 7 < SECOND ARGUMENT
> >
> > In each case the arguments (2 in both cases) are the same.
> >
> > I have written a C version of the controlling program and everything works
> > as expected so I presume that I have either got the specification of the
> > argument list wrong or I have discovered an error/bug. At the moment I
> > am working on the former -- but am at a loss to see what is wrong !!
> >
> > Any help, pointers etc really appreciated.
> >
> >
> > Controlling program (that uses MPI_COMM_SPAWN_MULTIPLE) test_pbload.F
> >
> >       program main
> > c
> >       implicit none
> > #include "mpif.h"
> >
> >       integer error
> >       integer intercomm
> >       CHARACTER*25 commands(2), argvs(2, 2)
> >       integer nprocs(2),info(2),ncpus
> > c
> >       call mpi_init(error)
> > c
> >       ncpus = 2
> > c
> >       commands(1) = ' ./spray '
> >       nprocs(1) = 1
> >       info(1) = MPI_INFO_NULL
> >       argvs(1, 1) = ' 1 2 3 4 '
> >       argvs(1, 2) = ' '
> > c
> >       commands(2) = ' ./spray '
> >       nprocs(2) = 1
> >       info(2) = MPI_INFO_NULL
> >       argvs(2, 1) = ' 4 5 6 7 '
> >       argvs(2, 2) = ' '
> > c
> >       call mpi_comm_spawn_multiple( ncpus,
> >      1      commands, argvs, nprocs, info,
> >      2      0, MPI_COMM_WORLD, intercomm,
> >      3      MPI_ERRCODES_IGNORE, error )
> > c
> >       call mpi_finalize(error)
> > c
> >       end
> >
> > Slave program (started by the controlling program) spray.F
> >
> >       program main
> >       integer error
> >       integer pid
> >       character*20 line(100)
> >       call mpi_init(error)
> > c
> >       CALL MPI_COMM_SIZE(MPI_COMM_WORLD,NSIZE,error)
> >       CALL MPI_COMM_RANK(MPI_COMM_WORLD,MYTID,error)
> > c
> >       iargs=iargc()
> >       write(*,*) 'nsize, mytid: iargs', nsize, mytid, ":", iargs
> > c
> >       if( iargs.gt.0 ) then
> >         do i = 1, iargs
> >           call getarg(i,line(i))
> >           write(*,'(1x,a,i3,20(i2,1h:,a))')
> >      1      'spray: ',mytid,i,line(i)
> >         enddo
> >       endif
> > c
> >       call mpi_finalize(error)
> > c
> >       end
> >
> > _______________________________________________
> > users mailing list
> > us...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/users

-- 
----------------------------------------------------------
Dr. A.J. Marquis             Tel:    +44 (0)20 7594 7040
Dept. of Mech. Eng.          Fax:    +44 (0)20 7594 1472
Imperial College
Exhibition Road              E-Mail: a.marq...@imperial.ac.uk
London SW7 2AZ               BOFH: Maintence window broken
All views expressed are my own !
----------------------------------------------------------