Looks to me like it didn't find your executable - could be a question of where 
it exists relative to where you are running. If you look in your OMPI source 
tree at the orte/test/mpi directory, you'll see an example program 
"simple_spawn.c" there. Just "make simple_spawn" and execute that with your 
default hostfile set - does it work okay?
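
For reference, the guts of such a spawn test boil down to something like the 
sketch below. To be clear, this is not the actual orte/test/mpi/simple_spawn.c, 
just a minimal stand-in, and the child count of 2 is arbitrary:

// spawn_check.cpp - minimal parent/child spawn test (sketch only)
#include <mpi.h>
#include <cstdio>

int main(int argc, char **argv) {
    MPI_Comm parent;
    MPI_Init(&argc, &argv);
    MPI_Comm_get_parent(&parent);

    if (parent == MPI_COMM_NULL) {
        // No parent, so we are the original process: spawn two copies of
        // this same executable as children.
        MPI_Comm children;
        MPI_Comm_spawn(argv[0], MPI_ARGV_NULL, 2, MPI_INFO_NULL,
                       0, MPI_COMM_SELF, &children, MPI_ERRCODES_IGNORE);
        std::printf("parent: spawn returned\n");
    } else {
        // We were spawned: just report in and exit.
        std::printf("child: started\n");
    }

    MPI_Finalize();
    return 0;
}

If something like that runs fine as a singleton with your default hostfile set, 
then the problem is most likely in how your master resolves the path to the 
slave executable.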

It works fine for me, hence the question.

Also, what OMPI version are you using?

On Aug 28, 2012, at 4:25 PM, Brian Budge <brian.bu...@gmail.com> wrote:

> I see.  Okay.  So, I just tried removing the check for universe size,
> and set the universe size to 2.  Here's my output:
> 
> LD_LIBRARY_PATH=/home/budgeb/p4/pseb/external/lib.dev:/usr/local/lib
> OMPI_MCA_orte_default_hostfile=`pwd`/hostsfile ./master_exe
> [budgeb-interlagos:29965] [[4156,0],0] ORTE_ERROR_LOG: Fatal in file
> base/plm_base_receive.c at line 253
> [budgeb-interlagos:29963] [[4156,1],0] ORTE_ERROR_LOG: The specified
> application failed to start in file dpm_orte.c at line 785
> 
> The corresponding run with mpirun still works.
> 
> Thanks,
>  Brian
> 
> On Tue, Aug 28, 2012 at 2:46 PM, Ralph Castain <r...@open-mpi.org> wrote:
>> I see the issue - it's here:
>> 
>>>  MPI_Attr_get(MPI_COMM_WORLD, MPI_UNIVERSE_SIZE, &puniverseSize, &flag);
>>> 
>>>  if(!flag) {
>>>      std::cerr << "no universe size" << std::endl;
>>>      return -1;
>>>  }
>>>  universeSize = *puniverseSize;
>>>  if(universeSize == 1) {
>>>      std::cerr << "cannot start slaves... not enough nodes" << std::endl;
>>>  }
>> 
>> The universe size is set to 1 for a singleton because the attribute gets set 
>> at the beginning of time - we have no way to go back and change it later. The 
>> sequence of events explains why. The singleton starts up and sets its 
>> attributes, including universe_size. It also spins off an orte daemon to act 
>> as its own private "mpirun" in case you call comm_spawn. At this point, 
>> however, no hostfile has been read - the singleton is just an MPI proc doing 
>> its own thing, and the orte daemon is just sitting there on standby.
>> 
>> When your app calls comm_spawn, then the orte daemon gets called to launch 
>> the new procs. At that time, it (not the original singleton!) reads the 
>> hostfile to find out how many nodes are around, and then does the launch.
>> 
>> You are trying to check the number of nodes from within the singleton, which 
>> won't work - it has no way of discovering that info.
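>> 
>> Put another way: when running as a singleton, your code has to decide the 
>> slave count on its own rather than deriving it from MPI_UNIVERSE_SIZE. Here 
>> is a minimal sketch of that idea - the command-line argument and the 
>> "./slave_exe" path are just placeholders, not anything OMPI provides:
>> 
>> #include <mpi.h>
>> #include <cstdlib>
>> 
>> int main(int argc, char **argv) {
>>     MPI_Init(&argc, &argv);
>> 
>>     // The number of slaves comes from our own command line (default 1),
>>     // not from the MPI_UNIVERSE_SIZE attribute.
>>     int nslaves = (argc > 1) ? std::atoi(argv[1]) : 1;
>> 
>>     MPI_Comm everyone;
>>     char cmd[] = "./slave_exe";
>>     MPI_Comm_spawn(cmd, MPI_ARGV_NULL, nslaves, MPI_INFO_NULL,
>>                    0, MPI_COMM_SELF, &everyone, MPI_ERRCODES_IGNORE);
>> 
>>     MPI_Finalize();
>>     return 0;
>> }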
>> 
>> 
>> 
>> 
>> On Aug 28, 2012, at 2:38 PM, Brian Budge <brian.bu...@gmail.com> wrote:
>> 
>>>> cat hostsfile
>>> localhost
>>> budgeb-sandybridge
>>> 
>>> Thanks,
>>> Brian
>>> 
>>> On Tue, Aug 28, 2012 at 2:36 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>>> Hmmm...what is in your "hostsfile"?
>>>> 
>>>> On Aug 28, 2012, at 2:33 PM, Brian Budge <brian.bu...@gmail.com> wrote:
>>>> 
>>>>> Hi Ralph -
>>>>> 
>>>>> Thanks for confirming this is possible.  I'm trying this and currently
>>>>> failing.  Perhaps there's something I'm missing in the code to make
>>>>> this work.  Here are the two invocations and their outputs:
>>>>> 
>>>>>> LD_LIBRARY_PATH=/home/budgeb/p4/pseb/external/lib.dev:/usr/local/lib 
>>>>>> OMPI_MCA_orte_default_hostfile=`pwd`/hostsfile ./master_exe
>>>>> cannot start slaves... not enough nodes
>>>>> 
>>>>>> LD_LIBRARY_PATH=/home/budgeb/p4/pseb/external/lib.dev:/usr/local/lib 
>>>>>> OMPI_MCA_orte_default_hostfile=`pwd`/hostsfile mpirun -n 1 ./master_exe
>>>>> master spawned 1 slaves...
>>>>> slave responding...
>>>>> 
>>>>> 
>>>>> The code:
>>>>> 
>>>>> //master.cpp
>>>>> #include <mpi.h>
>>>>> #include <boost/filesystem.hpp>
>>>>> #include <iostream>
>>>>> #include <string.h>  // memcpy
>>>>> #include <alloca.h>  // alloca
>>>>> 
>>>>> int main(int argc, char **args) {
>>>>>  int worldSize, universeSize, *puniverseSize, flag;
>>>>> 
>>>>>  MPI_Comm everyone; //intercomm
>>>>>  boost::filesystem::path curPath =
>>>>> boost::filesystem::absolute(boost::filesystem::current_path());
>>>>> 
>>>>>  std::string toRun = (curPath / "slave_exe").string();
>>>>> 
>>>>>  int ret = MPI_Init(&argc, &args);
>>>>> 
>>>>>  if(ret != MPI_SUCCESS) {
>>>>>      std::cerr << "failed init" << std::endl;
>>>>>      return -1;
>>>>>  }
>>>>> 
>>>>>  MPI_Comm_size(MPI_COMM_WORLD, &worldSize);
>>>>> 
>>>>>  if(worldSize != 1) {
>>>>>      std::cerr << "too many masters" << std::endl;
>>>>>  }
>>>>> 
>>>>>  MPI_Attr_get(MPI_COMM_WORLD, MPI_UNIVERSE_SIZE, &puniverseSize, &flag);
>>>>> 
>>>>>  if(!flag) {
>>>>>      std::cerr << "no universe size" << std::endl;
>>>>>      return -1;
>>>>>  }
>>>>>  universeSize = *puniverseSize;
>>>>>  if(universeSize == 1) {
>>>>>      std::cerr << "cannot start slaves... not enough nodes" << std::endl;
>>>>>  }
>>>>> 
>>>>> 
>>>>>  char *buf = (char*)alloca(toRun.size() + 1);
>>>>>  memcpy(buf, toRun.c_str(), toRun.size());
>>>>>  buf[toRun.size()] = '\0';
>>>>> 
>>>>>  MPI_Comm_spawn(buf, MPI_ARGV_NULL, universeSize-1, MPI_INFO_NULL,
>>>>> 0, MPI_COMM_SELF, &everyone,
>>>>>                 MPI_ERRCODES_IGNORE);
>>>>> 
>>>>>  std::cerr << "master spawned " << universeSize-1 << " slaves..."
>>>>> << std::endl;
>>>>> 
>>>>>  MPI_Finalize();
>>>>> 
>>>>> return 0;
>>>>> }
>>>>> 
>>>>> 
>>>>> //slave.cpp
>>>>> #include <mpi.h>
>>>>> #include <iostream>  // std::cerr
>>>>> 
>>>>> int main(int argc, char **args) {
>>>>>  int size;
>>>>>  MPI_Comm parent;
>>>>>  MPI_Init(&argc, &args);
>>>>> 
>>>>>  MPI_Comm_get_parent(&parent);
>>>>> 
>>>>>  if(parent == MPI_COMM_NULL) {
>>>>>      std::cerr << "slave has no parent" << std::endl;
>>>>>  }
>>>>>  MPI_Comm_remote_size(parent, &size);
>>>>>  if(size != 1) {
>>>>>      std::cerr << "parent size is " << size << std::endl;
>>>>>  }
>>>>> 
>>>>>  std::cerr << "slave responding..." << std::endl;
>>>>> 
>>>>>  MPI_Finalize();
>>>>> 
>>>>>  return 0;
>>>>> }
>>>>> 
>>>>> 
>>>>> Any ideas?  Thanks for any help.
>>>>> 
>>>>> Brian
>>>>> 
>>>>> On Wed, Aug 22, 2012 at 9:03 AM, Ralph Castain <r...@open-mpi.org> wrote:
>>>>>> It really is just that simple :-)
>>>>>> 
>>>>>> On Aug 22, 2012, at 8:56 AM, Brian Budge <brian.bu...@gmail.com> wrote:
>>>>>> 
>>>>>>> Okay.  Is there a tutorial or FAQ for setting everything up?  Or is it
>>>>>>> really just that simple?  I don't need to run a copy of the orte
>>>>>>> server somewhere?
>>>>>>> 
>>>>>>> If my current IP is 192.168.0.1,
>>>>>>> 
>>>>>>> 0 > echo 192.168.0.11 > /tmp/hostfile
>>>>>>> 1 > echo 192.168.0.12 >> /tmp/hostfile
>>>>>>> 2 > export OMPI_MCA_orte_default_hostfile=/tmp/hostfile
>>>>>>> 3 > ./mySpawningExe
>>>>>>> 
>>>>>>> At this point, mySpawningExe will be the master, running on
>>>>>>> 192.168.0.1, and I can have spawned, for example, childExe on
>>>>>>> 192.168.0.11 and 192.168.0.12?  Or childExe1 on 192.168.0.11 and
>>>>>>> childExe2 on 192.168.0.12?
>>>>>>> 
>>>>>>> Thanks for the help.
>>>>>>> 
>>>>>>> Brian
>>>>>>> 
>>>>>>> On Wed, Aug 22, 2012 at 7:15 AM, Ralph Castain <r...@open-mpi.org> 
>>>>>>> wrote:
>>>>>>>> Sure, that's still true on all 1.3 or above releases. All you need to 
>>>>>>>> do is set the hostfile envar so we pick it up:
>>>>>>>> 
>>>>>>>> OMPI_MCA_orte_default_hostfile=<foo>
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Aug 21, 2012, at 7:23 PM, Brian Budge <brian.bu...@gmail.com> wrote:
>>>>>>>> 
>>>>>>>>> Hi.  I know this is an old thread, but I'm curious if there are any
>>>>>>>>> tutorials describing how to set this up?  Is this still available on
>>>>>>>>> newer open mpi versions?
>>>>>>>>> 
>>>>>>>>> Thanks,
>>>>>>>>> Brian
>>>>>>>>> 
>>>>>>>>> On Fri, Jan 4, 2008 at 7:57 AM, Ralph Castain <r...@lanl.gov> wrote:
>>>>>>>>>> Hi Elena
>>>>>>>>>> 
>>>>>>>>>> I'm copying this to the user list just to correct a mis-statement on my 
>>>>>>>>>> part in an earlier message that went there. I had stated that a singleton 
>>>>>>>>>> could comm_spawn onto other nodes listed in a hostfile by setting an 
>>>>>>>>>> environment variable that pointed us to the hostfile.
>>>>>>>>>> 
>>>>>>>>>> This is incorrect in the 1.2 code series. That series does not allow
>>>>>>>>>> singletons to read a hostfile at all. Hence, any comm_spawn done by a
>>>>>>>>>> singleton can only launch child processes on the singleton's local 
>>>>>>>>>> host.
>>>>>>>>>> 
>>>>>>>>>> This situation has been corrected for the upcoming 1.3 code series. 
>>>>>>>>>> For the
>>>>>>>>>> 1.2 series, though, you will have to do it via an mpirun command 
>>>>>>>>>> line.
>>>>>>>>>> 
>>>>>>>>>> Sorry for the confusion - I sometimes have too many code families to 
>>>>>>>>>> keep
>>>>>>>>>> straight in this old mind!
>>>>>>>>>> 
>>>>>>>>>> Ralph
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> On 1/4/08 5:10 AM, "Elena Zhebel" <ezhe...@fugro-jason.com> wrote:
>>>>>>>>>> 
>>>>>>>>>>> Hello Ralph,
>>>>>>>>>>> 
>>>>>>>>>>> Thank you very much for the explanations.
>>>>>>>>>>> But I still do not get it running...
>>>>>>>>>>> 
>>>>>>>>>>> For the case
>>>>>>>>>>> mpirun -n 1 -hostfile my_hostfile -host my_master_host my_master.exe
>>>>>>>>>>> everything works.
>>>>>>>>>>> 
>>>>>>>>>>> For the case
>>>>>>>>>>> ./my_master.exe
>>>>>>>>>>> it does not.
>>>>>>>>>>> 
>>>>>>>>>>> I did:
>>>>>>>>>>> - create my_hostfile and put it in the $HOME/.openmpi/components/
>>>>>>>>>>> my_hostfile :
>>>>>>>>>>> bollenstreek slots=2 max_slots=3
>>>>>>>>>>> octocore01 slots=8  max_slots=8
>>>>>>>>>>> octocore02 slots=8  max_slots=8
>>>>>>>>>>> clstr000 slots=2 max_slots=3
>>>>>>>>>>> clstr001 slots=2 max_slots=3
>>>>>>>>>>> clstr002 slots=2 max_slots=3
>>>>>>>>>>> clstr003 slots=2 max_slots=3
>>>>>>>>>>> clstr004 slots=2 max_slots=3
>>>>>>>>>>> clstr005 slots=2 max_slots=3
>>>>>>>>>>> clstr006 slots=2 max_slots=3
>>>>>>>>>>> clstr007 slots=2 max_slots=3
>>>>>>>>>>> - setenv OMPI_MCA_rds_hostfile_path my_hostfile (I put it in .tcshrc 
>>>>>>>>>>> and then sourced .tcshrc)
>>>>>>>>>>> - in my_master.cpp I did
>>>>>>>>>>> MPI_Info info1;
>>>>>>>>>>> MPI_Info_create(&info1);
>>>>>>>>>>> char* hostname =
>>>>>>>>>>> "clstr002,clstr003,clstr005,clstr006,clstr007,octocore01,octocore02";
>>>>>>>>>>> MPI_Info_set(info1, "host", hostname);
>>>>>>>>>>> 
>>>>>>>>>>> _intercomm = intracomm.Spawn("./childexe", argv1, _nProc, info1, 0,
>>>>>>>>>>> MPI_ERRCODES_IGNORE);
>>>>>>>>>>> 
>>>>>>>>>>> - After I call the executable, I get this error message:
>>>>>>>>>>> 
>>>>>>>>>>> bollenstreek: > ./my_master
>>>>>>>>>>> number of processes to run: 1
>>>>>>>>>>> --------------------------------------------------------------------------
>>>>>>>>>>> Some of the requested hosts are not included in the current 
>>>>>>>>>>> allocation for
>>>>>>>>>>> the application:
>>>>>>>>>>> ./childexe
>>>>>>>>>>> The requested hosts were:
>>>>>>>>>>> clstr002,clstr003,clstr005,clstr006,clstr007,octocore01,octocore02
>>>>>>>>>>> 
>>>>>>>>>>> Verify that you have mapped the allocated resources properly using 
>>>>>>>>>>> the
>>>>>>>>>>> --host specification.
>>>>>>>>>>> --------------------------------------------------------------------------
>>>>>>>>>>> [bollenstreek:21443] [0,0,0] ORTE_ERROR_LOG: Out of resource in file
>>>>>>>>>>> base/rmaps_base_support_fns.c at line 225
>>>>>>>>>>> [bollenstreek:21443] [0,0,0] ORTE_ERROR_LOG: Out of resource in file
>>>>>>>>>>> rmaps_rr.c at line 478
>>>>>>>>>>> [bollenstreek:21443] [0,0,0] ORTE_ERROR_LOG: Out of resource in file
>>>>>>>>>>> base/rmaps_base_map_job.c at line 210
>>>>>>>>>>> [bollenstreek:21443] [0,0,0] ORTE_ERROR_LOG: Out of resource in file
>>>>>>>>>>> rmgr_urm.c at line 372
>>>>>>>>>>> [bollenstreek:21443] [0,0,0] ORTE_ERROR_LOG: Out of resource in file
>>>>>>>>>>> communicator/comm_dyn.c at line 608
>>>>>>>>>>> 
>>>>>>>>>>> Did I miss something?
>>>>>>>>>>> Thanks for help!
>>>>>>>>>>> 
>>>>>>>>>>> Elena
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>> From: Ralph H Castain [mailto:r...@lanl.gov]
>>>>>>>>>>> Sent: Tuesday, December 18, 2007 3:50 PM
>>>>>>>>>>> To: Elena Zhebel; Open MPI Users <us...@open-mpi.org>
>>>>>>>>>>> Cc: Ralph H Castain
>>>>>>>>>>> Subject: Re: [OMPI users] MPI::Intracomm::Spawn and cluster 
>>>>>>>>>>> configuration
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> On 12/18/07 7:35 AM, "Elena Zhebel" <ezhe...@fugro-jason.com> wrote:
>>>>>>>>>>> 
>>>>>>>>>>>> Thanks a lot! Now it works!
>>>>>>>>>>>> The solution is to use mpirun -n 1 -hostfile my.hosts *.exe and pass 
>>>>>>>>>>>> an MPI_Info key to the Spawn function!
>>>>>>>>>>>> 
>>>>>>>>>>>> One more question: is it necessary to start my "master" program 
>>>>>>>>>>>> with
>>>>>>>>>>>> mpirun -n 1 -hostfile my_hostfile -host my_master_host 
>>>>>>>>>>>> my_master.exe ?
>>>>>>>>>>> 
>>>>>>>>>>> No, it isn't necessary - assuming that my_master_host is the first 
>>>>>>>>>>> host
>>>>>>>>>>> listed in your hostfile! If you are only executing one 
>>>>>>>>>>> my_master.exe (i.e.,
>>>>>>>>>>> you gave -n 1 to mpirun), then we will automatically map that 
>>>>>>>>>>> process onto
>>>>>>>>>>> the first host in your hostfile.
>>>>>>>>>>> 
>>>>>>>>>>> If you want my_master.exe to go on someone other than the first 
>>>>>>>>>>> host in the
>>>>>>>>>>> file, then you have to give us the -host option.
>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> Are there other possibilities for an easy start?
>>>>>>>>>>>> I would say just run ./my_master.exe, but then the master process 
>>>>>>>>>>>> doesn't know about the hosts available on the network.
>>>>>>>>>>> 
>>>>>>>>>>> You can set the hostfile parameter in your environment instead of 
>>>>>>>>>>> on the
>>>>>>>>>>> command line. Just set OMPI_MCA_rds_hostfile_path = my.hosts.
>>>>>>>>>>> 
>>>>>>>>>>> You can then just run ./my_master.exe on the host where you want 
>>>>>>>>>>> the master
>>>>>>>>>>> to reside - everything should work the same.
>>>>>>>>>>> 
>>>>>>>>>>> Just as an FYI: the name of that environment variable is going to 
>>>>>>>>>>> change in the 1.3 release, but everything will still work the same.
>>>>>>>>>>> 
>>>>>>>>>>> Hope that helps
>>>>>>>>>>> Ralph
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> Thanks and regards,
>>>>>>>>>>>> Elena
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>> From: Ralph H Castain [mailto:r...@lanl.gov]
>>>>>>>>>>>> Sent: Monday, December 17, 2007 5:49 PM
>>>>>>>>>>>> To: Open MPI Users <us...@open-mpi.org>; Elena Zhebel
>>>>>>>>>>>> Cc: Ralph H Castain
>>>>>>>>>>>> Subject: Re: [OMPI users] MPI::Intracomm::Spawn and cluster 
>>>>>>>>>>>> configuration
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> On 12/17/07 8:19 AM, "Elena Zhebel" <ezhe...@fugro-jason.com> 
>>>>>>>>>>>> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>>> Hello Ralph,
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Thank you for your answer.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> I'm using OpenMPI 1.2.3, compiler glibc232, Linux Suse 10.0.
>>>>>>>>>>>>> My "master" executable runs only on the local host, and it then spawns 
>>>>>>>>>>>>> "slaves" (with MPI::Intracomm::Spawn).
>>>>>>>>>>>>> My question was: how do I determine the hosts on which these "slaves" 
>>>>>>>>>>>>> will be spawned?
>>>>>>>>>>>>> You said: "You have to specify all of the hosts that can be used by 
>>>>>>>>>>>>> your job in the original hostfile". How can I specify the hostfile? I 
>>>>>>>>>>>>> cannot find it in the documentation.
>>>>>>>>>>>> 
>>>>>>>>>>>> Hmmm...sorry about the lack of documentation. I always assumed 
>>>>>>>>>>>> that the MPI
>>>>>>>>>>>> folks in the project would document such things since it has 
>>>>>>>>>>>> little to do
>>>>>>>>>>>> with the underlying run-time, but I guess that fell through the 
>>>>>>>>>>>> cracks.
>>>>>>>>>>>> 
>>>>>>>>>>>> There are two parts to your question:
>>>>>>>>>>>> 
>>>>>>>>>>>> 1. how to specify the hosts to be used for the entire job. I believe 
>>>>>>>>>>>> that is somewhat covered here:
>>>>>>>>>>>> http://www.open-mpi.org/faq/?category=running#simple-spmd-run
>>>>>>>>>>>> 
>>>>>>>>>>>> That FAQ tells you what a hostfile should look like, though you 
>>>>>>>>>>>> may already
>>>>>>>>>>>> know that. Basically, we require that you list -all- of the nodes 
>>>>>>>>>>>> that both
>>>>>>>>>>>> your master and slave programs will use.
>>>>>>>>>>>> 
>>>>>>>>>>>> 2. how to specify which nodes are available for the master, and 
>>>>>>>>>>>> which for
>>>>>>>>>>>> the slave.
>>>>>>>>>>>> 
>>>>>>>>>>>> You would specify the host for your master on the mpirun command 
>>>>>>>>>>>> line with
>>>>>>>>>>>> something like:
>>>>>>>>>>>> 
>>>>>>>>>>>> mpirun -n 1 -hostfile my_hostfile -host my_master_host 
>>>>>>>>>>>> my_master.exe
>>>>>>>>>>>> 
>>>>>>>>>>>> This directs Open MPI to map that specified executable on the 
>>>>>>>>>>>> specified host - note that my_master_host must have been in 
>>>>>>>>>>>> my_hostfile.
>>>>>>>>>>>> 
>>>>>>>>>>>> Inside your master, you would create an MPI_Info key "host" that has 
>>>>>>>>>>>> a value consisting of a string "host1,host2,host3" identifying the 
>>>>>>>>>>>> hosts you want your slave to execute upon. Those hosts must have been 
>>>>>>>>>>>> included in my_hostfile. Include that key in the MPI_Info array passed 
>>>>>>>>>>>> to your Spawn.
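>>>>>>>>>>>> 
>>>>>>>>>>>> In the C bindings that looks roughly like the sketch below - the host 
>>>>>>>>>>>> names and the child count of 3 are just placeholders:
>>>>>>>>>>>> 
>>>>>>>>>>>> #include <mpi.h>
>>>>>>>>>>>> 
>>>>>>>>>>>> int main(int argc, char **argv) {
>>>>>>>>>>>>     MPI_Init(&argc, &argv);
>>>>>>>>>>>> 
>>>>>>>>>>>>     MPI_Info info;
>>>>>>>>>>>>     MPI_Info_create(&info);
>>>>>>>>>>>>     // Comma-separated hosts the slaves may run on; every name must
>>>>>>>>>>>>     // also appear in the hostfile handed to the job.
>>>>>>>>>>>>     char key[] = "host";
>>>>>>>>>>>>     char val[] = "host1,host2,host3";
>>>>>>>>>>>>     MPI_Info_set(info, key, val);
>>>>>>>>>>>> 
>>>>>>>>>>>>     MPI_Comm slaves;
>>>>>>>>>>>>     char cmd[] = "./childexe";
>>>>>>>>>>>>     MPI_Comm_spawn(cmd, MPI_ARGV_NULL, 3, info, 0, MPI_COMM_SELF,
>>>>>>>>>>>>                    &slaves, MPI_ERRCODES_IGNORE);
>>>>>>>>>>>>     MPI_Info_free(&info);
>>>>>>>>>>>> 
>>>>>>>>>>>>     MPI_Finalize();
>>>>>>>>>>>>     return 0;
>>>>>>>>>>>> }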
>>>>>>>>>>>> 
>>>>>>>>>>>> We don't currently support providing a hostfile for the slaves (as 
>>>>>>>>>>>> opposed
>>>>>>>>>>>> to the host-at-a-time string above). This may become available in 
>>>>>>>>>>>> a future
>>>>>>>>>>>> release - TBD.
>>>>>>>>>>>> 
>>>>>>>>>>>> Hope that helps
>>>>>>>>>>>> Ralph
>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Thanks and regards,
>>>>>>>>>>>>> Elena
>>>>>>>>>>>>> 
>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>> From: users-boun...@open-mpi.org 
>>>>>>>>>>>>> [mailto:users-boun...@open-mpi.org] On
>>>>>>>>>>>>> Behalf Of Ralph H Castain
>>>>>>>>>>>>> Sent: Monday, December 17, 2007 3:31 PM
>>>>>>>>>>>>> To: Open MPI Users <us...@open-mpi.org>
>>>>>>>>>>>>> Cc: Ralph H Castain
>>>>>>>>>>>>> Subject: Re: [OMPI users] MPI::Intracomm::Spawn and cluster
>>>>>>>>>>>>> configuration
>>>>>>>>>>>>> 
>>>>>>>>>>>>> On 12/12/07 5:46 AM, "Elena Zhebel" <ezhe...@fugro-jason.com> 
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> I'm working on an MPI application where I'm using OpenMPI instead of
>>>>>>>>>>>>>> MPICH.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> In my "master" program I call the function MPI::Intracomm::Spawn, which
>>>>>>>>>>>>>> spawns "slave" processes. It is not clear to me how to spawn the "slave"
>>>>>>>>>>>>>> processes over the network. Currently the "master" creates the "slaves"
>>>>>>>>>>>>>> on the same host.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> If I use 'mpirun --hostfile openmpi.hosts' then processes are spawned
>>>>>>>>>>>>>> over the network as expected. But now I need to spawn processes over
>>>>>>>>>>>>>> the network from my own executable using MPI::Intracomm::Spawn - how
>>>>>>>>>>>>>> can I achieve it?
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> I'm not sure from your description exactly what you are trying to 
>>>>>>>>>>>>> do,
>>>>>>>>>>>>> nor in
>>>>>>>>>>>>> what environment this is all operating within or what version of 
>>>>>>>>>>>>> Open
>>>>>>>>>>>>> MPI
>>>>>>>>>>>>> you are using. Setting aside the environment and version issue, 
>>>>>>>>>>>>> I'm
>>>>>>>>>>>>> guessing
>>>>>>>>>>>>> that you are running your executable over some specified set of 
>>>>>>>>>>>>> hosts,
>>>>>>>>>>>>> but
>>>>>>>>>>>>> want to provide a different hostfile that specifies the hosts to 
>>>>>>>>>>>>> be
>>>>>>>>>>>>> used for
>>>>>>>>>>>>> the "slave" processes. Correct?
>>>>>>>>>>>>> 
>>>>>>>>>>>>> If that is correct, then I'm afraid you can't do that in any 
>>>>>>>>>>>>> version
>>>>>>>>>>>>> of Open
>>>>>>>>>>>>> MPI today. You have to specify all of the hosts that can be used 
>>>>>>>>>>>>> by
>>>>>>>>>>>>> your job
>>>>>>>>>>>>> in the original hostfile. You can then specify a subset of those 
>>>>>>>>>>>>> hosts
>>>>>>>>>>>>> to be
>>>>>>>>>>>>> used by your original "master" program, and then specify a 
>>>>>>>>>>>>> different
>>>>>>>>>>>>> subset
>>>>>>>>>>>>> to be used by the "slaves" when calling Spawn.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> But the system requires that you tell it -all- of the hosts that 
>>>>>>>>>>>>> are
>>>>>>>>>>>>> going
>>>>>>>>>>>>> to be used at the beginning of the job.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> At the moment, there is no plan to remove that requirement, though
>>>>>>>>>>>>> there has
>>>>>>>>>>>>> been occasional discussion about doing so at some point in the 
>>>>>>>>>>>>> future.
>>>>>>>>>>>>> No
>>>>>>>>>>>>> promises that it will happen, though - managed environments, in
>>>>>>>>>>>>> particular,
>>>>>>>>>>>>> currently object to the idea of changing the allocation 
>>>>>>>>>>>>> on-the-fly. We
>>>>>>>>>>>>> may,
>>>>>>>>>>>>> though, make a provision for purely hostfile-based environments 
>>>>>>>>>>>>> (i.e.,
>>>>>>>>>>>>> unmanaged) at some time in the future.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Ralph
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Thanks in advance for any help.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Elena
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 