>cat hostsfile
localhost
budgeb-sandybridge
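
For what it's worth, my understanding is that the hostfile lines can also
carry explicit slot counts (same syntax as in the Open MPI FAQ), e.g.:

localhost slots=2
budgeb-sandybridge slots=8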

Thanks,
  Brian

On Tue, Aug 28, 2012 at 2:36 PM, Ralph Castain <r...@open-mpi.org> wrote:
> Hmmm...what is in your "hostsfile"?
>
> On Aug 28, 2012, at 2:33 PM, Brian Budge <brian.bu...@gmail.com> wrote:
>
>> Hi Ralph -
>>
>> Thanks for confirming this is possible.  I'm trying this and currently
>> failing.  Perhaps there's something I'm missing in the code to make
>> this work.  Here are the two instantiations and their outputs:
>>
>>> LD_LIBRARY_PATH=/home/budgeb/p4/pseb/external/lib.dev:/usr/local/lib 
>>> OMPI_MCA_orte_default_hostfile=`pwd`/hostsfile ./master_exe
>> cannot start slaves... not enough nodes
>>
>>> LD_LIBRARY_PATH=/home/budgeb/p4/pseb/external/lib.dev:/usr/local/lib 
>>> OMPI_MCA_orte_default_hostfile=`pwd`/hostsfile mpirun -n 1 ./master_exe
>> master spawned 1 slaves...
>> slave responding...
>>
>>
>> The code:
>>
>> //master.cpp
>> #include <mpi.h>
>> #include <boost/filesystem.hpp>
>> #include <iostream>
>> #include <cstring>   // memcpy
>> #include <alloca.h>  // alloca
>>
>> int main(int argc, char **args) {
>>    int worldSize, universeSize, *puniverseSize, flag;
>>
>>    MPI_Comm everyone; //intercomm
>>    boost::filesystem::path curPath =
>> boost::filesystem::absolute(boost::filesystem::current_path());
>>
>>    std::string toRun = (curPath / "slave_exe").string();
>>
>>    int ret = MPI_Init(&argc, &args);
>>
>>    if(ret != MPI_SUCCESS) {
>>        std::cerr << "failed init" << std::endl;
>>        return -1;
>>    }
>>
>>    MPI_Comm_size(MPI_COMM_WORLD, &worldSize);
>>
>>    if(worldSize != 1) {
>>        std::cerr << "too many masters" << std::endl;
>>    }
>>
>>    MPI_Attr_get(MPI_COMM_WORLD, MPI_UNIVERSE_SIZE, &puniverseSize, &flag);
>>
>>    if(!flag) {
>>        std::cerr << "no universe size" << std::endl;
>>        return -1;
>>    }
>>    universeSize = *puniverseSize;
>>    if(universeSize == 1) {
>>        std::cerr << "cannot start slaves... not enough nodes" << std::endl;
>>        MPI_Finalize();
>>        return -1;
>>    }
>>
>>
>>    char *buf = (char*)alloca(toRun.size() + 1);
>>    memcpy(buf, toRun.c_str(), toRun.size());
>>    buf[toRun.size()] = '\0';
>>
>>    MPI_Comm_spawn(buf, MPI_ARGV_NULL, universeSize-1, MPI_INFO_NULL,
>> 0, MPI_COMM_SELF, &everyone,
>>                   MPI_ERRCODES_IGNORE);
>>
>>    std::cerr << "master spawned " << universeSize-1 << " slaves..."
>> << std::endl;
>>
>>    MPI_Finalize();
>>
>>   return 0;
>> }
>>
>>
>> //slave.cpp
>> #include <mpi.h>
>> #include <iostream> // std::cerr
>>
>> int main(int argc, char **args) {
>>    int size;
>>    MPI_Comm parent;
>>    MPI_Init(&argc, &args);
>>
>>    MPI_Comm_get_parent(&parent);
>>
>>    if(parent == MPI_COMM_NULL) {
>>        std::cerr << "slave has no parent" << std::endl;
>>        MPI_Finalize();
>>        return -1;
>>    }
>>    MPI_Comm_remote_size(parent, &size);
>>    if(size != 1) {
>>        std::cerr << "parent size is " << size << std::endl;
>>    }
>>
>>    std::cerr << "slave responding..." << std::endl;
>>
>>    MPI_Finalize();
>>
>>    return 0;
>> }
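>>
>> A typical way to build and run these with Open MPI's wrapper compiler
>> would be something like the following (the Boost link flags are an
>> assumption about the local install):
>>
>>   mpicxx master.cpp -o master_exe -lboost_filesystem -lboost_system
>>   mpicxx slave.cpp -o slave_exe
>>   OMPI_MCA_orte_default_hostfile=`pwd`/hostsfile ./master_exe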
>>
>>
>> Any ideas?  Thanks for any help.
>>
>>  Brian
>>
>> On Wed, Aug 22, 2012 at 9:03 AM, Ralph Castain <r...@open-mpi.org> wrote:
>>> It really is just that simple :-)
>>>
>>> On Aug 22, 2012, at 8:56 AM, Brian Budge <brian.bu...@gmail.com> wrote:
>>>
>>>> Okay.  Is there a tutorial or FAQ for setting everything up?  Or is it
>>>> really just that simple?  I don't need to run a copy of the orte
>>>> server somewhere?
>>>>
>>>> if my current ip is 192.168.0.1,
>>>>
>>>> 0 > echo 192.168.0.11 > /tmp/hostfile
>>>> 1 > echo 192.168.0.12 >> /tmp/hostfile
>>>> 2 > export OMPI_MCA_orte_default_hostfile=/tmp/hostfile
>>>> 3 > ./mySpawningExe
>>>>
>>>> At this point, mySpawningExe will be the master, running on
>>>> 192.168.0.1, and I can have spawned, for example, childExe on
>>>> 192.168.0.11 and 192.168.0.12?  Or childExe1 on 192.168.0.11 and
>>>> childExe2 on 192.168.0.12?
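>>>>
>>>> For the two-different-executables case I'm picturing something like this
>>>> sketch with MPI_Comm_spawn_multiple (childExe1/childExe2 and the hosts
>>>> are placeholders; I haven't verified the info handling):
>>>>
>>>>    // spawn two different executables, each pinned to a host via the
>>>>    // "host" info key (hosts must be listed in the hostfile)
>>>>    char *cmds[2]  = { (char*)"childExe1", (char*)"childExe2" };
>>>>    int   procs[2] = { 1, 1 };
>>>>    MPI_Info infos[2];
>>>>    MPI_Info_create(&infos[0]);
>>>>    MPI_Info_set(infos[0], "host", "192.168.0.11");
>>>>    MPI_Info_create(&infos[1]);
>>>>    MPI_Info_set(infos[1], "host", "192.168.0.12");
>>>>    MPI_Comm children;
>>>>    MPI_Comm_spawn_multiple(2, cmds, MPI_ARGVS_NULL, procs, infos, 0,
>>>>                            MPI_COMM_SELF, &children, MPI_ERRCODES_IGNORE);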
>>>>
>>>> Thanks for the help.
>>>>
>>>> Brian
>>>>
>>>> On Wed, Aug 22, 2012 at 7:15 AM, Ralph Castain <r...@open-mpi.org> wrote:
>>>>> Sure, that's still true on all 1.3 or above releases. All you need to do 
>>>>> is set the hostfile envar so we pick it up:
>>>>>
>>>>> OMPI_MCA_orte_default_hostfile=<foo>
>>>>>
>>>>>
>>>>> On Aug 21, 2012, at 7:23 PM, Brian Budge <brian.bu...@gmail.com> wrote:
>>>>>
>>>>>> Hi.  I know this is an old thread, but I'm curious if there are any
>>>>>> tutorials describing how to set this up?  Is this still available on
>>>>>> newer open mpi versions?
>>>>>>
>>>>>> Thanks,
>>>>>> Brian
>>>>>>
>>>>>> On Fri, Jan 4, 2008 at 7:57 AM, Ralph Castain <r...@lanl.gov> wrote:
>>>>>>> Hi Elena
>>>>>>>
>>>>>>> I'm copying this to the user list just to correct a mis-statement on my 
>>>>>>> part
>>>>>>> in an earlier message that went there. I had stated that a singleton 
>>>>>>> could
>>>>>>> comm_spawn onto other nodes listed in a hostfile by setting an 
>>>>>>> environmental
>>>>>>> variable that pointed us to the hostfile.
>>>>>>>
>>>>>>> This is incorrect in the 1.2 code series. That series does not allow
>>>>>>> singletons to read a hostfile at all. Hence, any comm_spawn done by a
>>>>>>> singleton can only launch child processes on the singleton's local host.
>>>>>>>
>>>>>>> This situation has been corrected for the upcoming 1.3 code series. For 
>>>>>>> the
>>>>>>> 1.2 series, though, you will have to do it via an mpirun command line.
>>>>>>>
>>>>>>> Sorry for the confusion - I sometimes have too many code families to 
>>>>>>> keep
>>>>>>> straight in this old mind!
>>>>>>>
>>>>>>> Ralph
>>>>>>>
>>>>>>>
>>>>>>> On 1/4/08 5:10 AM, "Elena Zhebel" <ezhe...@fugro-jason.com> wrote:
>>>>>>>
>>>>>>>> Hello Ralph,
>>>>>>>>
>>>>>>>> Thank you very much for the explanations.
>>>>>>>> But I still do not get it running...
>>>>>>>>
>>>>>>>> For the case
>>>>>>>> mpirun -n 1 -hostfile my_hostfile -host my_master_host my_master.exe
>>>>>>>> everything works.
>>>>>>>>
>>>>>>>> For the case
>>>>>>>> ./my_master.exe
>>>>>>>> it does not.
>>>>>>>>
>>>>>>>> I did:
>>>>>>>> - create my_hostfile and put it in the $HOME/.openmpi/components/
>>>>>>>> my_hostfile :
>>>>>>>> bollenstreek slots=2 max_slots=3
>>>>>>>> octocore01 slots=8  max_slots=8
>>>>>>>> octocore02 slots=8  max_slots=8
>>>>>>>> clstr000 slots=2 max_slots=3
>>>>>>>> clstr001 slots=2 max_slots=3
>>>>>>>> clstr002 slots=2 max_slots=3
>>>>>>>> clstr003 slots=2 max_slots=3
>>>>>>>> clstr004 slots=2 max_slots=3
>>>>>>>> clstr005 slots=2 max_slots=3
>>>>>>>> clstr006 slots=2 max_slots=3
>>>>>>>> clstr007 slots=2 max_slots=3
>>>>>>>> - setenv OMPI_MCA_rds_hostfile_path my_hostfile (I  put it in .tcshrc 
>>>>>>>> and
>>>>>>>> then source .tcshrc)
>>>>>>>> - in my_master.cpp I did
>>>>>>>> MPI_Info info1;
>>>>>>>> MPI_Info_create(&info1);
>>>>>>>> char* hostname =
>>>>>>>> "clstr002,clstr003,clstr005,clstr006,clstr007,octocore01,octocore02";
>>>>>>>> MPI_Info_set(info1, "host", hostname);
>>>>>>>>
>>>>>>>> _intercomm = intracomm.Spawn("./childexe", argv1, _nProc, info1, 0,
>>>>>>>> MPI_ERRCODES_IGNORE);
>>>>>>>>
>>>>>>>> - After I call the executable, I've got this error message
>>>>>>>>
>>>>>>>> bollenstreek: > ./my_master
>>>>>>>> number of processes to run: 1
>>>>>>>> --------------------------------------------------------------------------
>>>>>>>> Some of the requested hosts are not included in the current allocation 
>>>>>>>> for
>>>>>>>> the application:
>>>>>>>> ./childexe
>>>>>>>> The requested hosts were:
>>>>>>>> clstr002,clstr003,clstr005,clstr006,clstr007,octocore01,octocore02
>>>>>>>>
>>>>>>>> Verify that you have mapped the allocated resources properly using the
>>>>>>>> --host specification.
>>>>>>>> --------------------------------------------------------------------------
>>>>>>>> [bollenstreek:21443] [0,0,0] ORTE_ERROR_LOG: Out of resource in file
>>>>>>>> base/rmaps_base_support_fns.c at line 225
>>>>>>>> [bollenstreek:21443] [0,0,0] ORTE_ERROR_LOG: Out of resource in file
>>>>>>>> rmaps_rr.c at line 478
>>>>>>>> [bollenstreek:21443] [0,0,0] ORTE_ERROR_LOG: Out of resource in file
>>>>>>>> base/rmaps_base_map_job.c at line 210
>>>>>>>> [bollenstreek:21443] [0,0,0] ORTE_ERROR_LOG: Out of resource in file
>>>>>>>> rmgr_urm.c at line 372
>>>>>>>> [bollenstreek:21443] [0,0,0] ORTE_ERROR_LOG: Out of resource in file
>>>>>>>> communicator/comm_dyn.c at line 608
>>>>>>>>
>>>>>>>> Did I miss something?
>>>>>>>> Thanks for help!
>>>>>>>>
>>>>>>>> Elena
>>>>>>>>
>>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Ralph H Castain [mailto:r...@lanl.gov]
>>>>>>>> Sent: Tuesday, December 18, 2007 3:50 PM
>>>>>>>> To: Elena Zhebel; Open MPI Users <us...@open-mpi.org>
>>>>>>>> Cc: Ralph H Castain
>>>>>>>> Subject: Re: [OMPI users] MPI::Intracomm::Spawn and cluster 
>>>>>>>> configuration
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 12/18/07 7:35 AM, "Elena Zhebel" <ezhe...@fugro-jason.com> wrote:
>>>>>>>>
>>>>>>>>> Thanks a lot! Now it works!
>>>>>>>>> The solution is to use mpirun -n 1 -hostfile my.hosts *.exe and pass
>>>>>>>> MPI_Info
>>>>>>>>> Key to the Spawn function!
>>>>>>>>>
>>>>>>>>> One more question: is it necessary to start my "master" program with
>>>>>>>>> mpirun -n 1 -hostfile my_hostfile -host my_master_host my_master.exe ?
>>>>>>>>
>>>>>>>> No, it isn't necessary - assuming that my_master_host is the first host
>>>>>>>> listed in your hostfile! If you are only executing one my_master.exe 
>>>>>>>> (i.e.,
>>>>>>>> you gave -n 1 to mpirun), then we will automatically map that process 
>>>>>>>> onto
>>>>>>>> the first host in your hostfile.
>>>>>>>>
>>>>>>>> If you want my_master.exe to go on someone other than the first host 
>>>>>>>> in the
>>>>>>>> file, then you have to give us the -host option.
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Are there other possibilities for easy start?
>>>>>>>>> I would say just run ./my_master.exe, but then the master process
>>>>>>>>> doesn't know about the hosts available on the network.
>>>>>>>>
>>>>>>>> You can set the hostfile parameter in your environment instead of on 
>>>>>>>> the
>>>>>>>> command line. Just set OMPI_MCA_rds_hostfile_path = my.hosts.
>>>>>>>>
>>>>>>>> You can then just run ./my_master.exe on the host where you want the 
>>>>>>>> master
>>>>>>>> to reside - everything should work the same.
>>>>>>>>
>>>>>>>> Just as an FYI: the name of that environmental variable is going to 
>>>>>>>> change
>>>>>>>> in the 1.3 release, but everything will still work the same.
>>>>>>>>
>>>>>>>> Hope that helps
>>>>>>>> Ralph
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks and regards,
>>>>>>>>> Elena
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Ralph H Castain [mailto:r...@lanl.gov]
>>>>>>>>> Sent: Monday, December 17, 2007 5:49 PM
>>>>>>>>> To: Open MPI Users <us...@open-mpi.org>; Elena Zhebel
>>>>>>>>> Cc: Ralph H Castain
>>>>>>>>> Subject: Re: [OMPI users] MPI::Intracomm::Spawn and cluster 
>>>>>>>>> configuration
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 12/17/07 8:19 AM, "Elena Zhebel" <ezhe...@fugro-jason.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hello Ralph,
>>>>>>>>>>
>>>>>>>>>> Thank you for your answer.
>>>>>>>>>>
>>>>>>>>>> I'm using OpenMPI 1.2.3. , compiler glibc232, Linux Suse 10.0.
>>>>>>>>>> My "master" executable runs only on the one local host, then it 
>>>>>>>>>> spawns
>>>>>>>>>> "slaves" (with MPI::Intracomm::Spawn).
>>>>>>>>>> My question was: how to determine the hosts where these "slaves" 
>>>>>>>>>> will be
>>>>>>>>>> spawned?
>>>>>>>>>> You said: "You have to specify all of the hosts that can be used by
>>>>>>>>>> your job
>>>>>>>>>> in the original hostfile". How can I specify the host file? I can not
>>>>>>>>>> find it
>>>>>>>>>> in the documentation.
>>>>>>>>>
>>>>>>>>> Hmmm...sorry about the lack of documentation. I always assumed that 
>>>>>>>>> the MPI
>>>>>>>>> folks in the project would document such things since it has little 
>>>>>>>>> to do
>>>>>>>>> with the underlying run-time, but I guess that fell through the 
>>>>>>>>> cracks.
>>>>>>>>>
>>>>>>>>> There are two parts to your question:
>>>>>>>>>
>>>>>>>>> 1. how to specify the hosts to be used for the entire job. I believe 
>>>>>>>>> that
>>>>>>>> is
>>>>>>>>> somewhat covered here:
>>>>>>>>> http://www.open-mpi.org/faq/?category=running#simple-spmd-run
>>>>>>>>>
>>>>>>>>> That FAQ tells you what a hostfile should look like, though you may 
>>>>>>>>> already
>>>>>>>>> know that. Basically, we require that you list -all- of the nodes 
>>>>>>>>> that both
>>>>>>>>> your master and slave programs will use.
>>>>>>>>>
>>>>>>>>> 2. how to specify which nodes are available for the master, and which 
>>>>>>>>> for
>>>>>>>>> the slave.
>>>>>>>>>
>>>>>>>>> You would specify the host for your master on the mpirun command line 
>>>>>>>>> with
>>>>>>>>> something like:
>>>>>>>>>
>>>>>>>>> mpirun -n 1 -hostfile my_hostfile -host my_master_host my_master.exe
>>>>>>>>>
>>>>>>>>> This directs Open MPI to map that specified executable on the 
>>>>>>>>> specified
>>>>>>>> host
>>>>>>>>> - note that my_master_host must have been in my_hostfile.
>>>>>>>>>
>>>>>>>>> Inside your master, you would create an MPI_Info key "host" that has a
>>>>>>>> value
>>>>>>>>> consisting of a string "host1,host2,host3" identifying the hosts you 
>>>>>>>>> want
>>>>>>>>> your slave to execute upon. Those hosts must have been included in
>>>>>>>>> my_hostfile. Include that key in the MPI_Info array passed to your 
>>>>>>>>> Spawn.
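>>>>>>>>>
>>>>>>>>> In code that looks roughly like this (a sketch only -- the executable
>>>>>>>>> name and hosts are placeholders):
>>>>>>>>>
>>>>>>>>>   MPI_Info info;
>>>>>>>>>   MPI_Info_create(&info);
>>>>>>>>>   MPI_Info_set(info, "host", "host1,host2,host3");
>>>>>>>>>   MPI_Comm slaves;
>>>>>>>>>   MPI_Comm_spawn("./my_slave", MPI_ARGV_NULL, 3, info, 0,
>>>>>>>>>                  MPI_COMM_SELF, &slaves, MPI_ERRCODES_IGNORE);
>>>>>>>>>   MPI_Info_free(&info);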
>>>>>>>>>
>>>>>>>>> We don't currently support providing a hostfile for the slaves (as 
>>>>>>>>> opposed
>>>>>>>>> to the host-at-a-time string above). This may become available in a 
>>>>>>>>> future
>>>>>>>>> release - TBD.
>>>>>>>>>
>>>>>>>>> Hope that helps
>>>>>>>>> Ralph
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks and regards,
>>>>>>>>>> Elena
>>>>>>>>>>
>>>>>>>>>> -----Original Message-----
>>>>>>>>>> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] 
>>>>>>>>>> On
>>>>>>>>>> Behalf Of Ralph H Castain
>>>>>>>>>> Sent: Monday, December 17, 2007 3:31 PM
>>>>>>>>>> To: Open MPI Users <us...@open-mpi.org>
>>>>>>>>>> Cc: Ralph H Castain
>>>>>>>>>> Subject: Re: [OMPI users] MPI::Intracomm::Spawn and cluster
>>>>>>>>>> configuration
>>>>>>>>>>
>>>>>>>>>> On 12/12/07 5:46 AM, "Elena Zhebel" <ezhe...@fugro-jason.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Hello,
>>>>>>>>>>>
>>>>>>>>>>> I'm working on a MPI application where I'm using OpenMPI instead of
>>>>>>>>>>> MPICH.
>>>>>>>>>>>
>>>>>>>>>>> In my "master" program I call the function MPI::Intracomm::Spawn 
>>>>>>>>>>> which
>>>>>>>>>> spawns
>>>>>>>>>>> "slave" processes. It is not clear for me how to spawn the "slave"
>>>>>>>>>> processes
>>>>>>>>>>> over the network. Currently "master" creates "slaves" on the same
>>>>>>>>>>> host.
>>>>>>>>>>>
>>>>>>>>>>> If I use 'mpirun --hostfile openmpi.hosts' then processes are spawn
>>>>>>>>>>> over
>>>>>>>>>> the
>>>>>>>>>>> network as expected. But now I need to spawn processes over the
>>>>>>>>>>> network
>>>>>>>>>> from
>>>>>>>>>>> my own executable using MPI::Intracomm::Spawn, how can I achieve it?
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I'm not sure from your description exactly what you are trying to do,
>>>>>>>>>> nor in
>>>>>>>>>> what environment this is all operating within or what version of Open
>>>>>>>>>> MPI
>>>>>>>>>> you are using. Setting aside the environment and version issue, I'm
>>>>>>>>>> guessing
>>>>>>>>>> that you are running your executable over some specified set of 
>>>>>>>>>> hosts,
>>>>>>>>>> but
>>>>>>>>>> want to provide a different hostfile that specifies the hosts to be
>>>>>>>>>> used for
>>>>>>>>>> the "slave" processes. Correct?
>>>>>>>>>>
>>>>>>>>>> If that is correct, then I'm afraid you can't do that in any version
>>>>>>>>>> of Open
>>>>>>>>>> MPI today. You have to specify all of the hosts that can be used by
>>>>>>>>>> your job
>>>>>>>>>> in the original hostfile. You can then specify a subset of those 
>>>>>>>>>> hosts
>>>>>>>>>> to be
>>>>>>>>>> used by your original "master" program, and then specify a different
>>>>>>>>>> subset
>>>>>>>>>> to be used by the "slaves" when calling Spawn.
>>>>>>>>>>
>>>>>>>>>> But the system requires that you tell it -all- of the hosts that are
>>>>>>>>>> going
>>>>>>>>>> to be used at the beginning of the job.
>>>>>>>>>>
>>>>>>>>>> At the moment, there is no plan to remove that requirement, though
>>>>>>>>>> there has
>>>>>>>>>> been occasional discussion about doing so at some point in the 
>>>>>>>>>> future.
>>>>>>>>>> No
>>>>>>>>>> promises that it will happen, though - managed environments, in
>>>>>>>>>> particular,
>>>>>>>>>> currently object to the idea of changing the allocation on-the-fly. 
>>>>>>>>>> We
>>>>>>>>>> may,
>>>>>>>>>> though, make a provision for purely hostfile-based environments 
>>>>>>>>>> (i.e.,
>>>>>>>>>> unmanaged) at some time in the future.
>>>>>>>>>>
>>>>>>>>>> Ralph
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks in advance for any help.
>>>>>>>>>>>
>>>>>>>>>>> Elena
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>
>>>>>
>>>
>>>
>
>
