Sure, that's still true on all 1.3 or above releases. All you need to do is set the hostfile envar so we pick it up:

OMPI_MCA_orte_default_hostfile=<foo>
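For example, a minimal sketch of that setup (the hostfile path and program name are placeholders, and the export syntax assumes an sh-style shell):

  export OMPI_MCA_orte_default_hostfile=$HOME/my_hostfile
  ./my_master

The hostfile must list every node the master and its spawned children may use. Open MPI picks it up when the singleton starts, so a subsequent comm_spawn can place children on the nodes it lists.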
On Aug 21, 2012, at 7:23 PM, Brian Budge <brian.bu...@gmail.com> wrote:

> Hi.  I know this is an old thread, but I'm curious if there are any
> tutorials describing how to set this up?  Is this still available on
> newer open mpi versions?
>
> Thanks,
>   Brian
>
> On Fri, Jan 4, 2008 at 7:57 AM, Ralph Castain <r...@lanl.gov> wrote:
>> Hi Elena
>>
>> I'm copying this to the user list just to correct a mis-statement on my part
>> in an earlier message that went there. I had stated that a singleton could
>> comm_spawn onto other nodes listed in a hostfile by setting an environmental
>> variable that pointed us to the hostfile.
>>
>> This is incorrect in the 1.2 code series. That series does not allow
>> singletons to read a hostfile at all. Hence, any comm_spawn done by a
>> singleton can only launch child processes on the singleton's local host.
>>
>> This situation has been corrected for the upcoming 1.3 code series. For the
>> 1.2 series, though, you will have to do it via an mpirun command line.
>>
>> Sorry for the confusion - I sometimes have too many code families to keep
>> straight in this old mind!
>>
>> Ralph
>>
>> On 1/4/08 5:10 AM, "Elena Zhebel" <ezhe...@fugro-jason.com> wrote:
>>
>>> Hello Ralph,
>>>
>>> Thank you very much for the explanations.
>>> But I still do not get it running...
>>>
>>> For the case
>>>   mpirun -n 1 -hostfile my_hostfile -host my_master_host my_master.exe
>>> everything works.
>>>
>>> For the case
>>>   ./my_master.exe
>>> it does not.
>>>
>>> I did:
>>> - create my_hostfile and put it in the $HOME/.openmpi/components/
>>>   my_hostfile:
>>>   bollenstreek slots=2 max_slots=3
>>>   octocore01 slots=8 max_slots=8
>>>   octocore02 slots=8 max_slots=8
>>>   clstr000 slots=2 max_slots=3
>>>   clstr001 slots=2 max_slots=3
>>>   clstr002 slots=2 max_slots=3
>>>   clstr003 slots=2 max_slots=3
>>>   clstr004 slots=2 max_slots=3
>>>   clstr005 slots=2 max_slots=3
>>>   clstr006 slots=2 max_slots=3
>>>   clstr007 slots=2 max_slots=3
>>> - setenv OMPI_MCA_rds_hostfile_path my_hostfile (I put it in .tcshrc and
>>>   then source .tcshrc)
>>> - in my_master.cpp I did
>>>   MPI_Info info1;
>>>   MPI_Info_create(&info1);
>>>   char* hostname =
>>>     "clstr002,clstr003,clstr005,clstr006,clstr007,octocore01,octocore02";
>>>   MPI_Info_set(info1, "host", hostname);
>>>
>>>   _intercomm = intracomm.Spawn("./childexe", argv1, _nProc, info1, 0,
>>>     MPI_ERRCODES_IGNORE);
>>>
>>> - After I call the executable, I've got this error message
>>>
>>> bollenstreek: > ./my_master
>>> number of processes to run: 1
>>> --------------------------------------------------------------------------
>>> Some of the requested hosts are not included in the current allocation for
>>> the application:
>>>   ./childexe
>>> The requested hosts were:
>>>   clstr002,clstr003,clstr005,clstr006,clstr007,octocore01,octocore02
>>>
>>> Verify that you have mapped the allocated resources properly using the
>>> --host specification.
>>> --------------------------------------------------------------------------
>>> [bollenstreek:21443] [0,0,0] ORTE_ERROR_LOG: Out of resource in file
>>> base/rmaps_base_support_fns.c at line 225
>>> [bollenstreek:21443] [0,0,0] ORTE_ERROR_LOG: Out of resource in file
>>> rmaps_rr.c at line 478
>>> [bollenstreek:21443] [0,0,0] ORTE_ERROR_LOG: Out of resource in file
>>> base/rmaps_base_map_job.c at line 210
>>> [bollenstreek:21443] [0,0,0] ORTE_ERROR_LOG: Out of resource in file
>>> rmgr_urm.c at line 372
>>> [bollenstreek:21443] [0,0,0] ORTE_ERROR_LOG: Out of resource in file
>>> communicator/comm_dyn.c at line 608
>>>
>>> Did I miss something?
>>> Thanks for help!
>>>
>>> Elena
>>>
>>> -----Original Message-----
>>> From: Ralph H Castain [mailto:r...@lanl.gov]
>>> Sent: Tuesday, December 18, 2007 3:50 PM
>>> To: Elena Zhebel; Open MPI Users <us...@open-mpi.org>
>>> Cc: Ralph H Castain
>>> Subject: Re: [OMPI users] MPI::Intracomm::Spawn and cluster configuration
>>>
>>> On 12/18/07 7:35 AM, "Elena Zhebel" <ezhe...@fugro-jason.com> wrote:
>>>
>>>> Thanks a lot! Now it works!
>>>> The solution is to use mpirun -n 1 -hostfile my.hosts *.exe and pass an
>>>> MPI_Info key to the Spawn function!
>>>>
>>>> One more question: is it necessary to start my "master" program with
>>>> mpirun -n 1 -hostfile my_hostfile -host my_master_host my_master.exe ?
>>>
>>> No, it isn't necessary - assuming that my_master_host is the first host
>>> listed in your hostfile! If you are only executing one my_master.exe (i.e.,
>>> you gave -n 1 to mpirun), then we will automatically map that process onto
>>> the first host in your hostfile.
>>>
>>> If you want my_master.exe to go on someone other than the first host in the
>>> file, then you have to give us the -host option.
>>>
>>>> Are there other possibilities for an easy start?
>>>> I would say just run ./my_master.exe, but then the master process doesn't
>>>> know about the hosts available on the network.
>>>
>>> You can set the hostfile parameter in your environment instead of on the
>>> command line. Just set OMPI_MCA_rds_hostfile_path = my.hosts.
>>>
>>> You can then just run ./my_master.exe on the host where you want the master
>>> to reside - everything should work the same.
>>>
>>> Just as an FYI: the name of that environmental variable is going to change
>>> in the 1.3 release, but everything will still work the same.
>>>
>>> Hope that helps
>>> Ralph
>>>
>>>> Thanks and regards,
>>>> Elena
>>>>
>>>> -----Original Message-----
>>>> From: Ralph H Castain [mailto:r...@lanl.gov]
>>>> Sent: Monday, December 17, 2007 5:49 PM
>>>> To: Open MPI Users <us...@open-mpi.org>; Elena Zhebel
>>>> Cc: Ralph H Castain
>>>> Subject: Re: [OMPI users] MPI::Intracomm::Spawn and cluster configuration
>>>>
>>>> On 12/17/07 8:19 AM, "Elena Zhebel" <ezhe...@fugro-jason.com> wrote:
>>>>
>>>>> Hello Ralph,
>>>>>
>>>>> Thank you for your answer.
>>>>>
>>>>> I'm using OpenMPI 1.2.3, compiler glibc232, Linux Suse 10.0.
>>>>> My "master" executable runs only on the one local host; then it spawns
>>>>> "slaves" (with MPI::Intracomm::Spawn).
>>>>> My question was: how to determine the hosts where these "slaves" will be
>>>>> spawned?
>>>>> You said: "You have to specify all of the hosts that can be used by your
>>>>> job in the original hostfile". How can I specify the host file? I cannot
>>>>> find it in the documentation.
>>>>
>>>> Hmmm...sorry about the lack of documentation.
>>>> I always assumed that the MPI folks in the project would document such
>>>> things since it has little to do with the underlying run-time, but I guess
>>>> that fell through the cracks.
>>>>
>>>> There are two parts to your question:
>>>>
>>>> 1. How to specify the hosts to be used for the entire job. I believe that
>>>> is somewhat covered here:
>>>>
>>>>   http://www.open-mpi.org/faq/?category=running#simple-spmd-run
>>>>
>>>> That FAQ tells you what a hostfile should look like, though you may already
>>>> know that. Basically, we require that you list -all- of the nodes that both
>>>> your master and slave programs will use.
>>>>
>>>> 2. How to specify which nodes are available for the master, and which for
>>>> the slave.
>>>>
>>>> You would specify the host for your master on the mpirun command line with
>>>> something like:
>>>>
>>>>   mpirun -n 1 -hostfile my_hostfile -host my_master_host my_master.exe
>>>>
>>>> This directs Open MPI to map that specified executable onto the specified
>>>> host - note that my_master_host must have been in my_hostfile.
>>>>
>>>> Inside your master, you would create an MPI_Info key "host" that has a
>>>> value consisting of a string "host1,host2,host3" identifying the hosts you
>>>> want your slave to execute upon. Those hosts must have been included in
>>>> my_hostfile. Include that key in the MPI_Info array passed to your Spawn.
>>>>
>>>> We don't currently support providing a hostfile for the slaves (as opposed
>>>> to the host-at-a-time string above). This may become available in a future
>>>> release - TBD.
>>>>
>>>> Hope that helps
>>>> Ralph
>>>>
>>>>> Thanks and regards,
>>>>> Elena
>>>>>
>>>>> -----Original Message-----
>>>>> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On
>>>>> Behalf Of Ralph H Castain
>>>>> Sent: Monday, December 17, 2007 3:31 PM
>>>>> To: Open MPI Users <us...@open-mpi.org>
>>>>> Cc: Ralph H Castain
>>>>> Subject: Re: [OMPI users] MPI::Intracomm::Spawn and cluster configuration
>>>>>
>>>>> On 12/12/07 5:46 AM, "Elena Zhebel" <ezhe...@fugro-jason.com> wrote:
>>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> I'm working on an MPI application where I'm using OpenMPI instead of
>>>>>> MPICH.
>>>>>>
>>>>>> In my "master" program I call the function MPI::Intracomm::Spawn, which
>>>>>> spawns "slave" processes. It is not clear to me how to spawn the "slave"
>>>>>> processes over the network. Currently "master" creates "slaves" on the
>>>>>> same host.
>>>>>>
>>>>>> If I use 'mpirun --hostfile openmpi.hosts' then processes are spawned
>>>>>> over the network as expected. But now I need to spawn processes over the
>>>>>> network from my own executable using MPI::Intracomm::Spawn; how can I
>>>>>> achieve it?
>>>>>
>>>>> I'm not sure from your description exactly what you are trying to do, nor
>>>>> what environment this is all operating within or what version of Open MPI
>>>>> you are using. Setting aside the environment and version issue, I'm
>>>>> guessing that you are running your executable over some specified set of
>>>>> hosts, but want to provide a different hostfile that specifies the hosts
>>>>> to be used for the "slave" processes. Correct?
>>>>>
>>>>> If that is correct, then I'm afraid you can't do that in any version of
>>>>> Open MPI today. You have to specify all of the hosts that can be used by
>>>>> your job in the original hostfile.
>>>>> You can then specify a subset of those hosts to be used by your original
>>>>> "master" program, and then specify a different subset to be used by the
>>>>> "slaves" when calling Spawn.
>>>>>
>>>>> But the system requires that you tell it -all- of the hosts that are going
>>>>> to be used at the beginning of the job.
>>>>>
>>>>> At the moment, there is no plan to remove that requirement, though there
>>>>> has been occasional discussion about doing so at some point in the future.
>>>>> No promises that it will happen, though - managed environments, in
>>>>> particular, currently object to the idea of changing the allocation
>>>>> on-the-fly. We may, though, make a provision for purely hostfile-based
>>>>> environments (i.e., unmanaged) at some time in the future.
>>>>>
>>>>> Ralph
>>>>>
>>>>>> Thanks in advance for any help.
>>>>>>
>>>>>> Elena
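Pulling together the advice in Ralph's December 2007 replies above, here is a minimal sketch of the master side. It is illustrative only: the hostnames, hostfile contents, child executable, and process count are placeholders rather than values from the thread, and it uses the same MPI C++ bindings that appear in Elena's snippet.

// master.cpp -- illustrative sketch only; the names below are placeholders.
//
// Launch options discussed in the thread:
//   1.2 series : mpirun -n 1 -hostfile my_hostfile -host my_master_host ./master
//   1.3 series+: OMPI_MCA_orte_default_hostfile=my_hostfile ./master  (singleton)
//
// my_hostfile must list every node that the master *and* the spawned
// children may use (one "hostname slots=N" line per node).

#include <mpi.h>

int main(int argc, char** argv)
{
    MPI::Init(argc, argv);

    // The "host" info key restricts where the children are mapped; every
    // host named here must also appear in the hostfile/allocation, otherwise
    // the spawn fails with the "not included in the current allocation"
    // error shown earlier in the thread.
    MPI::Info info = MPI::Info::Create();
    info.Set("host", "nodeA,nodeB,nodeC");   // placeholder host names

    // Spawn three copies of ./child (placeholder executable), no extra argv.
    MPI::Intercomm children =
        MPI::COMM_WORLD.Spawn("./child", MPI::ARGV_NULL, 3, info, 0);

    info.Free();

    // ... exchange data with the children over the intercommunicator ...

    children.Disconnect();
    MPI::Finalize();
    return 0;
}

On the 1.2 series the master itself must be started through mpirun against that hostfile; from 1.3 on, per Ralph's note at the top of this message, setting the default-hostfile MCA parameter lets a plain ./master singleton spawn onto the listed nodes as well.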