Appreciate it, but as I said in my prior note, we aren’t maintaining 1.6
anymore, so an upgrade to 1.8 (which worked, as you noted) is in order.


> On Mar 13, 2015, at 8:23 AM, Chris Paciorek <pacio...@stat.berkeley.edu> 
> wrote:
> 
> And the promised attachment.
> 
> On Thu, Mar 12, 2015 at 6:11 PM, Chris Paciorek
> <pacio...@stat.berkeley.edu> wrote:
>> I'm having an issue with MPI_Comm_spawn not starting workers on the
>> nodes provided via -machinefile or -host. This is occurring on Ubuntu
>> 14.04/14.10 with openMPI 1.6.5. However, I do not have the problem on
>> Ubuntu 12.04 with openMPI 1.4.3 nor is there a problem when I install
>> openMPI 1.8.4 from source on Ubuntu 14.04.
>> 
>> Any suggestions as to what is going on? We can install from source to
>> deal with this, but Ubuntu 14.04/14.10/15.04 (and I think the current
>> Debian) are all relying on 1.6.5 so this issue might affect many
>> others.
>> 
>> As far as I can tell there aren't any threads on the mailing lists or
>> info in the NEWS file that relate to this.
>> 
>> Here's a reproducible test case. In the attached zip file, parent.cpp
>> generates an executable that simply tries to spawn workers using
>> MPI_Comm_spawn and the child executable simply reports what nodes the
>> workers are operating on.
>> 
>> I compile and run the program as:
>> mpicxx -o child child.cpp
>> mpicxx -o parent parent.cpp
>> mpirun -host smeagol,arwen,smeagol,arwen -np 1 parent
>> 
>> And the result is as follows with all children on the original node:
>> Starting: I'm process 0 and we are 1
>> Finishing: I'm process 0 and we are 1
>> I'm child process 0 on smeagol and we are 3
>> I'm child process 1 on smeagol and we are 3
>> My parent communicator size is: 3
>> I'm child process 2 on smeagol and we are 3
>> My parent communicator size is: 3
>> My parent communicator size is: 3
>> 
>> This is all on pretty standard Ubuntu 14.04, with openMPI installed
>> from Ubuntu (libopenmpi-dev, libopenmpi1.6, openmpi-bin)
>> 
>> I've included in the zip file:
>> * parent.cpp and child.cpp
>> * ompi_info --all on the master
>> * ompi_info -v ompi full --parsable on all nodes
>> * PATH and LD_LIBRARY_PATH info
>> * ifconfig information
>> 
>> -Chris
>> 
>> ----------------------------------------------------------------------------------------------
>> Chris Paciorek
>> 
>> Statistical Computing Consultant
>> Statistical Computing Facility, Econometrics Laboratory, Berkeley
>> Research Computing
>> 
>> Office: 495 Evans Hall                      Email: pacio...@stat.berkeley.edu
>> Mailing Address:                            Voice: 510-842-9056
>> Department of Statistics                    Fax:   510-642-7892
>> 367 Evans Hall                              Skype: cjpaciorek
>> University of California, Berkeley          WWW:   www.stat.berkeley.edu/~paciorek
>> Berkeley, CA 94720 USA                      Permanent forward: pacio...@alumni.cmu.edu
> <mpi_comm_spawn_problem.tgz>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: http://www.open-mpi.org/community/lists/users/2015/03/26468.php
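
For anyone comparing runs across Open MPI versions, the output quoted above can be checked mechanically rather than by eye. The following is only a minimal sketch, assuming the child processes print lines in exactly the format shown in the report; the names `tally_child_hosts` and `CHILD_LINE` are made up for illustration and are not part of Open MPI or of the attached test case.

```python
import re
from collections import Counter

# Matches the child report lines quoted in the message, e.g.
# "I'm child process 0 on smeagol and we are 3".
# (Assumed format: taken from the quoted output, not from child.cpp itself.)
CHILD_LINE = re.compile(r"I'm child process (\d+) on (\S+) and we are (\d+)")


def tally_child_hosts(output: str) -> Counter:
    """Count how many child processes reported each host name."""
    hosts = Counter()
    for line in output.splitlines():
        match = CHILD_LINE.search(line)
        if match:
            hosts[match.group(2)] += 1
    return hosts


# Output copied from the report (Open MPI 1.6.5 on Ubuntu 14.04).
reported_output = """\
Starting: I'm process 0 and we are 1
Finishing: I'm process 0 and we are 1
I'm child process 0 on smeagol and we are 3
I'm child process 1 on smeagol and we are 3
My parent communicator size is: 3
I'm child process 2 on smeagol and we are 3
My parent communicator size is: 3
My parent communicator size is: 3
"""

placement = tally_child_hosts(reported_output)
for host, count in sorted(placement.items()):
    print(f"{host}: {count} child process(es)")
```

With the quoted 1.6.5 output, all three children are tallied under smeagol and arwen never appears, which is the symptom described. Running the same tally on output from a 1.8.4 build would show whether the children were spread across both hosts.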
