I appreciate the suggestion about running a daemon on each of the remote
nodes, but wouldn't I kind of be reinventing the wheel there? Process
management is one of the things I'd like to be able to count on ORTE for.
Would the following work to give the parent process an intercomm with
each child?
1. The parent (i.e. my non-mpirun-started process) calls MPI_Init, then
   MPI_Open_port.
2. The parent spawns an mpirun command via system/exec to create the
   remote children. The name from MPI_Open_port is placed in the
   environment.
3. The parent calls MPI_Comm_accept (once for each child?).
4. All children call MPI_Comm_connect with that name.
I think this would give the parent one intercommunicator for each remote
process (not ideal, but I can worry about broadcasting data later).
The remote processes can communicate to each other through MPI_COMM_WORLD.
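The steps above might look something like the following on the parent side. This is only a sketch of the scheme being proposed, not tested against 1.2.x; the child binary name, the child count, and the SPAWN_PORT variable are all assumptions:

```c
/* Parent sketch: open a port, export it, launch the children via
 * mpirun, then accept one connection per child. Untested sketch. */
#include <mpi.h>
#include <stdlib.h>

int main(void)
{
    char port[MPI_MAX_PORT_NAME];
    MPI_Comm child;            /* one intercomm per accepted child */
    int nchildren = 4;         /* assumed; must match the mpirun -np below */

    MPI_Init(NULL, NULL);
    MPI_Open_port(MPI_INFO_NULL, port);

    /* Hand the port name to the children through the environment. */
    setenv("SPAWN_PORT", port, 1);
    system("mpirun -np 4 ./child &");   /* "child" is a hypothetical binary */

    for (int i = 0; i < nchildren; i++) {
        MPI_Comm_accept(port, MPI_INFO_NULL, 0, MPI_COMM_SELF, &child);
        /* ... stash the 'child' intercomm for later I/O ... */
    }

    MPI_Close_port(port);
    MPI_Finalize();
    return 0;
}
```

Each child would then do the mirror image: `MPI_Comm_connect(getenv("SPAWN_PORT"), MPI_INFO_NULL, 0, MPI_COMM_SELF, &parent)`, giving the parent the per-child intercomms described above.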
Actually, when I think through the details, much of this is pretty
similar to the daemon MPI_Publish_name + MPI_Lookup_name approach; the
main difference is which processes come up first.
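For comparison, the name-service variant would route the port name through MPI_Publish_name/MPI_Lookup_name instead of the environment. A minimal sketch, assuming both sides can reach a common name service (which, across separate mpirun universes on 1.2.x, may itself be the hard part); "my-service" is a made-up service name:

```c
/* Name-service sketch: server publishes its port under a well-known
 * name; clients look it up and connect. Untested sketch. */
#include <mpi.h>
#include <string.h>

int main(int argc, char **argv)
{
    char port[MPI_MAX_PORT_NAME];
    MPI_Comm other;

    MPI_Init(&argc, &argv);
    if (argc > 1 && strcmp(argv[1], "server") == 0) {
        MPI_Open_port(MPI_INFO_NULL, port);
        MPI_Publish_name("my-service", MPI_INFO_NULL, port);
        MPI_Comm_accept(port, MPI_INFO_NULL, 0, MPI_COMM_SELF, &other);
        MPI_Unpublish_name("my-service", MPI_INFO_NULL, port);
    } else {
        MPI_Lookup_name("my-service", MPI_INFO_NULL, port);
        MPI_Comm_connect(port, MPI_INFO_NULL, 0, MPI_COMM_SELF, &other);
    }
    MPI_Finalize();
    return 0;
}
```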
Mark Borgerding wrote:
I'm afraid I can't dictate to the customer that they must upgrade.
The target platform is RHEL 5.2 (uses Open MPI 1.2.6).
I will try to find some sort of workaround. Any suggestions on how to
"fake" the functionality of MPI_Comm_spawn are welcome.
To reiterate my needs:
I am writing a shared object that plugs into an existing framework.
I do not control how the framework launches its processes (no mpirun).
I want to start remote processes to crunch the data.
The shared object marshals the I/O between the framework and the
remote processes.
-- Mark
Ralph Castain wrote:
Singleton comm_spawn works fine on the 1.3 release branch - if
singleton comm_spawn is critical to your plans, I suggest moving to
that version. You can get a pre-release version off of the
www.open-mpi.org web site.
On Jul 30, 2008, at 6:58 AM, Ralph Castain wrote:
As your own tests have shown, it works fine if you just "mpirun -n 1
./spawner". It is only singleton comm_spawn that appears to be
having a problem in the latest 1.2 release. So I don't think
comm_spawn is "useless". ;-)
I'm checking this morning to ensure that singletons properly spawn
on other nodes in the 1.3 release. I sincerely doubt we will
backport a fix to 1.2.
On Jul 30, 2008, at 6:49 AM, Mark Borgerding wrote:
I keep checking my email in hopes that someone will come up with
something that Matt or I might've missed.
I'm just having a hard time accepting that something so fundamental
would be so broken.
The MPI_Comm_spawn command is essentially useless without the
ability to spawn processes on other nodes.
If this is true, then my personal scorecard reads:
# days spent using openmpi: 4 (off and on)
# identified bugs in openmpi: 2
# useful programs built: 0
Please prove me wrong. I'm eager to be shown my ignorance -- to
find out where I've been stupid and what documentation I should've
read.
Matt Hughes wrote:
I've found that I always have to use mpirun to start my spawner
process, due to the exact problem you are having: the need to give
OMPI a hosts file! It seems the singleton functionality is lacking
somehow... it won't allow you to spawn on arbitrary hosts. I have
not tested whether this is fixed in the 1.3 series.
Try
mpiexec -np 1 -H op2-1,op2-2 spawner op2-2
mpiexec should start the first process on op2-1, and the spawn call
should start the second on op2-2. If you don't use the Info object to
set the hostname specifically, then on 1.2.x it will automatically
start on op2-2. With 1.3, the spawn call will start processes
starting with the first item in the host list.
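Setting the hostname through the Info object, as mentioned above, might look like this. A hedged sketch using the "host" key that Open MPI recognizes for MPI_Comm_spawn; "worker" is a hypothetical child binary:

```c
/* Sketch: pin the spawned process to a specific node via the Info
 * object's "host" key. Untested on 1.2.x. */
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Info info;
    MPI_Comm child;

    MPI_Init(&argc, &argv);
    MPI_Info_create(&info);
    MPI_Info_set(info, "host", "op2-2");   /* request a specific node */
    MPI_Comm_spawn("worker", MPI_ARGV_NULL, 1, info, 0,
                   MPI_COMM_SELF, &child, MPI_ERRCODES_IGNORE);
    MPI_Info_free(&info);
    MPI_Finalize();
    return 0;
}
```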
mch
[snip]
_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users