On Aug 4, 2008, at 11:45 AM, Mark Borgerding wrote:
But I think I've got a path forward. I've been able to use sockets
and MPI_Comm_join to create intercomms between the singleton and
mpirun-spawned children. The important step I was missing was
"orted --persistent --seed --scope public".
Jeff Squyres wrote:
On Aug 4, 2008, at 10:02 AM, Jeff Squyres wrote:
I *think* George Bosilca sent some sample code about this across one
of the OMPI lists (users or devel) a long time ago. I'm not 100%
sure about that, though...
I unfortunately forget the trick that he used. :-\
George
On Aug 4, 2008, at 10:02 AM, Jeff Squyres wrote:
I *think* George Bosilca sent some sample code about this across one
of the OMPI lists (users or devel) a long time ago. I'm not 100%
sure about that, though...
I unfortunately forget the trick that he used. :-\
George is unable to send
On Aug 4, 2008, at 12:59 AM, Mark Borgerding wrote:
You should be able to merge each child communicator from each
accept thread into a global comm anyway.
Can you elaborate? I am struggling to see how to implement this. A
pointer to sample code would be helpful.
Specifically, I'd like to
Robert Kubrick wrote:
You should be able to merge each child communicator from each accept
thread into a global comm anyway.
Can you elaborate? I am struggling to see how to implement this. A
pointer to sample code would be helpful.
Specifically, I'd like to be able to have a single process
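A minimal sketch of the merge idea, assuming an intercommunicator is already in hand from each accept (or spawn): MPI_Intercomm_merge collapses one intercomm into an intracommunicator spanning both groups. Building a single global communicator out of several such intercomms is not one call; it takes a connect/accept plus merge round per new group.

#include <mpi.h>

/* Sketch: fold one child intercommunicator (e.g. the result of
 * MPI_Comm_accept or MPI_Comm_spawn in one accept thread) into an
 * intracommunicator containing both the local and the remote group.
 * The matching call must be made on the child side as well. */
MPI_Comm merge_child(MPI_Comm child_intercomm)
{
    MPI_Comm merged;

    /* high = 0 here, 1 on the child side, so the local group gets
     * the lower ranks in the merged communicator. */
    MPI_Intercomm_merge(child_intercomm, 0, &merged);
    return merged;
}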
Just to be clear: you do not require a daemon on every node. You just
need one daemon - sitting somewhere - that can act as the data server
for MPI_Publish_name/MPI_Lookup_name. You then tell each app where to find it.
Normally, mpirun fills that function. But if you don't have it, you
can kick off a
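Roughly, that rendezvous looks like the sketch below; the service name is made up, error handling is omitted, and the lookup only resolves if the data server described above is reachable.

#include <mpi.h>

/* Server side (sketch): open a port, publish it under a service name,
 * and accept one connection as an intercommunicator. */
void serve_one(void)
{
    char port[MPI_MAX_PORT_NAME];
    MPI_Comm client;

    MPI_Open_port(MPI_INFO_NULL, port);
    MPI_Publish_name("my-service", MPI_INFO_NULL, port);   /* name is arbitrary */
    MPI_Comm_accept(port, MPI_INFO_NULL, 0, MPI_COMM_SELF, &client);
    /* ... exchange data over the "client" intercommunicator ... */
    MPI_Unpublish_name("my-service", MPI_INFO_NULL, port);
    MPI_Close_port(port);
}

/* Client side (sketch): resolve the port by name and connect. */
void attach(void)
{
    char port[MPI_MAX_PORT_NAME];
    MPI_Comm server;

    MPI_Lookup_name("my-service", MPI_INFO_NULL, port);
    MPI_Comm_connect(port, MPI_INFO_NULL, 0, MPI_COMM_SELF, &server);
    /* ... talk to the server over the "server" intercommunicator ... */
}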
On Jul 30, 2008, at 11:12 AM, Mark Borgerding wrote:
I appreciate the suggestion about running a daemon on each of the
remote nodes, but wouldn't I kind of be reinventing the wheel
there? Process management is one of the things I'd like to be able
to count on ORTE for.
Would the following
On Jul 30, 2008, at 11:12 AM, Mark Borgerding wrote:
I appreciate the suggestion about running a daemon on each of the
remote nodes, but wouldn't I kind of be reinventing the wheel there?
Process management is one of the things I'd like to be able to count
on ORTE for.
Keep in mind that t
I appreciate the suggestion about running a daemon on each of the remote
nodes, but wouldn't I kind of be reinventing the wheel there? Process
management is one of the things I'd like to be able to count on ORTE for.
Would the following work to give the parent process an intercomm with
each c
Okay, I tested it and MPI_Publish_name and MPI_Lookup_name work on
1.2.6, so this may provide an avenue (albeit cumbersome) for you to
get this to work. It may require a server, though, to make it work -
your first MPI proc may be able to play that role if you pass its
contact info to the
The problem would be finding a way to tell all the MPI apps how to
contact each other as the Intercomm procedure needs that info to
complete. I don't recall if the MPI_Publish_name/MPI_Lookup_name functions
worked in 1.2 - I'm building the code now to see.
If it does, then you could use it to get t
Mark, if you can run a server process on the remote machine, you
could send a request from your local MPI app to your server, then use
an Intercomm to link the local process to the new remote process?
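Sketched out, such a server could open an MPI port and hand the port string to the local app over whatever channel the application already has (a socket, a file, the command line); send_string() and recv_string() below are placeholders for that channel, not real MPI or ORTE calls.

#include <mpi.h>

/* Hypothetical out-of-band transport for the port string. */
extern void send_string(const char *s);
extern void recv_string(char *buf, int len);

/* Server process (sketch): open a port, ship the port string out of band,
 * then accept the incoming connection as an intercommunicator. */
void server_side(void)
{
    char port[MPI_MAX_PORT_NAME];
    MPI_Comm peer;

    MPI_Open_port(MPI_INFO_NULL, port);
    send_string(port);
    MPI_Comm_accept(port, MPI_INFO_NULL, 0, MPI_COMM_SELF, &peer);
}

/* Local MPI app (sketch): receive the port string and connect. */
void client_side(void)
{
    char port[MPI_MAX_PORT_NAME];
    MPI_Comm peer;

    recv_string(port, MPI_MAX_PORT_NAME);
    MPI_Comm_connect(port, MPI_INFO_NULL, 0, MPI_COMM_SELF, &peer);
}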
On Jul 30, 2008, at 9:55 AM, Mark Borgerding wrote:
I'm afraid I can't dictate to the cust
I'm afraid I can't dictate to the customer that they must upgrade.
The target platform is RHEL 5.2 ( uses openmpi 1.2.6 )
I will try to find some sort of workaround. Any suggestions on how to
"fake" the functionality of MPI_Comm_spawn are welcome.
To reiterate my needs:
I am writing a shared o
Just to clarify: the test code I wrote does *not* use MPI_Comm_spawn in
the mpirun case. The problem may or may not exist under mpirun.
Ralph Castain wrote:
As your own tests have shown, it works fine if you just "mpirun -n 1
./spawner". It is only singleton comm_spawn that appears to be hav
Singleton comm_spawn works fine on the 1.3 release branch - if
singleton comm_spawn is critical to your plans, I suggest moving to
that version. You can get a pre-release version off of the www.open-mpi.org
web site.
On Jul 30, 2008, at 6:58 AM, Ralph Castain wrote:
As your own tests have
As your own tests have shown, it works fine if you just "mpirun -n 1 ./
spawner". It is only singleton comm_spawn that appears to be having a
problem in the latest 1.2 release. So I don't think comm_spawn is
"useless". ;-)
I'm checking this morning to ensure that singletons properly spawns o
I keep checking my email in hopes that someone will come up with
something that Matt or I might've missed.
I'm just having a hard time accepting that something so fundamental
would be so broken.
The MPI_Comm_spawn command is essentially useless without the ability to
spawn processes on other n
I've found that I always have to use mpirun to start my spawner
process, due to the exact problem you are having: the need to give
OMPI a hosts file! It seems the singleton functionality is lacking
somehow... it won't allow you to spawn on arbitrary hosts. I have not
tested if this is fixed in t
Afraid I am out of suggestions - could be a bug in the old 1.2 series.
You might try with the 1.3 series...or perhaps someone else has a
suggestion here.
On Jul 29, 2008, at 2:46 PM, Mark Borgerding wrote:
Yes. The host names are listed in the host file.
e.g.
"op2-1 slots=8"
and there is a
Yes. The host names are listed in the host file.
e.g.
"op2-1 slots=8"
and there is an IP address for op2-1 in the /etc/hosts file
I've read the FAQ. Everything in there seems to assume I am starting
the process group with mpirun or one of its brothers. This is not the
case.
I've cre
OMPI doesn't care what your hosts are named - many of us use names
that have no numeric pattern or any other discernible pattern to them.
OMPI_MCA_rds_hostfile_path should point to a file that contains a list of
the hosts - have you ensured that it does, and that the hostfile
format is correct?
I listed the node names in the file at the path reported by ompi_info --param
rds hostfile -- no luck.
I also tried copying that file to another location and setting
OMPI_MCA_rds_hostfile_path -- no luck.
The remote hosts are named op2-1 and op2-2. Could this be another case
of the problem I saw a few days
For the 1.2 release, I believe you will find the enviro param is
OMPI_MCA_rds_hostfile_path - you can check that with "ompi_info".
On Jul 29, 2008, at 11:10 AM, Mark Borgerding wrote:
Umm ... what -hostfile file?
I am not starting anything via mpiexec/orterun so there is no "-
hostfile" ar
Umm ... what -hostfile file?
I am not starting anything via mpiexec/orterun so there is no
"-hostfile" argument AFAIK.
Is there some other way to communicate this? An environment variable or
mca param?
-- Mark
Ralph Castain wrote:
Are the hosts where you want the children to go in your -ho
Are the hosts where you want the children to go in your -hostfile
file? All of the hosts you intend to use have to be in that file, even
if they don't get used until the comm_spawn.
On Jul 29, 2008, at 9:08 AM, Mark Borgerding wrote:
I've tried lots of different values for the "host" key in
I've tried lots of different values for the "host" key in the info handle.
I've tried hardcoding the hostname+ip entries in the /etc/hosts file --
no luck. I cannot get my MPI_Comm_spawn children to go anywhere else on
the network.
mpiexec can start groups on the other machines just fine.
It
The string "localhost" may not be recognized in the 1.2 series for
comm_spawn. Do a "hostname" and use that string instead - should work.
Ralph
On Jul 28, 2008, at 10:38 AM, Mark Borgerding wrote:
When I add the info parameter in MPI_Comm_spawn, I get the error
"Some of the requested hosts a
When I add the info parameter in MPI_Comm_spawn, I get the error
"Some of the requested hosts are not included in the current allocation
for the application:
[...]
Verify that you have mapped the allocated resources properly using the
--host specification."
Here is a snippet of my code that cau
Thanks, I don't know how I missed that. Perhaps I got thrown off by
"Portable programs not requiring detailed control over process
locations should use MPI_INFO_NULL."
If there were a computing equivalent of Maslow's Hierarchy of Needs,
functioning would be more fundamental than portabilit
Take a look at the man page for MPI_Comm_spawn. It should explain that
you need to create an MPI_Info key that has the key of "host" and a
value that contains a comma-delimited list of hosts to be used for the
child processes.
Hope that helps
Ralph
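A minimal sketch of that pattern; the host names, process count, and worker binary below are placeholders, and the hosts still have to be part of the allocation (e.g. listed in the hostfile), as discussed elsewhere in the thread.

#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Comm children;
    MPI_Info info;
    int errcodes[4];

    MPI_Init(&argc, &argv);

    /* "host" takes a comma-delimited list of hosts for the children. */
    MPI_Info_create(&info);
    MPI_Info_set(info, "host", "op2-1,op2-2");

    MPI_Comm_spawn("./worker", MPI_ARGV_NULL, 4, info,
                   0, MPI_COMM_SELF, &children, errcodes);

    MPI_Info_free(&info);
    /* ... talk to the children over the "children" intercommunicator ... */
    MPI_Comm_disconnect(&children);
    MPI_Finalize();
    return 0;
}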
On Jul 28, 2008, at 8:54 AM, Mark Borgerd
How does openmpi decide which hosts are used with MPI_Comm_spawn? All
the docs I've found talk about specifying hosts on the mpiexec/mpirun
command and so are not applicable.
I am unable to spawn on anything but localhost (which makes for a pretty
uninteresting cluster).
When I run
ompi_info -