Hello,
Since I'm very suspicious about the condition of the IB network on
my cluster,
I'm trying to use the csum pml feature of OMPI (1.4.3).
But I have a question: what happens if the checksum is different on
both ends?
Is there a warning printed, a flag set by the MPI_(I)recv or equ
Hello,
Is there any way to spawn processes with the ompi-server option? I need the
child processes to open and publish ports for which I require this option.
Is there an alternative?
Thanks,
Suraj Prabhakaran
Not sure I fully understand the question. If you provide the --ompi-server
option to mpirun, that info will be passed along to all processes,
including those launched via comm_spawn, so they can subsequently connect to
the server.
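Along those lines, a minimal sketch (mine, not from the post; the service name "spawned_service" is purely illustrative) of what a spawned child could do once the --ompi-server info has been propagated to it:

/* Hypothetical child program started via MPI_Comm_spawn; assumes mpirun
 * was given --ompi-server so name publishing has a server to talk to. */
#include <mpi.h>

int main(int argc, char **argv)
{
    char port_name[MPI_MAX_PORT_NAME];

    MPI_Init(&argc, &argv);

    /* Open a port and publish it under an illustrative service name. */
    MPI_Open_port(MPI_INFO_NULL, port_name);
    MPI_Publish_name("spawned_service", MPI_INFO_NULL, port_name);

    /* ... accept connections, do work ... */

    MPI_Unpublish_name("spawned_service", MPI_INFO_NULL, port_name);
    MPI_Close_port(port_name);
    MPI_Finalize();
    return 0;
}

Another job could then look the port up with MPI_Lookup_name and connect with MPI_Comm_connect.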
On Dec 14, 2010, at 6:50 AM, Suraj Prabhakaran wrote:
> Hello
Hi James:
I can reproduce the problem on a single node with Open MPI 1.5 and the
trunk. I have submitted a ticket with
the information.
https://svn.open-mpi.org/trac/ompi/ticket/2656
Rolf
On 12/13/10 18:44, James Dinan wrote:
Hi,
I'm getting strange behavior using datatypes in a one-sided
About 9 months ago we had a new installation with a system of 1800 cores and at
the time we found that jobs with more than 1028 cores would not start. A
colleague then found that setting
OMPI_MCA_plm_rsh_num_concurrent=256
helped with the problem.
We have now increased our processor co
David Mathog wrote:
Is there a tool in openmpi that will reveal how much "spin time" the
processes are using?
I don't know what sort of answer is helpful for you, but I'll describe
one option.
With Oracle Message Passing Toolkit (formerly Sun ClusterTools, anyhow,
an OMPI distribution from
So the 2/2 consensus is to use the collective. That is straightforward
for the send part of this, since all workers are sent the same data.
For the receive I do not see how to use a collective. Each worker sends
back a data structure, and the structures are of varying size. This
is almost al
David Mathog wrote:
For the receive I do not see how to use a collective. Each worker sends
back a data structure, and the structures are of varying size. This
is almost always the case in Bioinformatics, where what is usually
coming back from each worker is a count M of the number of signi
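One common way to keep this collective despite the varying sizes (a sketch of mine, not taken from the thread) is to gather the per-worker counts first, then collect the payloads with MPI_Gatherv:

/* Sketch: each rank produces a variable number of ints; rank 0 first
 * gathers the counts, then gathers the data with MPI_Gatherv. */
#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank, nprocs;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    int my_count = rank + 1;   /* stand-in for a variable-size result */
    int *my_data = malloc(my_count * sizeof(int));
    for (int i = 0; i < my_count; i++) my_data[i] = rank;

    int *counts = NULL, *displs = NULL, *all_data = NULL;
    if (rank == 0) {
        counts = malloc(nprocs * sizeof(int));
        displs = malloc(nprocs * sizeof(int));
    }

    /* Step 1: tell the root how much each worker will send. */
    MPI_Gather(&my_count, 1, MPI_INT, counts, 1, MPI_INT, 0, MPI_COMM_WORLD);

    /* Step 2: root builds displacements, then collects the variable payloads. */
    int total = 0;
    if (rank == 0) {
        for (int i = 0; i < nprocs; i++) { displs[i] = total; total += counts[i]; }
        all_data = malloc(total * sizeof(int));
    }
    MPI_Gatherv(my_data, my_count, MPI_INT,
                all_data, counts, displs, MPI_INT, 0, MPI_COMM_WORLD);

    MPI_Finalize();
    return 0;
}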
Hi Rolf,
Thanks for your help. I also noticed trouble with subarray data types.
I attached the same test again, but with subarray rather than indexed
types. It works correctly with MVAPICH on IB, but fails with OpenMPI
1.5 with the following message:
$ mpiexec -n 2 ./a.out
MPI RMA Strided
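For readers following along, here is a rough sketch (not the attached test itself; the array sizes and the 4x4 block are illustrative) of the kind of subarray-typed one-sided access under discussion; run it with at least two ranks:

/* Sketch: a 2D subarray datatype used as the target type of an MPI_Get. */
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank;
    double local[8][8] = {{0}};   /* window buffer on every rank */
    double block[4][4];           /* contiguous landing buffer */
    MPI_Win win;
    MPI_Datatype sub;
    int sizes[2]    = {8, 8};
    int subsizes[2] = {4, 4};
    int starts[2]   = {2, 2};

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Type_create_subarray(2, sizes, subsizes, starts,
                             MPI_ORDER_C, MPI_DOUBLE, &sub);
    MPI_Type_commit(&sub);

    MPI_Win_create(local, sizeof(local), sizeof(double),
                   MPI_INFO_NULL, MPI_COMM_WORLD, &win);

    MPI_Win_fence(0, win);
    if (rank == 0)   /* read a 4x4 block out of rank 1's window */
        MPI_Get(block, 16, MPI_DOUBLE, 1, 0, 1, sub, win);
    MPI_Win_fence(0, win);

    MPI_Win_free(&win);
    MPI_Type_free(&sub);
    MPI_Finalize();
    return 0;
}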
That's a big cluster to be starting with rsh! :-)
When you say it won't start, do you mean that it hangs? Or does it fail with
some error message? How many nodes are involved (this is the important number,
not the number of cores)?
Also, what version are you using?
On Dec 14, 2010, at 9:10 AM
I have experimented a bit more and found that if I set
OMPI_MCA_plm_rsh_num_concurrent=1024
a job with more than 2,500 processes will start and run.
However, when I searched the open-mpi web site for the variable I could not
find any indication.
Best wishes,
Lydia Heck
On 14 December 2010 17:32, Lydia Heck wrote:
>
> I have experimented a bit more and found that if I set
>
> OMPI_MCA_plm_rsh_num_concurrent=1024
>
> a job with more than 2,500 processes will start and run.
>
> However, when I searched the open-mpi web site for the variable I could
> not find an
If the checksums on the two peers don't match, your MPI call will return with an
error. This is in addition to Open MPI printing a warning message on the output
(which can be silenced with the right mca parameter).
So, you're supposed to check the return values, and abort if something fishy is
go
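To make the "check the return values" advice concrete, a minimal sketch (mine, not from the thread): the default error handler aborts the job, so it is switched to MPI_ERRORS_RETURN before testing the code MPI_Recv hands back:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, buf = 0;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Let errors come back as return codes instead of aborting. */
    MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN);

    if (rank == 0) {
        buf = 42;
        MPI_Send(&buf, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        int rc = MPI_Recv(&buf, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
        if (rc != MPI_SUCCESS) {
            char msg[MPI_MAX_ERROR_STRING];
            int len;
            MPI_Error_string(rc, msg, &len);
            fprintf(stderr, "MPI_Recv failed: %s\n", msg);
            MPI_Abort(MPI_COMM_WORLD, rc);
        }
    }

    MPI_Finalize();
    return 0;
}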
Hello Ralph,
I wonder: does this plm_rsh_num_concurrent parameter apply ONLY to
rsh use,
or to either ssh or rsh, depending on plm_rsh_agent, please?
Thanks, Best, G.
On 14/12/2010 18:30, Ralph Castain wrote:
That's a big cluster to be starting with rsh! :-)
When you say it won't s
It applies to both. In the rsh/ssh launcher, there is a limit on how many
concurrent ssh/rsh sessions we have open at any one time. This is required due
to OS limitations. As each daemon completes its launch, it "daemonizes" and
closes the ssh/rsh session, thus enabling another daemon to be laun
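For reference, the limit can be raised like any other MCA parameter, either on the mpirun command line or through the environment, in the same style already shown in this thread (the value 512 and ./my_app are placeholders):
mpirun --mca plm_rsh_num_concurrent 512 -np 2500 ./my_app
export OMPI_MCA_plm_rsh_num_concurrent=512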
On Dec 10, 2010, at 11:00 AM, Prentice Bisbal wrote:
>> Would it make sense to implement this as an MPI extension, and then
>> perhaps propose something to the Forum for this purpose?
>
> I think that makes sense. As core and socket counts go up, I imagine the need
> for this information will be
On Dec 6, 2010, at 9:26 AM, Hicham Mouline wrote:
> Thanks, it is now clarified that a call to MPI_INIT has the same effect as a
> call to MPI_INIT_THREAD with
> a required = MPI_THREAD_SINGLE. Perhaps it should be added here:
> http://www.open-mpi.org/doc/v1.4/man3/MPI_Init_thread.3.php
> as w
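For illustration, a minimal sketch (mine) of that equivalence: calling MPI_Init_thread with required = MPI_THREAD_SINGLE behaves like a plain MPI_Init, with 'provided' reporting the level the library actually supports:

#include <mpi.h>

int main(int argc, char **argv)
{
    int provided;

    /* Equivalent to MPI_Init(&argc, &argv): request only MPI_THREAD_SINGLE. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_SINGLE, &provided);

    /* 'provided' reports the thread level actually available, which may be higher. */

    MPI_Finalize();
    return 0;
}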