What I missed in this whole conversation is that the pieces of text
that Ron and Dick are citing are *on the same page* in the MPI spec;
they're not disparate parts of the spec that accidentally overlap in
discussion scope.
Specifically, it says:
Resource limitations
Any pending communication operation consumes system resources that are limited.
Ron's comments are probably dead on for an application like bug3.
If bug3 is long-running and libmpi is doing eager-protocol buffer
management, as I contend the standard requires, then the producers will not
get far ahead of the consumer before they are forced to a synchronous send
under the covers.
> Re: MPI_Ssend(). This indeed fixes bug3, the process at rank 0 has
> reasonable memory usage and the execution proceeds normally.
>
> Re scalable: One second. I know well bug3 is not scalable, and when to
> use MPI_Isend. The point is programmers want to count on the MPI spec as
> written, as Ri
fds
-Original Message-
From: Brightwell, Ronald [mailto:rbbr...@sandia.gov]
Sent: Monday, February 04, 2008 4:35 PM
To: Sacerdoti, Federico
Cc: Open MPI Users
Subject: Re: [OMPI users] openmpi credits for eager messages
On Mon Feb 4, 2008 14:23:13... Sacerdoti, Federico wrote
> To keep
So with an Isend your program becomes valid MPI and a very nice
illustration of why the MPI standard cannot limit envelopes (or send/recv
descriptors) and why at some point the number of descriptors can blow the
limits. It also illustrates how the management of eager messages remains
workable. (Not
Wow this sparked a much more heated discussion than I was expecting. I
was just commenting that the behaviour the original author (Federico
Sacerdoti) mentioned would explain something I observed in one of my
early trials of OpenMPI. But anyway, because it seems that quite a few
people were interes
On Tue, Feb 05, 2008 at 08:07:59AM -0500, Richard Treumann wrote:
> There is no misunderstanding of the MPI standard or the definition of
> blocking in the bug3 example. Both bug 3 and the example I provided are
> valid MPI.
>
> As you say, blocking means the send buffer can be reused when the MPI_Send
Hi Gleb
There is no misunderstanding of the MPI standard or the definition of
blocking in the bug3 example. Both bug 3 and the example I provided are
valid MPI.
As you say, blocking means the send buffer can be reused when the MPI_Send
returns. This is exactly what bug3 is counting on.
MPI is a r
On Mon, Feb 04, 2008 at 04:23:13PM -0500, Sacerdoti, Federico wrote:
> Bug3 is a test-case derived from a real, scalable application (desmond
> for molecular dynamics) that several experienced MPI developers have
> worked on. Note the MPI_Send calls of processes N>0 are *blocking*; the
> openmpi si
Richard,
You're absolutely right. What a shame :) If I had spent less time
drawing the boxes around the code I might have noticed the typo. The
Send should be an Isend.
george.
On Feb 4, 2008, at 5:32 PM, Richard Treumann wrote:
Hi George
Sorry - This is not a valid MPI program. It violates the requirement that
Hi George
Sorry - This is not a valid MPI program. It violates the requirement that
a program not depend on there being any system buffering. See page 32-33
of MPI 1.1
Let's simplify to:
Task 0:
MPI_Recv( from 1 with tag 1)
MPI_Recv( from 1 with tag 0)
Task 1:
MPI_Send(to 0 with tag 0)
MPI_Send(to 0 with tag 1)
Please allow me to slightly modify your example. It still follows the
rules from the MPI standard, so I think it's a 100% standard-compliant
parallel application.
+--------+
| task 0:|
+--------+
On Mon Feb 4, 2008 14:23:13... Sacerdoti, Federico wrote
> To keep this out of the weeds, I have attached a program called "bug3"
> that illustrates this problem on openmpi 1.2.5 using the openib BTL. In
> bug3 process with rank 0 uses all available memory buffering
> "unexpected" messages from it
org] On Behalf Of Brightwell, Ronald
Sent: Monday, February 04, 2008 3:30 PM
To: Patrick Geoffray
Cc: Open MPI Users
Subject: Re: [OMPI users] openmpi credits for eager messages
> > I'm looking at a network where the number of endpoints is large enough that
> > everybody can
> > I'm looking at a network where the number of endpoints is large enough that
> > everybody can't have a credit to start with, and the "offender" isn't any
> > single process, but rather a combination of processes doing N-to-1 where N
> > is sufficiently large. I can't just tell one process to s
Brightwell, Ronald wrote:
I'm looking at a network where the number of endpoints is large enough that
everybody can't have a credit to start with, and the "offender" isn't any
single process, but rather a combination of processes doing N-to-1 where N
is sufficiently large. I can't just tell one
On Mon, Feb 04, 2008 at 02:54:46PM -0500, Richard Treumann wrote:
> In my example, each sender task 1 to n-1 will have one rendezvous message
> to task 0 at a time. The MPI standard suggests descriptors be small enough
> and there be enough descriptor space for reasonable programs. The
> standar
Now that this discussion has gone way off into the MPI standard woods :).
Was your test using Open MPI 1.2.4 or 1.2.5 (the one with the segfault)?
There was definitely a bug in 1.2.4 that could cause exactly the behavior
you are describing when using the shared memory BTL, due to a silly
delay
Gleb
In my example, each sender task 1 to n-1 will have one rendezvous message
to task 0 at a time. The MPI standard suggests descriptors be small enough
and there be enough descriptor space for reasonable programs . The
standard is clear that unreasonable programs can run out of space and fail
> > Not to muddy the point, but if there's enough ambiguity in the Standard
> > for people to ignore the progress rule, then I think (hope) there's enough
> > ambiguity for people to ignore the sender throttling issue too ;)
>
> I understand your position, and I used to agree until I was forced to
Ron,
Brightwell, Ronald wrote:
Not to muddy the point, but if there's enough ambiguity in the Standard
for people to ignore the progress rule, then I think (hope) there's enough
ambiguity for people to ignore the sender throttling issue too ;)
I understand your position, and I used to agree un
On Mon, Feb 04, 2008 at 09:08:45AM -0500, Richard Treumann wrote:
> To me, the MPI standard is clear that a program like this:
>
> task 0:
> MPI_Init
> sleep(3000);
> start receiving messages
>
> each of tasks 1 to n-1:
> MPI_Init
> loop 5000 times
>MPI_Send(small message to 0)
> end loop
>
>
> I am well aware of the scaling problems related to the standard
> send requirements in MPI. It is a very difficult issue.
>
> However, here is what the standard says: MPI 1.2, page 32 lines 29-37
>
> [...]
I'm well aware of those words. They are highlighted (in pink, no less) on
page 50
Hi Ron -
I am well aware of the scaling problems related to the standard send
requirements in MPI. It is a very difficult issue.
However, here is what the standard says: MPI 1.2, page 32 lines 29-37
===
a standard send operation that cannot complete because of lack of buffer
space will merely block, waiting for buffer space to become available or
for a matching receive to be posted.
> Is what George says accurate? If so, it sounds to me like OpenMPI
> does not comply with the MPI standard on the behavior of eager
> protocol. MPICH is getting dinged in this discussion because they
> have complied with the requirements of the MPI standard. IBM MPI
> also complies with the stand
The words 'eager', 'rendezvous' or 'credit' have a specific resonance
only for implementors, and I think it's correct that the MPI
specification sidesteps these words since they are artifacts of
implementation.
All implementations make their own guarantees and run into their own
different limitations
Is what George says accurate? If so, it sounds to me like OpenMPI does not
comply with the MPI standard on the behavior of eager protocol. MPICH is
getting dinged in this discussion because they have complied with the
requirements of the MPI standard. IBM MPI also complies with the standard.
If
Well ... this is exactly the kind of behavior a high-performance
application tries to achieve, isn't it?
The problem here is not the flow control. What you need is to avoid
buffering the messages on the receiver side. Luckily, Open MPI is
entirely configurable at runtime, so this situation is
That would make sense. I was able to break OpenMPI by having Node A wait for
messages from Node B. Node B is in fact sleeping while Node C bombards
Node A with a few thousand messages. After a while Node B wakes up and
sends Node A the message it's been waiting on, but Node A has long since
been buried
The Voltaire tech was right. There is no credit management at the
upper level; each BTL is allowed to do it (if needed). This doesn't
mean the transport is not reliable. Most of the devices have internal
flow control, and Open MPI relies on it instead of implementing its own.
However, the de
Hi,
I am readying an openmpi 1.2.5 software stack for use with a
many-thousand core cluster. I have a question about sending small
messages that I hope can be answered on this list.
I was under the impression that if node A wants to send a small MPI
message to node B, it must have a credit to do