Shaun Jackman wrote:
Eugene Loh wrote:
At 2500 bytes, all messages will presumably be sent "eagerly" --
without waiting for the receiver to indicate that it's ready to
receive that particular message. This would suggest congestion, if
any, is on the receiver side. Some kind of congestion could, I
suppose, still occur and back up on the sender side.
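To make the receiver-side picture concrete, here is a toy model, not Open MPI code. It assumes that an eager message arriving before its matching receive is posted gets parked in a receiver-side queue until the application asks for it. The function name and the per-step rates are invented for illustration.

```python
from collections import deque

MESSAGE_BYTES = 2500  # message size mentioned in this thread


def buffered_bytes(sent_per_step, received_per_step, steps):
    """Toy model: bytes left parked on the receiver after `steps` rounds.

    Hypothetical helper, not an MPI API. Each round, senders push
    `sent_per_step` eager messages and the receiver drains at most
    `received_per_step` of them.
    """
    unexpected = deque()  # eager messages with no matching receive yet
    for _ in range(steps):
        for _ in range(sent_per_step):
            unexpected.append(MESSAGE_BYTES)
        for _ in range(min(received_per_step, len(unexpected))):
            unexpected.popleft()
    return sum(unexpected)


if __name__ == "__main__":
    # A receiver that drains 2 messages per round while 3 arrive
    # accumulates one 2500-byte message per round.
    print(buffered_bytes(3, 2, 1000))
```

In this model a receiver that keeps pace holds nothing, while one that falls even slightly behind grows without bound, which is the kind of per-process imbalance being discussed.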
Can anyone chime in as to what the message size limit is for an
`eager' transmission?
ompi_info -a | grep eager
The limit depends on the BTL: e.g., sm is 4K, tcp is 64K, and self is 128K.
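As a quick check of those numbers against the 2500-byte messages, here is a small sketch. The table just restates the limits quoted above; the helper name is made up, and treating a message exactly at the limit as eager is an assumption (real limits may also count header bytes).

```python
# Eager limits as quoted in this thread; actual values depend on the
# installation and should be read from ompi_info.
EAGER_LIMIT_BYTES = {
    "sm": 4 * 1024,
    "tcp": 64 * 1024,
    "self": 128 * 1024,
}


def is_eager(btl, message_bytes):
    """Hypothetical helper: True if the message fits under the BTL's limit.

    Assumes "at or below the limit" means eager; header overhead is ignored.
    """
    return message_bytes <= EAGER_LIMIT_BYTES[btl]


if __name__ == "__main__":
    for btl in EAGER_LIMIT_BYTES:
        print(btl, is_eager(btl, 2500))
```

Under these assumptions a 2500-byte message is eager on all three BTLs, since even the smallest limit (sm at 4K) is larger.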
On the other hand, I assume the memory imbalance we're talking about
is rather severe. Much more than 2500 bytes to be noticeable, I
would think. Is that really the situation you're imagining?
The memory imbalance is drastic. I'm expecting 2 GB of memory use per
process. The behaving processes (13/16) use the expected amount of
memory; the remaining (3/16) misbehaving processes use more than twice
as much memory. The specifics vary from run to run, of course. So, yes,
there are gigs of unexpected memory use to track down.
Umm, how big of a message imbalance do you think you might have? (The
inflection in my voice doesn't come out well in e-mail.) Anyhow, that
sounds like, um, "lots" of 2500-byte messages.
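For a rough sense of what "lots" means, here is a back-of-the-envelope sketch. It assumes the excess is about 2 GB (taken here as 2 * 1024**3 bytes) and that all of it is 2500-byte payloads, ignoring any per-message bookkeeping; the function name is illustrative only.

```python
def messages_for(excess_bytes, message_bytes=2500):
    """Whole messages of `message_bytes` that fit in `excess_bytes`.

    Hypothetical helper; ignores per-message headers and allocator overhead.
    """
    return excess_bytes // message_bytes


if __name__ == "__main__":
    excess = 2 * 1024**3  # assumed ~2 GB of unexpected memory per process
    print(messages_for(excess))
```

That works out to roughly 860,000 outstanding messages on a single misbehaving process, and fewer if each buffered message carries noticeable overhead.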