On Wed, 2006-09-06 at 10:40 -0700, Tom Rosmond wrote:
> Brian,
> 
> I notice in the OMPI_INFO output the following parameters that seem
> relevant to this problem:
> 
>                  MCA btl: parameter "btl_self_free_list_num" (current
> value: "0")
>                  MCA btl: parameter "btl_self_free_list_max" (current
> value: "-1")
>                  MCA btl: parameter "btl_self_free_list_inc" (current
> value: "32")
>                  MCA btl: parameter "btl_self_eager_limit" (current
> value: "131072")
>                  MCA btl: parameter "btl_self_max_send_size" (current
> value: "262144")
>                  MCA btl: parameter "btl_self_max_rdma_size" (current
> value: "2147483647")
>                  MCA btl: parameter "btl_self_exclusivity" (current
> value: "65536")
>                  MCA btl: parameter "btl_self_flags" (current value:
> "2")
>                  MCA btl: parameter "btl_self_priority" (current
> value: "0")
> 
> Specifically the 'self_max_send_size=262144', which I assume is the
> maximum size (bytes?) message a processor can send to itself.  None of
> the messages in my above tests approached this limit.  However, I am
> puzzled by this, because the program below runs correctly for
> ridiculously large message sizes (as shown 200 Mbytes).

The self_max_send_size is the maximum size of a fragment that can be
sent with that btl.  The upper layer (the PML for point-to-point or the
one-sided component) is responsible for fragmenting the message into
small enough chunks.  There are actually a couple of papers on our web
site about how we do this (and even a bit of why we do it).  I'm pretty
sure this isn't the problem -- I think the one-sided implementation is
violating an assumption of the point-to-point semantics internally,
which is causing the badness.
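
To make the fragmenting concrete, here is a rough sketch in C of what
"the upper layer splits the message into chunks no larger than
max_send_size" amounts to.  This is not the actual PML or one-sided
code; the names count_fragments, send_fragmented, fragment_fn and
add_len are made up for illustration, and the real code does a good
deal more (eager sends, RDMA, and so on).

```c
#include <stddef.h>

/* Hypothetical callback: stands in for handing one fragment to a BTL. */
typedef void (*fragment_fn)(const char *data, size_t len, void *ctx);

/* Number of fragments needed so that none exceeds max_send_size bytes.
 * Returns 0 for an empty message or a zero limit (a simplification). */
size_t count_fragments(size_t msg_len, size_t max_send_size)
{
    if (max_send_size == 0) {
        return 0;
    }
    /* Ceiling division written to avoid overflow in msg_len + limit. */
    return msg_len / max_send_size + (msg_len % max_send_size != 0);
}

/* Walk the buffer, passing each chunk of at most max_send_size bytes
 * to send_one.  Returns the number of fragments handed off. */
size_t send_fragmented(const char *buf, size_t msg_len,
                       size_t max_send_size,
                       fragment_fn send_one, void *ctx)
{
    size_t offset = 0;
    size_t sent = 0;

    if (max_send_size == 0) {
        return 0;
    }
    while (offset < msg_len) {
        size_t remaining = msg_len - offset;
        size_t len = remaining < max_send_size ? remaining : max_send_size;

        send_one(buf + offset, len, ctx);
        offset += len;
        sent++;
    }
    return sent;
}

/* Example callback: adds each fragment's length to a size_t total. */
void add_len(const char *data, size_t len, void *ctx)
{
    (void)data;
    *(size_t *)ctx += len;
}
```

With a limit of 262144 bytes, a message larger than that simply turns
into more calls to the callback, so the message size seen by the
application is not bounded by max_send_size.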

Brian
