Dear George,
Thank you for your reply. I understand your point that I should implement a
correct communication scheme instead of relying on the eager limit as a
parameter.

However, my intention in relying on the eager limit is to avoid application
memory allocation at the sending process.
I am doing pairwise communication (a single in/out buffer), not all-to-all
communication. I want to take advantage of the message buffering that the
eager protocol performs at the receiver process only.
This would save the application-level buffering (required for the problem I am
trying to solve) at the sending process.
(Under the eager protocol, it is the responsibility of the receiving process
to buffer the message upon its arrival, especially if the receive operation has
not yet been posted.)

Is there any Open MPI API to query the eager limit? Or do I have to check the
MCA variables and pass the value to the application myself if I want to use it?

With Regards,
S. Biplab Raut



From: George Bosilca <bosi...@icl.utk.edu>
Sent: Wednesday, March 25, 2020 9:58 PM
To: Raut, S Biplab <biplab.r...@amd.com>
Cc: Open MPI Users <users@lists.open-mpi.org>
Subject: Re: [OMPI users] Regarding eager limit relationship to send message 
size

[CAUTION: External Email]
On Wed, Mar 25, 2020 at 4:49 AM Raut, S Biplab <biplab.r...@amd.com> wrote:

Dear George,
Thank you for the reply. My question, however, is more specifically about the
message size from the application side.

Let’s say the application is running with 128 ranks.
Each rank does a send() to each of the other 127 ranks, where the length of the
message sent is the quantity in question.
After all the sends have completed, each rank does a recv() from each of the
other 127 ranks.
Unless the message length on the sending side is within the eager_limit (4K),
this program will hang.

This is definitely not true: one can imagine many communication patterns that
will ensure correctness for your all-to-all communications. As an example, you
can place your processes in a virtual ring and, at each step, send to process
(my_rank + step) % comm_size and recv from process
(my_rank - step + comm_size) % comm_size. This communication pattern will
always be correct, independent of the eager size (for as long as you correctly
order the send/recv for each pair).
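
The ring schedule described above can be checked on paper before any MPI code
is written. The following is a minimal sketch in plain Python with no MPI calls.
It assumes that "to/from" means sending to (my_rank + step) % comm_size while
receiving from the mirror-image peer (my_rank - step + comm_size) % comm_size.
The helper names ring_peers, schedule_is_matched and covers_all_peers are
hypothetical and exist only for this illustration.

```python
def ring_peers(rank, step, comm_size):
    """Return (send_to, recv_from) for one step of the virtual ring."""
    send_to = (rank + step) % comm_size
    recv_from = (rank - step + comm_size) % comm_size
    return send_to, recv_from


def schedule_is_matched(comm_size):
    """Check that every send in every step has a matching receive."""
    for step in range(1, comm_size):
        for rank in range(comm_size):
            send_to, recv_from = ring_peers(rank, step, comm_size)
            # The rank I receive from must be sending to me in this step.
            if ring_peers(recv_from, step, comm_size)[0] != rank:
                return False
            # The rank I send to must be receiving from me in this step.
            if ring_peers(send_to, step, comm_size)[1] != rank:
                return False
    return True


def covers_all_peers(comm_size):
    """Check that each rank sends to every other rank exactly once."""
    for rank in range(comm_size):
        targets = [ring_peers(rank, step, comm_size)[0]
                   for step in range(1, comm_size)]
        expected = [p for p in range(comm_size) if p != rank]
        if sorted(targets) != expected:
            return False
    return True


if __name__ == "__main__":
    print(schedule_is_matched(128), covers_all_peers(128))
```

In a real program each step would still need its send and receive ordered
correctly within the pair, for example by issuing them as one combined
MPI_Sendrecv call; that choice is an assumption of this note, not something
stated in the thread.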

So, based on the above scenario, my questions are:

  1.  Can each rank send messages of up to 4K successfully, i.e., all 128
ranks sending (128 * 4K) bytes simultaneously?
Potentially yes, but there are physical constraints (e.g., number of network
links, switch capabilities, ...) and memory limits. But if you have enough
memory, this could potentially work. I'm not saying this is correct and should
be done.


  2.  If the application has a bigger message to be sent by each rank, how
should the send message size be derived? Is it equal to the eager_limit, with
each rank sending multiple chunks of this size?
Definitely not! You should never rely on the eager size to fix a complex
communication pattern. The rule of thumb should be: does my application work
correctly if the MPI implementation forces a zero-byte eager size? As suggested
above, the most suitable approach is to define a communication scheme that
would never deadlock.
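
The zero-byte rule of thumb can be tried out with a toy model before touching
MPI. The sketch below is plain Python, not Open MPI code. It assumes a
simplified protocol: a blocking send no larger than the eager limit is buffered
at the receiver and returns at once, while a larger send blocks until the peer
is sitting in the matching recv. The names run, naive_program and
ordered_program are hypothetical, and the "lower rank sends first" ordering is
just one possible way to order each pair.

```python
def run(programs, msg_size, eager_limit):
    """Simulate blocking send/recv; return True if every rank finishes.

    programs[r] is rank r's ordered list of ("send", peer) / ("recv", peer).
    """
    pc = [0] * len(programs)   # index of each rank's current operation
    buffered = {}              # (src, dst) -> eager messages awaiting a recv
    progress = True
    while progress:
        progress = False
        for r, prog in enumerate(programs):
            if pc[r] == len(prog):
                continue       # this rank has finished
            op, peer = prog[pc[r]]
            if op == "send":
                if msg_size <= eager_limit:
                    # Eager: buffered at the receiver, send returns at once.
                    buffered[(r, peer)] = buffered.get((r, peer), 0) + 1
                    pc[r] += 1
                    progress = True
                elif (pc[peer] < len(programs[peer])
                      and programs[peer][pc[peer]] == ("recv", r)):
                    # Rendezvous: both sides complete together.
                    pc[r] += 1
                    pc[peer] += 1
                    progress = True
            elif buffered.get((peer, r), 0) > 0:
                # recv satisfied by an already-buffered eager message.
                buffered[(peer, r)] -= 1
                pc[r] += 1
                progress = True
    return all(pc[r] == len(p) for r, p in enumerate(programs))


def naive_program(rank, size):
    """All sends first, then all recvs (the pattern from the question)."""
    peers = [p for p in range(size) if p != rank]
    return [("send", p) for p in peers] + [("recv", p) for p in peers]


def ordered_program(rank, size):
    """Per pair, the lower rank sends first and the higher rank recvs first."""
    prog = []
    for p in range(size):
        if p == rank:
            continue
        if rank < p:
            prog += [("send", p), ("recv", p)]
        else:
            prog += [("recv", p), ("send", p)]
    return prog


if __name__ == "__main__":
    size = 8
    naive = [naive_program(r, size) for r in range(size)]
    ordered = [ordered_program(r, size) for r in range(size)]
    print("naive, 4K message, 4K eager limit:", run(naive, 4096, 4096))
    print("naive, 8K message, zero eager:    ", run(naive, 8192, 0))
    print("ordered, 8K message, zero eager:  ", run(ordered, 8192, 0))
```

In this model the naive pattern finishes only when every message fits the eager
limit, whereas the ordered pattern finishes even with a zero eager limit, which
is the property the rule of thumb asks for.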

  George.

With Regards,
S. Biplab Raut

From: George Bosilca <bosi...@icl.utk.edu>
Sent: Tuesday, March 24, 2020 9:01 PM
To: Open MPI Users <users@lists.open-mpi.org>
Cc: Raut, S Biplab <biplab.r...@amd.com>
Subject: Re: [OMPI users] Regarding eager limit relationship to send message 
size

Biplab,

The eager limit is a constant for each BTL, and it represents the amount of
data that is sent eagerly, together with the matching information, out of the
entire message. So, if the question is how much memory is needed to store all
the eager messages, then the answer will depend on the communication pattern of
your application:
- applications using only blocking messages can only have 1 pending
communication per peer, so in the worst case any process will need at most
P * eager_size memory for local storage of the eager data.
- for applications using non-blocking communications, there is basically no
limit.
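
As a worked example of the P * eager_size bound, the small Python sketch below
plugs in the figures mentioned elsewhere in this thread (128 ranks, a 4K eager
limit). The function name eager_storage_bytes and its pending_per_peer
parameter are hypothetical; the parameter only illustrates why the bound
disappears once several non-blocking messages per peer can be in flight.

```python
def eager_storage_bytes(num_procs, eager_size, pending_per_peer=1):
    """Worst-case bytes of eager data one process may have to store.

    With blocking sends only, pending_per_peer stays at 1, which gives the
    P * eager_size bound. With non-blocking sends it can grow without limit.
    """
    return num_procs * eager_size * pending_per_peer


if __name__ == "__main__":
    blocking = eager_storage_bytes(128, 4096)
    print(blocking, "bytes =", blocking // 1024, "KiB")
    # Same job, but 1000 outstanding non-blocking sends per peer:
    print(eager_storage_bytes(128, 4096, pending_per_peer=1000), "bytes")
```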

However, the good news is that you can change this limit to adapt to the needs 
of your application(s).

Hope this answers your question,
George.


On Tue, Mar 24, 2020 at 1:46 AM Raut, S Biplab via users
<users@lists.open-mpi.org> wrote:
Dear Experts,
I would like to derive/calculate the maximum possible MPI send message size,
given the known details of btl_vader_eager_limit and the number of ranks.
Can anybody explain and confirm this?

With Regards,
S. Biplab Raut
