Dear George,

Thank you for your reply. I understand your point that I should implement a correct communication scheme instead of using the eager limit as a parameter.
However, my intention in relying on the eager limit is to avoid application memory allocation at the sending process. I am doing pairwise communication (a single in/out buffer), not all-to-all communication. I want to take advantage of the message buffering done by the eager protocol at the receiver process only, and so save the application-level buffering (required for the problem I am trying to solve) at the sending process. (Under the eager protocol, it is the responsibility of the receiving process to buffer the message upon its arrival, especially if the receive operation has not been posted.)

Is there any ompi API to query the eager limit? Or do I have to check the MCA variables and pass the value to the application myself?

With Regards,
S. Biplab Raut

From: George Bosilca <bosi...@icl.utk.edu>
Sent: Wednesday, March 25, 2020 9:58 PM
To: Raut, S Biplab <biplab.r...@amd.com>
Cc: Open MPI Users <users@lists.open-mpi.org>
Subject: Re: [OMPI users] Regarding eager limit relationship to send message size

On Wed, Mar 25, 2020 at 4:49 AM Raut, S Biplab <biplab.r...@amd.com> wrote:

> Dear George,
> Thank you for the reply. But my question is more particularly about the message size from the application side.
> Let’s say the application is running with 128 ranks. Each rank does a send() to each of the other 127 ranks, where the message length sent is under question. After all the sends are completed, each rank will recv() from each of the other 127 ranks.
> Unless the message length in the sending part is within the eager_limit (4K size), this program will hang.

This is definitively not true; one can imagine many communication patterns that will ensure correctness for your all-to-all communications. As an example, you can place your processes in a virtual ring, and at each step send and recv to/from process (my_rank + step) % comm_size.
This communication pattern will always be correct, independent of the eager size (for as long as you correctly order the send/recv for each pair).

> So, based on the above scenario, my questions are:
> 1. Can each rank send a message of up to 4K size successfully, i.e., all 128 ranks sending (128 * 4K) bytes simultaneously?

Potentially yes, but there are physical constraints (such as the number of network links, switch capabilities, ...) and memory limits. But if you have enough memory, this could potentially work. I'm not saying this is correct and should be done.

> 2. If the application has a bigger message to be sent by each rank, then how do I derive the send message size? Is it equal to eager_limit, with each rank sending multiple chunks of this size?

Definitively not! You should never rely on the eager size to fix a complex communication pattern. The rule of thumb should be: is my application working correctly if the MPI forces a zero-byte eager size? As suggested above, the most suitable approach is to define a communication scheme that would never deadlock.

George.

> With Regards,
> S. Biplab Raut

From: George Bosilca <bosi...@icl.utk.edu>
Sent: Tuesday, March 24, 2020 9:01 PM
To: Open MPI Users <users@lists.open-mpi.org>
Cc: Raut, S Biplab <biplab.r...@amd.com>
Subject: Re: [OMPI users] Regarding eager limit relationship to send message size

Biplab,

The eager size is a constant for each BTL, and it represents the data that is sent eagerly with the matching information out of the entire message. So, if the question is how much memory is needed to store all the eager messages, then the answer will depend on the communication pattern of your application:
- applications using only blocking messages might only have 1 pending communication per peer, so in the worst case any process will need at most P * eager_size memory for local storage of the eager data.
- applications using non-blocking communications: there is basically no limit.

However, the good news is that you can change this limit to adapt to the needs of your application(s).

Hope this answers your question,
George.

On Tue, Mar 24, 2020 at 1:46 AM Raut, S Biplab via users <users@lists.open-mpi.org> wrote:

> Dear Experts,
> I would like to derive/calculate the maximum MPI send message size possible given the known details of btl_vader_eager_limit and the number of ranks.
> Can anybody explain and confirm this?
>
> With Regards,
> S. Biplab Raut
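
For illustration, the virtual-ring pattern George describes in his March 25 reply can be sketched without MPI as a plain schedule check. This is a minimal sketch, not code from the thread: the function names ring_schedule and schedule_is_matched are hypothetical, and it assumes that the receive partner at each step is (my_rank - step) % comm_size, the usual counterpart of sending to (my_rank + step) % comm_size. It only verifies that every send at a given step has a receive posted for it at the same step, which is the property that makes the pattern independent of the eager size.

```python
def ring_schedule(rank, size):
    """Return [(send_to, recv_from), ...] for steps 1 .. size-1.

    Assumption (not stated in the thread): at each step a rank sends to
    (rank + step) % size and receives from (rank - step) % size.
    """
    return [((rank + step) % size, (rank - step) % size)
            for step in range(1, size)]


def schedule_is_matched(size):
    """Check that, at every step, each send has a matching receive."""
    schedules = [ring_schedule(rank, size) for rank in range(size)]
    for step in range(size - 1):
        for rank in range(size):
            send_to, _ = schedules[rank][step]
            # The peer we send to must be receiving from us at this step.
            _, peer_recv_from = schedules[send_to][step]
            if peer_recv_from != rank:
                return False
    return True


if __name__ == "__main__":
    # 128 ranks, as in the scenario discussed in the thread.
    print(schedule_is_matched(128))
    print(ring_schedule(0, 4))
```

In an actual MPI program, each step would typically be a combined send/receive with these two partners (for example via MPI_Sendrecv), so that no rank depends on eager buffering to make progress.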