Hmmm...you probably can't without digging down into the diagnostics.

Perhaps we could help more if we had some idea how you are measuring this 
"latency". I ask because that is orders of magnitude worse than anything we 
measure - so I suspect the problem is in your app (i.e., that the time you are 
measuring is actually how long it takes you to get around to processing a 
message that was received some time ago).
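
One way to separate "arrived" from "picked up" is to carry a send-side
timestamp inside the message itself and log two deltas on the receiving node:
send -> MPI_Test reporting completion, and MPI_Test -> the message actually
being processed off your queue. The second delta is purely local and shows
app-side queueing delay; the first spans two nodes, so it is only meaningful
if the node clocks are synchronized (MPI_Wtime is not globally synchronized
by default). A minimal sketch follows - the struct, tag, ranks, and sizes are
made up for illustration, not anyone's actual code:

    /* timestamp_probe.c: embed MPI_Wtime() from the sender in the payload,
     * then measure on the receiver when MPI_Test sees the message and when
     * the application finally processes it. */
    #include <mpi.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    typedef struct { double t_sent; char payload[256]; } timed_msg_t;

    int main(int argc, char **argv)
    {
        int rank;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {                     /* stands in for an SP */
            timed_msg_t msg;
            strcpy(msg.payload, "result");
            msg.t_sent = MPI_Wtime();        /* timestamp just before the send */
            MPI_Send(&msg, sizeof msg, MPI_BYTE, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {              /* stands in for the GP */
            timed_msg_t msg;
            MPI_Request req;
            MPI_Status  st;
            int flag = 0;
            MPI_Irecv(&msg, sizeof msg, MPI_BYTE, 0, 0, MPI_COMM_WORLD, &req);
            while (!flag)                    /* poll, like the controller loop */
                MPI_Test(&req, &flag, &st);
            double t_test = MPI_Wtime();     /* when MPI_Test saw completion */
            usleep(1000);                    /* stand-in for queueing delay */
            double t_proc = MPI_Wtime();     /* when the app gets to it */
            printf("send->test %.3f ms (cross-node, needs synced clocks), "
                   "test->process %.3f ms (local)\n",
                   (t_test - msg.t_sent) * 1e3, (t_proc - t_test) * 1e3);
        }
        MPI_Finalize();
        return 0;
    }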


On Oct 3, 2012, at 11:52 AM, "Hodge, Gary C" <gary.c.ho...@lmco.com> wrote:

> How do I tell the difference between when the message was received and when 
> the message was picked up in MPI_Test?
>  
> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On 
> Behalf Of Ralph Castain
> Sent: Wednesday, October 03, 2012 1:00 PM
> To: Open MPI Users
> Subject: EXTERNAL: Re: [OMPI users] unacceptable latency in gathering process
>  
> Out of curiosity, have you logged the time when the SP called "send" and 
> compared it to the time when the message was received, and when that message 
> is picked up in MPI_Test? In other words, have you actually verified that the 
> delay is in the MPI library as opposed to in your application?
>  
>  
> On Oct 3, 2012, at 9:40 AM, "Hodge, Gary C" <gary.c.ho...@lmco.com> wrote:
> 
> 
> Hi all,
> I am running on an IBM BladeCenter, using Open MPI 1.4.1, and opensm subnet 
> manager for Infiniband
>  
> Our application has real-time requirements, and it has recently been proven 
> that it does not scale to meet future requirements.
> Presently, I am re-organizing the application to process work in a more 
> parallel manner than it does now.
>  
> Jobs arrive at the rate of 200 per second and are sub-divided into groups of 
> objects by a master process (MP) on its own node.
> The MP then assigns the object groups to 20 slave processes (SP), each 
> running on its own node, to do the expensive computational work in parallel.
> The SPs then send their results to a gatherer process (GP) on its own node, 
> which merges the results for the job and sends them onward for final processing.
> The highest latency for the last 1024 jobs that were processed is then 
> written to a log file that is displayed by a GUI.
> Each process uses the same controller method for sending and receiving 
> messages as follows:
>  
> For (each CPU that sends us input)
> {
>     MPI_Irecv(…)
> }
>  
> While (true)
> {
>     For (each CPU that sends us input)
>     {
>         MPI_Test(…)
>         If (message was received)
>         {
>             Copy the message
>             Queue the copy to our input queue
>             MPI_Irecv(…)
>         }
>     }
>     If (there are messages on our input queue)
>     {
>         … process the FIRST message on queue (this may queue messages for output) …
>  
>         For (each message on our output queue)
>         {
>             MPI_Send(…)
>         }
>     }
> }
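>  
> In concrete terms, the controller looks roughly like this (a simplified 
> sketch only; NSENDERS, MSG_SIZE, sender_rank, queue_t, out_msg_t, the 
> queue_* helpers, copy_message(), and process_first() are placeholders, 
> not the real code):
>  
> #define NSENDERS 20
> #define MSG_SIZE 4096
>  
> void controller(const int *sender_rank, queue_t *input_q, queue_t *output_q)
> {
>     static char buf[NSENDERS][MSG_SIZE];
>     MPI_Request req[NSENDERS];
>  
>     /* post one nonblocking receive per sender */
>     for (int i = 0; i < NSENDERS; i++)
>         MPI_Irecv(buf[i], MSG_SIZE, MPI_BYTE, sender_rank[i], MPI_ANY_TAG,
>                   MPI_COMM_WORLD, &req[i]);
>  
>     while (1) {
>         /* poll every sender once per pass */
>         for (int i = 0; i < NSENDERS; i++) {
>             int flag = 0;
>             MPI_Status st;
>             MPI_Test(&req[i], &flag, &st);
>             if (flag) {
>                 /* copy the message, queue the copy, re-post the receive */
>                 queue_push(input_q, copy_message(buf[i], &st));
>                 MPI_Irecv(buf[i], MSG_SIZE, MPI_BYTE, sender_rank[i],
>                           MPI_ANY_TAG, MPI_COMM_WORLD, &req[i]);
>             }
>         }
>         if (!queue_empty(input_q)) {
>             /* process only the first queued message; may enqueue output */
>             process_first(input_q, output_q);
>             while (!queue_empty(output_q)) {
>                 out_msg_t *m = queue_pop(output_q);
>                 MPI_Send(m->buf, m->len, MPI_BYTE, m->dest, m->tag,
>                          MPI_COMM_WORLD);
>             }
>         }
>     }
> }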
>  
> My problem is that I do not meet our application's performance requirements 
> for a job (~20 ms) until I reduce the number of SPs from 20 to 4 or fewer.
> I added some debug output to the GP and found that there are never more than 
> 14 messages received in the for loop that calls MPI_Test.
> The messages sent from the other 6 SPs eventually arrive at the GP in a long 
> stream after experiencing high latency (over 600 ms).
>  
> Going forward, we need to handle more objects per job and will need to have 
> more than 4 SPs to keep up.
> My thought is that I have to obey this ratio of 4 SPs to 1 GP and create 
> intermediate GPs, each gathering results from 4 slaves.
>  
> Is this a contention problem at the GP?
> Is there debugging or logging I can turn on in the MPI library to prove that 
> contention is occurring?
> Can I configure MPI receive processing to improve upon the 4-to-1 ratio?
> Can I improve the controller method (listed above) to gain performance?
>  
> Thanks for any suggestions.
> Gary Hodge
>  
>  
