Re: [OMPI users] perhaps an openmpi bug, how best to identify?

Jeff Squyres Wed, 14 Jul 2010 15:20:41 -0400

On Jul 12, 2010, at 11:14 AM, Olivier Marsden wrote:

> Hi again,
> after testing as suggested, it is indeed a massive slowdown rather than
> a full-blown machine hang.


Ok.

> Would the next test be to run with debug flags for openmpi ?

You might want to run with 

   mpirun --mca mpi_yield_when_idle 1 ...

This tells the OMPI processing core to call sched_yield() each time it polls 
for progress, rather than spinning hard in a tight loop while it waits for new 
messages, etc.
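
To illustrate the difference between spinning hard and yielding while idle, 
here is a minimal sketch in Python. It is not Open MPI code; wait_for() and 
its parameters are hypothetical names made up for this example. The only real 
system call involved is sched_yield(), reached through os.sched_yield(), which 
Python provides on Unix-like platforms.

```python
import os
import time


def wait_for(ready, yield_when_idle=True, timeout=1.0):
    """Poll ready() until it returns True or `timeout` seconds elapse.

    Returns (succeeded, number_of_polls). Hypothetical helper, for
    illustration only; this is not how Open MPI is implemented.
    """
    deadline = time.monotonic() + timeout
    polls = 0
    while True:
        polls += 1
        if ready():
            return True, polls
        if time.monotonic() >= deadline:
            return False, polls
        # With yielding enabled, offer the CPU to other runnable
        # processes before polling again. Without it, the loop spins
        # at full speed and competes with everything else on the core.
        if yield_when_idle and hasattr(os, "sched_yield"):
            os.sched_yield()
```

The loop still polls in both modes; yielding only gives the scheduler a chance 
to run something else between polls, which matters most when there are more 
busy processes than processors.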

You also mentioned that you're running 7 MPI processes.  How many processors 
does your workstation have?  If you have fewer than 7, then that could explain 
what you're seeing.  If all the MPI processes are aggressively polling for 
progress, it could bring the machine to a crawl.

That being said, Open MPI *should* auto-detect that it is oversubscribing the 
machine (i.e., that it's running more processes than available processors) and 
automatically set mpi_yield_when_idle to 1 by itself.  Perhaps the 
auto-detection is broken somehow...?
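
If you want to check for oversubscription by hand, something like the 
following Python sketch would do it. It is not Open MPI's auto-detection 
logic, only a manual comparison of the process count against what 
os.cpu_count() reports. The function names and the program path are 
hypothetical; -np and --mca mpi_yield_when_idle are the mpirun options 
already discussed.

```python
import os


def is_oversubscribed(num_procs, num_cpus=None):
    """Return True if more processes are requested than processors exist."""
    if num_cpus is None:
        # os.cpu_count() can return None when the count is unknown;
        # fall back to 1 so the check errs toward "oversubscribed".
        num_cpus = os.cpu_count() or 1
    return num_procs > num_cpus


def mpirun_args(num_procs, program, num_cpus=None):
    """Build an mpirun argument list, forcing yield mode if oversubscribed."""
    args = ["mpirun", "-np", str(num_procs)]
    if is_oversubscribed(num_procs, num_cpus):
        args += ["--mca", "mpi_yield_when_idle", "1"]
    args.append(program)
    return args
```

For example, mpirun_args(7, "./my_app", num_cpus=4) includes the 
mpi_yield_when_idle setting, while the same call with num_cpus=8 does not. 
The resulting list can be passed to subprocess.run().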

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
