On Aug 24, 2012, at 10:38 AM, Jeff Squyres wrote:

> On Aug 24, 2012, at 10:28 AM, Brock Palen wrote:
> 
>> I grabbed the new OMPI 1.6.1 and ran my test that would cause a hang with 
>> 1.6.0 with low registered memory.  From reading the release notes, rather
>> than a hang I would expect:
>> 
>> * lower performance/fall back to send/receive.
>> * a notice of failed to allocate registered memory
>> 
>> In my case I still get a hang.  Is this expected?
> 
> It can still happen, yes.  The short version is that there are cases that 
> can't easily be fixed in the 1.6 series that involve lazy creation of QPs.  
> Do you see errors about OMPI failing to create CQs or QPs?

No, IMB (my simple test case) just hangs on an Alltoall indefinitely.

> 
>> This is running with default registered memory limits, and I do appreciate
>> the message that I only have 4GB of registered memory out of my 48GB.  We will
>> also be fixing our load to raise this value, which should make this issue 
>> moot.
> 
> Did you get a warning about being able to register too little memory?

Correct, I do, and I like the warning at startup.

> 
>> Honestly I think what I would want is for MPI to blow up saying "can't 
>> allocate registered memory, fatal, contact your admin", rather than fall 
>> back to send/receive and just be slower.
> 
> Right now we should be just warning if we can't register 3/4 of your physical 
> memory (we can't really test for anything more than that).  But it doesn't 
> abort.
Ok

> 
> We could add a tunable that makes it abort in this case, if you think that 
> would be useful.
I think so.  In my case that would mean a node is misconfigured, and rather
than have it run slowly I want it brought to my attention.
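
For illustration only, here is a minimal Python sketch of the behavior discussed above: warn when less than 3/4 of physical memory can be registered, and optionally abort instead of continuing. This is not Open MPI's code. The function name, the `abort_on_low` flag (standing in for the proposed tunable), and the message text are all hypothetical.

```python
import sys

GiB = 1 << 30


def check_registered_memory(registerable_bytes, physical_bytes, abort_on_low=False):
    """Hypothetical startup check, not an Open MPI API.

    Returns "ok" when at least 3/4 of physical memory can be registered.
    Otherwise it either warns and returns "warn", or raises RuntimeError
    when abort_on_low is set (the proposed "abort" tunable).
    """
    threshold = physical_bytes * 3 // 4
    if registerable_bytes >= threshold:
        return "ok"

    message = (
        "can only register %d GiB of %d GiB physical memory; "
        "contact your admin"
        % (registerable_bytes // GiB, physical_bytes // GiB)
    )
    if abort_on_low:
        # Treat a low limit as a node misconfiguration and stop the job.
        raise RuntimeError(message)

    # Default behavior described in the thread: warn, but keep running.
    print("WARNING: " + message, file=sys.stderr)
    return "warn"


if __name__ == "__main__":
    # The case from this thread: 4GB registerable on a 48GB node.
    print(check_registered_memory(4 * GiB, 48 * GiB))
```

With the numbers from this thread, the threshold is 36GB, so a 4GB limit triggers the warning. Passing `abort_on_low=True` turns the same condition into a fatal error.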

