On Aug 24, 2012, at 10:38 AM, Jeff Squyres wrote:

> On Aug 24, 2012, at 10:28 AM, Brock Palen wrote:
>
>> I grabbed the new OMPI 1.6.1 and ran my test that would cause a hang with
>> 1.6.0 with low registered memory. From reading the release notes rather
>> than hang I would expect:
>>
>> * lower performance/fall back to send/receive.
>> * a notice of failed to allocate registered memory
>>
>> In my case I still get a hang, is this expected?
>
> It can still happen, yes. The short version is that there are cases that
> can't easily be fixed in the 1.6 series that involve lazy creation of QPs.
> Do you see errors about OMPI failing to create CQ's or QP's?

No, IMB (my simple test case) just hangs on an Alltoall indefinitely.

>
>> This is running with default registered memory limits and I do appreciate
>> the message that I only have 4GB of registered memory of my 48. We will
>> also be fixing our load to raise this value, which should make this issue
>> moot.
>
> Did you get a warning about being able to register too little memory?

Correct, I do, and I like the warning at startup.

>
>> Honestly I think what I would want is for MPI to blow up saying "can't
>> allocate registered memory, fatal, contact your admin", rather than fall
>> back to send/receive and just be slower.
>
> Right now we should be just warning if we can't register 3/4 of your physical
> memory (we can't really test for anything more than that). But it doesn't
> abort.

OK.

>
> We could add a tunable that makes it abort in this case, if you think that
> would be useful.

I think so. In my case that would mean a node is misconfigured, and rather
than running slowly I want it brought to my attention.

>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users