If this is with 1.10.x or older, run with --mca memory_linux_disable 1. There 
is a bad interaction between ptmalloc2 and psm2 support. This problem is not 
present in v2.0.x and newer.
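
A minimal sketch of setting that same parameter through the environment 
rather than on the mpirun command line. It relies on Open MPI's general 
convention of reading MCA parameters from variables named OMPI_MCA_<name>. 
The process count and binary name in the commented launch line are 
placeholders, not taken from this thread.

```shell
# Environment-variable form of "--mca memory_linux_disable 1".
# Only relevant for Open MPI 1.10.x and older, per the note above.
export OMPI_MCA_memory_linux_disable=1

# Placeholder launch line (hypothetical process count and binary):
# mpirun -np 4 ./your_app
```

Setting the variable before the job starts makes it visible to the launcher 
and to every process started from that shell.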

-Nathan

> On Mar 7, 2017, at 10:30 AM, Paul Kapinos <kapi...@itc.rwth-aachen.de> wrote:
> 
> Hi Dave,
> 
> 
>> On 03/06/17 18:09, Dave Love wrote:
>> I've been looking at a new version of an application (cp2k, for what
>> it's worth) which is calling mpi_alloc_mem/mpi_free_mem, and I don't
> 
> Welcome to the club! :o)
> In our measurements we see some 70% of the time in 'mpi_free_mem'... and a 15x 
> performance loss when using Open MPI vs. Intel MPI. So it goes.
> 
> https://www.mail-archive.com/users@lists.open-mpi.org//msg30593.html
> 
> 
>> think it did so the previous version I looked at.  I found on an
>> IB-based system it's spending about half its time in those allocation
>> routines (according to its own profiling) -- a tad surprising.
>> 
>> It turns out that's due to some pathological interaction with openib,
>> and just having openib loaded.  It shows up on a single-node run iff I
>> don't suppress the openib btl, and doesn't for multi-node PSM runs iff I
>> suppress openib (on a mixed Mellanox/Infinipath system).
> 
> We're lucky: our issue is on an Intel OmniPath (OPA) network (and we will junk 
> the IB hardware in the near future, I think), so we disabled the IB transport 
> fallback:
> --mca btl ^tcp,openib
> 
> For single-node jobs this will likely also help on plain IB nodes. (You can 
> disable IB if you do not use it.)
> 
>> 
>> Can anyone say why, and whether there's a workaround?  (I can't easily
>> diagnose what it's up to as ptrace is turned off on the system
>> concerned, and I can't find anything relevant in archives.)
>> 
>> I had the idea to try libfabric instead for multi-node jobs, and that
>> doesn't show the pathological behaviour iff openib is suppressed.
>> However, it requires ompi 1.10, not 1.8, which I was trying to use.
>> _______________________________________________
>> users mailing list
>> users@lists.open-mpi.org
>> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>> 
> 
> 
> -- 
> Dipl.-Inform. Paul Kapinos   -   High Performance Computing,
> RWTH Aachen University, IT Center
> Seffenter Weg 23,  D 52074  Aachen (Germany)
> Tel: +49 241/80-24915
> 
