If this is with 1.10.x or older, run with --mca memory_linux_disable 1. There is a bad interaction between ptmalloc2 and psm2 support. This problem is not present in v2.0.x and newer.
-Nathan

> On Mar 7, 2017, at 10:30 AM, Paul Kapinos <kapi...@itc.rwth-aachen.de> wrote:
>
> Hi Dave,
>
>> On 03/06/17 18:09, Dave Love wrote:
>> I've been looking at a new version of an application (cp2k, for what
>> it's worth) which is calling mpi_alloc_mem/mpi_free_mem, and I don't
>
> Welcome to the club! :o)
> In our measurements we see some 70% of time in 'mpi_free_mem'... and 15x
> performance loss if using Open MPI vs. Intel MPI. So it goes.
>
> https://www.mail-archive.com/users@lists.open-mpi.org//msg30593.html
>
>> think it did so the previous version I looked at. I found on an
>> IB-based system it's spending about half its time in those allocation
>> routines (according to its own profiling) -- a tad surprising.
>>
>> It turns out that's due to some pathological interaction with openib,
>> and just having openib loaded. It shows up on a single-node run iff I
>> don't suppress the openib btl, and doesn't for multi-node PSM runs iff I
>> suppress openib (on a mixed Mellanox/Infinipath system).
>
> we're lucky - our issue is on Intel OmniPath (OPA) network (and we will junk
> IB hardware in near future, I think) - so we disabled the IB transport
> fallback,
> --mca btl ^tcp,openib
>
> For single-node jobs this will also help on plain IB nodes, likely. (you can
> disable IB if you do not use it)
>
>>
>> Can anyone say why, and whether there's a workaround? (I can't easily
>> diagnose what it's up to as ptrace is turned off on the system
>> concerned, and I can't find anything relevant in archives.)
>>
>> I had the idea to try libfabric instead for multi-node jobs, and that
>> doesn't show the pathological behaviour iff openib is suppressed.
>> However, it requires ompi 1.10, not 1.8, which I was trying to use.
>> _______________________________________________
>> users mailing list
>> users@lists.open-mpi.org
>> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>
> --
> Dipl.-Inform. Paul Kapinos - High Performance Computing,
> RWTH Aachen University, IT Center
> Seffenter Weg 23, D 52074 Aachen (Germany)
> Tel: +49 241/80-24915
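
For anyone scripting job launches, the two settings quoted in this thread (memory_linux_disable for 1.10.x or older, and the btl exclusion list) could be assembled roughly as below. This is only a sketch: the helper name build_mpirun_command, its parameters, and the ./my_app binary are hypothetical placeholders, and whether each setting is appropriate depends on the Open MPI version and fabric in use.

```python
import shlex


def build_mpirun_command(app, nprocs, disable_memory_hooks=True,
                         excluded_btls=("tcp", "openib")):
    """Assemble an mpirun argument list from the MCA settings quoted in the thread.

    Hypothetical helper for illustration; it only builds the command line and
    does not launch anything.
    """
    cmd = ["mpirun", "-np", str(nprocs)]
    if disable_memory_hooks:
        # Suggested in the thread for Open MPI 1.10.x or older.
        cmd += ["--mca", "memory_linux_disable", "1"]
    if excluded_btls:
        # A leading '^' tells Open MPI to exclude the listed BTL components.
        cmd += ["--mca", "btl", "^" + ",".join(excluded_btls)]
    cmd.append(app)
    return cmd


if __name__ == "__main__":
    # './my_app' is a placeholder binary name, not taken from the thread.
    # shlex.join quotes the '^' argument so the line is safe to paste into a shell.
    print(shlex.join(build_mpirun_command("./my_app", 4)))
```

Keeping the arguments as a list avoids shell-quoting surprises with the '^' character if the list is passed directly to a process launcher.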