On Apr 24, 2013, at 6:01 AM, Derbunovich Andrei <abderbunov...@compcenter.org> wrote:
> Thank you to everybody for suggestions and comments.
>
> I have used a relatively small number of nodes (4400). It looks like the
> main issue is that I didn't disable dynamic component opening in my
> Open MPI build while keeping the MPI installation directory on a network
> file system. Oh my god!

Ouch! Yep, that will slow things down a lot. Did you try using the LANL
platform files? They build everything static just to avoid this problem.

> I didn't check the suggestion about using the debruijn routed component yet.

Nathan and I are debugging it. Meantime, please try the radix component I
suggested in an earlier email.

> -Andrei
>
> -----Original Message-----
> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On
> Behalf Of Ralph Castain
> Sent: Tuesday, April 23, 2013 10:07 PM
> To: Open MPI Users
> Subject: Re: [OMPI users] OpenMPI at scale on Cray XK7
>
> On Apr 23, 2013, at 10:45 AM, Nathan Hjelm <hje...@lanl.gov> wrote:
>
>> On Tue, Apr 23, 2013 at 10:17:46AM -0700, Ralph Castain wrote:
>>>
>>> On Apr 23, 2013, at 10:09 AM, Nathan Hjelm <hje...@lanl.gov> wrote:
>>>
>>>> On Tue, Apr 23, 2013 at 12:21:49PM +0400, Derbunovich Andrei wrote:
>>>>> Hi,
>>>>>
>>>>> Nathan, could you please advise what the expected startup time is
>>>>> for an Open MPI job at such scale (128K ranks)? I'm interested in
>>>>> 1) the time from mpirun start to completion of MPI_Init()
>>>>
>>>> It takes less than a minute to run:
>>>>
>>>> mpirun -n 131072 /bin/true
>>>>
>>>>> 2) the time from MPI_Init() start to completion of MPI_Init()
>>>>
>>>> A simple MPI application took about 1.25 minutes to run. If you want
>>>> to see our setup, you can take a look at
>>>> contrib/platform/lanl/cray_xe6.
>>>>
>>>>> From my experience, for a 52800-rank job
>>>>> 1) took around 20 min
>>>>> 2) took around 12 min
>>>>> which actually looks like a hang.
>>>>
>>>> How many nodes? I have never seen launch times that bad on Cielo. You
>>>> could try adding -mca routed debruijn -novm and see if that helps. It
>>>> will reduce the amount of communication between compute nodes and the
>>>> login node.
>>>
>>> I believe the debruijn module was turned off a while ago due to a bug
>>> that wasn't fixed. However, try using
>>
>> Was it turned off, or was the priority lowered? If it was lowered, then
>> -mca routed debruijn should work. The -novm is to avoid the bug (as I
>> understand it). I am working on fixing the bug now in the hope it will
>> be ready for 1.7.2.
>
> Pretty sure it is ompi_ignored and thus not in the tarball.
>
>> -Nathan
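For anyone wanting to reproduce the static build Ralph refers to, here is a
minimal sketch of the configure invocations, assuming an Open MPI 1.7-era
source tree; the install prefix and the -j value are placeholders, not taken
from the thread:

    # Reuse the LANL platform file mentioned above, which builds the
    # components statically into the libraries:
    ./configure --with-platform=contrib/platform/lanl/cray_xe6 \
        --prefix=/opt/openmpi-static
    make -j8 install

    # Or request the same effect by hand: --disable-dlopen compiles the
    # components into the libraries, so nothing is dlopen'ed from a
    # network file system at launch time.
    ./configure --disable-dlopen --enable-static --disable-shared \
        --prefix=/opt/openmpi-static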
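And a sketch of the two launch-time experiments discussed in the thread. The
-mca routed debruijn -novm flags are verbatim from Nathan's suggestion; the
radix fanout value and the ./my_app name are illustrative only, since Ralph's
earlier email with his exact radix setting is not shown here:

    # Nathan's suggestion: debruijn routing, with -novm to sidestep the
    # known bug:
    mpirun -mca routed debruijn -novm -n 131072 ./my_app

    # Ralph's alternative: the radix routed component; 64 is only an
    # example fanout, set via the routed_radix MCA parameter:
    mpirun -mca routed radix -mca routed_radix 64 -n 131072 ./my_app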