I was able to reproduce the issue on Ubuntu with a 3.13 kernel. I think I know what is going wrong and I am working on a fix.
-Nathan

On Tue, Mar 17, 2015 at 12:02:43PM +0100, Tobias Kloeffel wrote:
> Hello Nathan,
>
> I am using:
> IMB 4.0 Update 2
> gcc version 4.8.1
> Intel compilers 15.0.1 20141023
> xpmem from your GitHub
>
> I also tested pwscf (Quantum ESPRESSO); here I can observe the same
> behavior. The entire calculation runs without problems, but a few MPI
> procs just stay alive and refuse to die, even with signal 9.
> Open MPI and pw were built with the Intel compilers, xpmem with gcc.
>
> Kind regards,
> Tobias
>
> On 03/16/2015 05:56 PM, Nathan Hjelm wrote:
> > What program are you using for the benchmark? Are you using the xpmem
> > branch in my GitHub? For my testing I used a stock Ubuntu 3.13 kernel
> > but I have not fully stress-tested my xpmem branch.
> >
> > I will see if I can reproduce and fix the hang.
> >
> > -Nathan
> >
> > On Mon, Mar 16, 2015 at 05:32:26PM +0100, Tobias Kloeffel wrote:
> > > Hello everyone,
> > >
> > > currently I am benchmarking the different single-copy mechanisms
> > > knem/cma/xpmem on a Xeon E5 V3 machine.
> > > I am using Open MPI 1.8.4 with the CMA patch for vader.
> > >
> > > While it turns out that xpmem is the clear winner (reproducing Nathan
> > > Hjelm's results), I always run into a problem at the MPI finalizing step.
> > > At this step, at least one process hangs and can't be killed anymore. To
> > > get rid of the hanging process, the server has to be rebooted.
> > >
> > > The applications finish successfully.
> > >
> > > Unfortunately, I can't find any further development of the xpmem module.
> > > Is this bug known to anyone? What kernel versions do you use?
> > >
> > > Any help would be appreciated.
> > >
> > > Tested kernel versions:
> > > 3.11.25-desktop (openSUSE)
> > > 3.18.9 (vanilla)
> > > 3.19.1 (vanilla)
> > >
> > > --
> > > M.Sc. Tobias Klöffel
> > > =======================================================
> > > Interdisciplinary Center for Molecular Materials (ICMM)
> > > and Computer-Chemistry-Center (CCC)
> > > Department Chemie und Pharmazie
> > > Friedrich-Alexander-Universität Erlangen-Nürnberg
> > > Nägelsbachstr. 25
> > > D-91052 Erlangen, Germany
> > >
> > > Room: 2.307
> > > Phone: +49 (0) 9131 / 85 - 20421
> > > Fax: +49 (0) 9131 / 85 - 26565
> > >
> > > =======================================================
> > > Department of Materials Science and Engineering
> > > Institute I: General Materials Properties
> > > Friedrich-Alexander-Universität Erlangen-Nürnberg
> > > Martensstr. 5, D-91058 Erlangen, Germany
> > > Office 3.40
> > > Phone: (+49) 9131 85 27 -486
> > > http://www.gmp.ww.uni-erlangen.de
> > >
> > > E-mail: tobias.kloef...@fau.de
> > >
> > > _______________________________________________
> > > users mailing list
> > > us...@open-mpi.org
> > > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> > > Link to this post:
> > > http://www.open-mpi.org/community/lists/users/2015/03/26479.php
> >
> > Link to this post:
> > http://www.open-mpi.org/community/lists/users/2015/03/26480.php
>
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2015/03/26483.php