Re: [OMPI users] mpicc and libstdc++, general building question

2017-04-08 Thread Christof Koehler
0, Reuti wrote: > -BEGIN PGP SIGNED MESSAGE- > Hash: SHA1 > > Hi, > > On 07.04.2017 at 19:11, Christof Koehler wrote: > > > […] > > > > On top, all OpenMPI libraries when checked with ldd (gcc 6.3 module > > still loaded) reference the /usr/lib64

[OMPI users] mpicc and libstdc++, general building question

2017-04-07 Thread Christof Koehler
Hello everybody, we are trying to build an application with specific C++ requirements and for some reason we get the "wrong" libstdc++ library in the end. At the moment the working hypothesis is that this might be related to our OpenMPI build. I will try to explain below. The application, which i

Re: [OMPI users] Abort/ Deadlock issue in allreduce (Gilles Gouaillardet)

2016-12-12 Thread Christof Koehler
6-8886-5308-bfed9af2a...@rist.or.jp> > Content-Type: text/plain; charset="windows-1252"; Format="flowed" > > Christof, > > > Ralph fixed the issue, > > meanwhile, the patch can be manually downloaded at > https://patch-diff.githubusercontent.com/ra

Re: [OMPI users] Abort/ Deadlock issue in allreduce

2016-12-09 Thread Christof Koehler
Hello, our case is. The libwannier.a is a "third party" library which is built separately and then just linked in. So the vasp preprocessor never touches it. As far as I can see no preprocessing of the f90 source is involved in the libwannier build process. I finally managed to set a breakpoint a

Re: [OMPI users] Abort/ Deadlock issue in allreduce

2016-12-08 Thread Christof Koehler
nd mpirun does nothing with it ? Cheers Christof On Thu, Dec 08, 2016 at 12:39:06PM +0100, Christof Koehler wrote: > Hello, > > On Thu, Dec 08, 2016 at 08:05:44PM +0900, Gilles Gouaillardet wrote: > > Christof, > > > > > > There is something really odd with this st

Re: [OMPI users] Abort/ Deadlock issue in allreduce

2016-12-08 Thread Christof Koehler
316K rw--- [ stack ] 7ffd8cfa4000 8K r-x-- [ anon ] ff60 4K r-x-- [ anon ] total 539352K > > Cheers, > > Gilles > > On Thursday, December 8, 2016, Christof Koehler < > christof.koeh...@bccms.uni-bremen.de> wrote: > >

Re: [OMPI users] Abort/ Deadlock issue in allreduce

2016-12-08 Thread Christof Koehler
lem that is pending > in PR #2528: https://github.com/open-mpi/ompi/pull/2528 > <https://github.com/open-mpi/ompi/pull/2528> > > Would you be able to try that? > > Ralph > > > On Dec 7, 2016, at 9:37 AM, Christof Koehler > > wrote: > > > > Hello,

Re: [OMPI users] Abort/ Deadlock issue in allreduce

2016-12-07 Thread Christof Koehler
Hello, On Wed, Dec 07, 2016 at 10:19:10AM -0500, Noam Bernstein wrote: > > On Dec 7, 2016, at 10:07 AM, Christof Koehler > > wrote: > >> > > I really think the hang is a consequence of > > unclean termination (in the sense that the non-root ranks are not >

Re: [OMPI users] Abort/ Deadlock issue in allreduce

2016-12-07 Thread Christof Koehler
my interpretation of what I see. Would you have any suggestion to catch signals sent between orterun (mpirun) and the child tasks ? I will try to get the information you want, but I will have to figure out how to do that first. Cheers Christof > > Cheers, > > Gilles > >

Re: [OMPI users] Abort/ Deadlock issue in allreduce

2016-12-07 Thread Christof Koehler
(argc=13, argv=0x7ffef20a79f8) at main.c:13 Best Regards Christof On Wed, Dec 07, 2016 at 02:07:27PM +0100, Christof Koehler wrote: > Hello, > > thank you for the fast answer. > > On Wed, Dec 07, 2016 at 08:23:43PM +0900, Gilles Gouaillardet wrote: > > Christoph, >

Re: [OMPI users] Abort/ Deadlock issue in allreduce

2016-12-07 Thread Christof Koehler
an try to catch the contents of ivec, but I do not think that would be helpful ? If you need them I can try of course, I have no idea how large the vector is. Best Regards Christof [1] https://www.vasp.at/ [2] http://www.wannier.org/, Old version 1.2 > > > > Cheers, > > Gille

[OMPI users] Abort/ Deadlock issue in allreduce

2016-12-07 Thread Christof Koehler
Hello everybody, I am observing a deadlock in allreduce with openmpi 2.0.1 on a single node. A stack trace (pstack) of one rank is below showing the program (vasp 5.3.5) and the two psm2 progress threads. However: In fact, the vasp input is not ok and it should abort at the point where it hangs.

Re: [OMPI users] ScaLapack tester fails with 2.0.1, works with 1.10.4; Intel Omni-Path

2016-11-28 Thread Christof Koehler
ght inherit different values, and in > > some cases these values might be erroneous (in your example the printed > > values should not be NaN). > > > > Moreover, the eigenvalue tester is a pretty sensitive piece of code. I > > would strongly suggest you send an email wit

Re: [OMPI users] ScaLapack tester fails with 2.0.1, works with 1.10.4; Intel Omni-Path

2016-11-24 Thread Christof Koehler
lue tester is a pretty sensitive piece of code. I > would strongly suggest you send an email with your findings to the > ScaLAPACK mailing list. > > George. > > > > On Wed, Nov 23, 2016 at 9:41 AM, Christof Koehler < > christof.koeh...@bccms.uni-bremen.de> wrote:

Re: [OMPI users] ScaLapack tester fails with 2.0.1, works with 1.10.4; Intel Omni-Path

2016-11-23 Thread Christof Koehler
ty, could you try to > mpirun --mca coll ^tuned ... > and see if it helps ? > > Cheers, > > Gilles > > > On Tue, Nov 22, 2016 at 7:21 PM, Christof Koehler > wrote: > > Hello again, > > > > I tried to replicate the situation on the workstation at

Re: [OMPI users] ScaLapack tester fails with 2.0.1, works with 1.10.4; Intel Omni-Path

2016-11-22 Thread Christof Koehler
row. Thank you all for your help ! This is really strange. Cheers Christof > > Cheers, > > Gilles > > > On Tue, Nov 22, 2016 at 7:21 PM, Christof Koehler > wrote: > > Hello again, > > > > I tried to replicate the situation on the workstation at my d

Re: [OMPI users] ScaLapack tester fails with 2.0.1, works with 1.10.4; Intel Omni-Path

2016-11-22 Thread Christof Koehler
Hello again, I tried to replicate the situation on the workstation at my desk, running Ubuntu 14.04 (gcc 4.8.4) and with the OS-supplied lapack and blas libraries. With openmpi 2.0.1 (mpirun -np 4 xdsyevr) I get "136 tests completed and failed." with the "IL, IU, VL or VU altered by PDSYEVR" mes