Siegmar,

Is this issue specific to the PGI compiler? What if you raise the stack size limit with ulimit -s before invoking mpirun? Is that good enough to work around the problem?
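
A minimal sketch of the ulimit check, assuming a bash-like shell on the node where the job is launched. The variable names (soft, hard, raised) are only illustrative, and whether the new limit actually reaches the Java processes started by mpirun is an assumption that would need to be verified. The "max 8180KB" in the error is close to the common 8192 KB default stack limit, which is why the limit is worth checking first.

```shell
# Current soft and hard stack limits (in KB, or "unlimited").
soft=$(ulimit -S -s)
hard=$(ulimit -H -s)
echo "stack soft limit: $soft, hard limit: $hard"

# Raise the soft limit to the hard limit inside a command substitution,
# which runs in a subshell, so the calling shell keeps its own limit.
raised=$(ulimit -S -s "$hard"; ulimit -S -s)
echo "stack limit after raising: $raised"

# To try the workaround for real, raise the limit in the same shell that
# launches the job (commented out here; the command is taken from the report):
#   ulimit -S -s "$hard"
#   mpiexec -np 4 --host loki:4 java MatMultWithAnyProc2DarrayIn1DarrayMain
```

If the error persists with the raised limit, that would suggest the stack size itself is not the cause, since the failing request is only 42 bytes.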

Cheers,

Gilles

On Tue, Mar 26, 2019 at 6:32 PM Siegmar Gross <siegmar.gr...@informatik.hs-fulda.de> wrote:
>
> Hi,
>
> I've installed openmpi-v4.0.x-201903220241-97aa434 and
> openmpi-master-201903260242-dfbc144 on my "SUSE Linux Enterprise Server 12.3
> (x86_64)" with pgcc-18.4. Unfortunately, I still get the following error for
> my Java programs for openmpi-master. Everything works as expected for
> openmpi-4.0.x. I've already reported the error some time ago.
> https://users.open-mpi.narkive.com/qz90agAO/ompi-users-stack-overflow-in-routine-alloca-for-java-programs-in-openmpi-master-with-pgcc-18-4
>
>
> loki java 127 ompi_info | grep "Configure command line:"
>   Configure command line: '--prefix=/usr/local/openmpi-master_64_pgcc'
> '--libdir=/usr/local/openmpi-master_64_pgcc/lib64'
> '--with-jdk-bindir=/usr/local/jdk-11/bin'
> '--with-jdk-headers=/usr/local/jdk-11/include' 'JAVA_HOME=/usr/local/jdk-11'
> 'LDFLAGS=-m64 -Wl,-z -Wl,noexecstack -L/usr/local/pgi/linux86-64/18.4/lib
> -R/usr/local/pgi/linux86-64/18.4/lib' 'LIBS=-lpgm' 'CC=pgcc' 'CXX=pgc++'
> 'FC=pgfortran' 'CFLAGS=-c11 -m64' 'CXXFLAGS=-m64' 'FCFLAGS=-m64' 'CPP=cpp'
> 'CXXCPP=cpp' '--enable-mpi-cxx' '--enable-cxx-exceptions' '--enable-mpi-java'
> '--with-valgrind=/usr/local/valgrind' '--with-hwloc=internal'
> '--without-verbs'
> '--with-wrapper-cflags=-c11 -m64' '--with-wrapper-cxxflags=-m64'
> '--with-wrapper-fcflags=-m64' '--enable-debug'
>
>
> loki java 128 mpiexec -np 4 --host loki:4 java
> MatMultWithAnyProc2DarrayIn1DarrayMain
> Error: in routine alloca() there is a
> stack overflow: thread 0, max 8180KB, used 0KB, request 42B
> Error: in routine alloca() there is a
> stack overflow: thread 0, max 8180KB, used 0KB, request 42B
> Error: in routine alloca() there is a
> stack overflow: thread 0, max 8180KB, used 0KB, request 42B
> --------------------------------------------------------------------------
> Primary job terminated normally, but 1 process returned
> a non-zero exit code. Per user-direction, the job has been aborted.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> mpiexec detected that one or more processes exited with non-zero status, thus
> causing
> the job to be terminated. The first process to do so was:
>
>   Process name: [[44592,1],1]
>   Exit code: 127
> --------------------------------------------------------------------------
> loki java 129
>
>
>
> I would be grateful, if somebody can fix the problem. Do you need anything
> else? Thank you very much for any help in advance.
>
>
> Kind regards
>
> Siegmar
> _______________________________________________
> users mailing list
> users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/users