Not for this test case size. You should be just fine with the default values.
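For anyone who wants to double-check what a node is actually handing out, the following is a minimal sketch (plain POSIX getrlimit(), nothing Open MPI-specific) that prints the soft and hard open-files limits for the process that runs it:

/* Sketch: print the per-process open-files limits (RLIMIT_NOFILE).
 * Plain POSIX; nothing here is specific to Open MPI. */
#include <stdio.h>
#include <sys/resource.h>

int main(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0) {
        perror("getrlimit");
        return 1;
    }

    /* RLIM_INFINITY prints as a very large number; otherwise these are
     * the numeric soft and hard limits in effect for this process. */
    printf("open files: soft=%llu hard=%llu\n",
           (unsigned long long) rl.rlim_cur,
           (unsigned long long) rl.rlim_max);
    return 0;
}

Each launched rank inherits these limits from whatever starts it (the orted daemon or the resource manager), so running the same check under mpirun shows what the application itself will see.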
If I understand you correctly, you've run this app at scale before on another cluster without problem?

On Jul 19, 2014, at 1:34 PM, Lane, William <william.l...@cshs.org> wrote:

> Ralph,
>
> It's hard to imagine it's the openMPI code because I've tested this code
> extensively on another cluster with 400 nodes and never had any problems.
> But I'll try using the hello_c example in any case. Is it still recommended to
> raise the open files soft and hard limits to 4096? Or should even larger values
> be necessary?
>
> Thank you for your help.
>
> -Bill Lane
>
> From: users [users-boun...@open-mpi.org] on behalf of Ralph Castain [r...@open-mpi.org]
> Sent: Saturday, July 19, 2014 8:07 AM
> To: Open MPI Users
> Subject: Re: [OMPI users] Mpirun 1.5.4 problems when request > 28 slots
>
> That's a pretty old OMPI version, and we don't really support it any longer.
> However, I can provide some advice:
>
> * have you tried running the simple "hello_c" example we provide? This would
>   at least tell you if the problem is in your app, which is what I'd expect
>   given your description
>
> * try using gdb (or pick your debugger) to look at the corefile and see where
>   it is failing
>
> I'd also suggest updating OMPI to the 1.6.5 or 1.8.1 versions, but I doubt
> that's the issue behind this problem.
>
>
> On Jul 19, 2014, at 1:05 AM, Lane, William <william.l...@cshs.org> wrote:
>
>> I'm getting consistent errors of the form:
>>
>> "mpirun noticed that process rank 3 with PID 802 on node csclprd3-0-8 exited
>> on signal 11 (Segmentation fault)."
>>
>> whenever I request more than 28 slots. These errors even occur when I run
>> mpirun locally on a compute node that has 32 slots (8 cores, 16 with
>> hyperthreading).
>>
>> When I run less than 28 slots I have no problems whatsoever.
>>
>> OS:
>> CentOS release 6.3 (Final)
>>
>> openMPI information:
>> Package: Open MPI mockbu...@c6b8.bsys.dev.centos.org Distribution
>> Open MPI: 1.5.4
>> Open MPI SVN revision: r25060
>> Open MPI release date: Aug 18, 2011
>> Open RTE: 1.5.4
>> Open RTE SVN revision: r25060
>> Open RTE release date: Aug 18, 2011
>> OPAL: 1.5.4
>> OPAL SVN revision: r25060
>> OPAL release date: Aug 18, 2011
>> Ident string: 1.5.4
>> Prefix: /usr/lib64/openmpi
>> Configured architecture: x86_64-unknown-linux-gnu
>> Configure host: c6b8.bsys.dev.centos.org
>> Configured by: mockbuild
>> Configured on: Fri Jun 22 06:42:03 UTC 2012
>> Configure host: c6b8.bsys.dev.centos.org
>> Built by: mockbuild
>> Built on: Fri Jun 22 06:46:48 UTC 2012
>> Built host: c6b8.bsys.dev.centos.org
>> C bindings: yes
>> C++ bindings: yes
>> Fortran77 bindings: yes (all)
>> Fortran90 bindings: yes
>> Fortran90 bindings size: small
>> C compiler: gcc
>> C compiler absolute: /usr/bin/gcc
>> C compiler family name: GNU
>> C compiler version: 4.4.6
>> C++ compiler: g++
>> C++ compiler absolute: /usr/bin/g++
>> Fortran77 compiler: gfortran
>> Fortran77 compiler abs: /usr/bin/gfortran
>> Fortran90 compiler: gfortran
>> Fortran90 compiler abs: /usr/bin/gfortran
>> C profiling: yes
>> C++ profiling: yes
>> Fortran77 profiling: yes
>> Fortran90 profiling: yes
>> C++ exceptions: no
>> Thread support: posix (MPI_THREAD_MULTIPLE: no, progress: no)
>> Sparse Groups: no
>> Internal debug support: no
>> MPI interface warnings: no
>> MPI parameter check: runtime
>> Memory profiling support: no
>> Memory debugging support: no
>> libltdl support: yes
>> Heterogeneous support: no
>> mpirun default --prefix: no
>> MPI I/O support: yes
>> MPI_WTIME support: gettimeofday
>> Symbol vis. support: yes
>> MPI extensions: affinity example
>> FT Checkpoint support: no (checkpoint thread: no)
>> MPI_MAX_PROCESSOR_NAME: 256
>> MPI_MAX_ERROR_STRING: 256
>> MPI_MAX_OBJECT_NAME: 64
>> MPI_MAX_INFO_KEY: 36
>> MPI_MAX_INFO_VAL: 256
>> MPI_MAX_PORT_NAME: 1024
>> MPI_MAX_DATAREP_STRING: 128
>> MCA backtrace: execinfo (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA memchecker: valgrind (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA memory: linux (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA paffinity: hwloc (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA carto: auto_detect (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA carto: file (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA maffinity: first_use (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA maffinity: libnuma (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA timer: linux (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA installdirs: env (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA installdirs: config (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA dpm: orte (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA pubsub: orte (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA allocator: basic (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA allocator: bucket (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA coll: basic (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA coll: hierarch (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA coll: inter (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA coll: self (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA coll: sm (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA coll: sync (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA coll: tuned (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA mpool: fake (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA mpool: rdma (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA mpool: sm (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA pml: bfo (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA pml: csum (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA pml: ob1 (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA pml: v (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA bml: r2 (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA rcache: vma (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA btl: ofud (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA btl: openib (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA btl: self (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA btl: sm (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA btl: tcp (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA topo: unity (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA osc: pt2pt (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA osc: rdma (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA iof: hnp (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA iof: orted (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA iof: tool (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA oob: tcp (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA odls: default (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA ras: cm (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA ras: gridengine (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA ras: loadleveler (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA ras: slurm (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA rmaps: load_balance (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA rmaps: rank_file (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA rmaps: resilient (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA rmaps: round_robin (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA rmaps: seq (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA rmaps: topo (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA rml: oob (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA routed: binomial (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA routed: cm (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA routed: direct (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA routed: linear (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA routed: radix (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA routed: slave (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA plm: rsh (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA plm: rshd (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA plm: slurm (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA filem: rsh (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA errmgr: default (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA ess: env (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA ess: hnp (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA ess: singleton (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA ess: slave (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA ess: slurm (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA ess: slurmd (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA ess: tool (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA grpcomm: bad (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA grpcomm: basic (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA grpcomm: hier (MCA v2.0, API v2.0, Component v1.5.4)
>> MCA notifier: command (MCA v2.0, API v1.0, Component v1.5.4)
>> MCA notifier: smtp (MCA v2.0, API v1.0, Component v1.5.4)
>> MCA notifier: syslog (MCA v2.0, API v1.0, Component v1.5.4)
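For readers who land on this thread: the "hello_c" program mentioned above ships in the examples/ directory of the Open MPI source tree. A minimal program in the same spirit (a sketch, not the exact file distributed with Open MPI) looks like this:

/* Minimal MPI "hello" in the spirit of the hello_c example: each rank
 * reports its rank, the world size, and the host it is running on. */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, size, len;
    char name[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Get_processor_name(name, &len);

    printf("Hello from rank %d of %d on %s\n", rank, size, name);

    MPI_Finalize();
    return 0;
}

Compiled with mpicc and launched with something like "mpirun -np 32 ./hello", it exercises the same launch and wire-up path as the full application; if it also segfaults once more than 28 slots are requested, the problem lies in the MPI/launch layer rather than in the application code.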