> On Oct 16, 2015, at 3:25 PM, McGrattan, Kevin B. Dr. <kevin.mcgrat...@nist.gov> wrote:
>
> I cannot nail this down any better because this happens like every other
> night, with about 1 out of a hundred jobs. Can anyone think of a reason why
> the job would seg fault in MPI_FINALIZE, but only under conditions where the
> jobs are tightly packed onto our cluster?
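To make the failure point concrete: MPI_FINALIZE is the normal teardown call at the very end of a run, so the crash is happening during MPI's own shutdown rather than in the application's communication. A minimal, purely illustrative skeleton of that pattern (not the actual application code) looks like this:

#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    /* ... application work: halo exchanges, reductions, I/O, etc. ... */

    /* The reported segfault occurs inside this call, i.e. during MPI's
     * own teardown rather than in the application's communication. */
    MPI_Finalize();
    return 0;
}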
There have been a bunch of fixes in the ORTE / MPI_FINALIZE areas of Open MPI since 1.8.4. Is there any chance you can upgrade to 1.8.8, or, better yet, 1.10.0?

(Note that even though it's 1.10, it's effectively a continuation of the v1.8 series -- the v1.10 series does not represent a new fork from our development master. See the full version/roadmap details here, if you're interested: http://blogs.cisco.com/performance/open-mpi-new-versioning-scheme-and-roadmap)

--
Jeff Squyres
jsquy...@cisco.com

For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
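P.S. One quick sanity check, in case more than one Open MPI install is visible on the cluster and it isn't obvious which one a given binary picks up at run time: the standard MPI-3 call MPI_Get_library_version() reports the library version from inside the application (ompi_info reports the same information from the command line). A minimal, purely illustrative example, not anything from the application in question:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    char version[MPI_MAX_LIBRARY_VERSION_STRING];
    int len, rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Print the MPI library each rank is actually linked against. */
    MPI_Get_library_version(version, &len);
    if (rank == 0) {
        printf("MPI library: %s\n", version);
    }

    MPI_Finalize();
    return 0;
}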