I'm running OpenMPI 1.8.7 tests on a mixed cluster of various systems
under CentOS 6.3, and I've been intermittently getting warnings about not having
the proper NUMA libraries installed. Which NUMA libraries should be installed
for CentOS 6.3 and OpenMPI 1.8.7?
Here's what I currently have installed:
Hi Bill
You need numactl-devel on the nodes. Not having it means we cannot ensure
memory is bound locally to the procs, which will hurt performance but not
much else. There is an MCA param to turn off the warnings if you choose not
to install the libs: hwloc_base_mem_bind_failure_action=silent
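For example, on a yum-based CentOS node it would be something along these
lines (the app name and process count are just placeholders):

    # install the NUMA libraries on each compute node
    yum install numactl numactl-devel

    # or silence the warning for a single run
    mpirun --mca hwloc_base_mem_bind_failure_action silent -np 4 ./my_app

    # or set it permanently in <prefix>/etc/openmpi-mca-params.conf
    hwloc_base_mem_bind_failure_action = silent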
Ralph
Hi Nate,
Sorry for the delay in getting back. Thanks for the sanity check. You may
have a point about the args string to MPI.Init -
there's nothing Open MPI needs from it, but that is a difference
from your use case - your app has an argument.
Would you mind adding a
System.gc()
call and seeing whether that triggers the segfault?
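For example, roughly like this (just a sketch; the class name and the
placement right after MPI.Init are placeholders, the point is only to force a
collection early):

    import mpi.MPI;

    public class GcCheck {
        public static void main(String[] args) throws Exception {
            MPI.Init(args);
            System.gc();          // force a collection right after init
            // ... the rest of the application ...
            MPI.Finalize();
        }
    }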
I read @
https://www.open-mpi.org/faq/?category=sge
that for OpenMPI Parallel Environments there's
a special consideration for Son of Grid Engine:
'"qsort_args" is necessary with the Son of Grid Engine distribution,
version 8.1.1 and later, and probably only applicable to it. For
very
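For reference, a parallel environment along the lines that FAQ entry describes
might look roughly like this (the values are only an illustration, not a
verified recipe; qsort_args is the Son of Grid Engine field in question):

    pe_name            orte
    slots              9999
    user_lists         NONE
    xuser_lists        NONE
    start_proc_args    /bin/true
    stop_proc_args     /bin/true
    allocation_rule    $fill_up
    control_slaves     TRUE
    job_is_first_task  FALSE
    urgency_slots      min
    accounting_summary FALSE
    qsort_args         NONE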
You know, I honestly don't know - there is a patch in there for qsort, but
I haven't checked it against SGE. Let us know if you hit a problem and
we'll try to figure it out.
Glad to hear your cluster is working - nice to have such challenges to
shake the cobwebs out :-)
On Wed, Aug 5, 2015 at 12:
Howard,
Thanks for looking at all this. Adding System.gc() did not cause it to
segfault. The segfault still comes much later in the processing.
I was able to reduce my code to a single test file without other
dependencies. It is attached. This code simply opens a text file and reads
its lines, one at a time.
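In outline the test does something like the following (a sketch of the shape
described above, not the actual attachment; the input file name is made up):

    import java.io.BufferedReader;
    import java.io.FileReader;
    import mpi.MPI;

    public class ReadLinesTest {
        public static void main(String[] args) throws Exception {
            MPI.Init(args);
            int rank = MPI.COMM_WORLD.getRank();
            // open a text file and read it line by line
            BufferedReader in = new BufferedReader(new FileReader("input.txt"));
            String line;
            long count = 0;
            while ((line = in.readLine()) != null) {
                count++;
            }
            in.close();
            System.out.println("rank " + rank + " read " + count + " lines");
            MPI.Finalize();
        }
    }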
Hi,
I updated our OpenMPI install from 1.8.3 to 1.8.8 today and I'm getting
this error:
XRC error: bad XRC API (require XRC from OFED pre 3.12).
This happens even using the exact same node to compile and run an
example program. I saw a thread from a few weeks ago discussing this
issue as well. I
Thanks, Nate. We will give the test a try.
--
sent from my smart phone so no good typing.
Howard
On Aug 5, 2015 2:42 PM, "Nate Chambers" wrote:
> Howard,
>
> Thanks for looking at all this. Adding System.gc() did not cause it to
> segfault. The segfault still comes much later in the processing.
Actually, we're still having problems submitting OpenMPI 1.8.7 jobs
to the cluster through SGE (which we need to do in order to track usage
stats on the cluster). I suppose I'll make a PE with the appropriate settings
and see if that makes a difference.
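For what it's worth, once the PE exists the submission side would presumably
be along these lines (PE name, slot count and script names are placeholders):

    # add the parallel environment and attach it to a queue (as the SGE admin)
    qconf -ap orte
    qconf -mq all.q        # add "orte" to the queue's pe_list

    # submit an MPI job requesting 16 slots under that PE
    qsub -pe orte 16 run_mpi.sh

    # inside run_mpi.sh; Open MPI picks up the SGE allocation automatically
    mpirun ./my_mpi_app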
-Bill L
Well that stinks! Let me know what's going on and I'll take a look. FWIW,
the best method is generally to configure OMPI with --enable-debug, and
then run with "--leave-session-attached --mca plm_base_verbose 5". That
will tell us what the launcher thinks it is doing and what the daemons
think is happening.
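Concretely that would look something like this (the prefix, process count and
test program are just placeholders):

    # rebuild Open MPI with debugging enabled
    ./configure --prefix=/opt/openmpi-1.8.7-dbg --enable-debug
    make -j4 install

    # then launch with the extra verbosity
    mpirun --leave-session-attached --mca plm_base_verbose 5 -np 4 ./hello_c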
Yeah, I recall your earlier email on the subject. Sadly, I need someone
from Mellanox to look at this as I don't have access to such equipment.
Josh? Mike? Gilles? Can someone please look at this?
On Wed, Aug 5, 2015 at 2:31 PM, Andy Wettstein wrote:
> Hi,
>
> I updated our OpenMPI install from 1.8.3 to 1.8.8 today and I'm getting
> this error:
> XRC error: bad XRC API (require XRC from OFED pre 3.12).