Specifically, it means that Open MPI could not find the "orted" executable on
some nodes ("orted" is the Open MPI helper daemon). Hence, your Open MPI
install is either not in your PATH / LD_LIBRARY_PATH on those nodes, or, as
Gilles mentioned, Open MPI is not installed on those nodes.
Check o
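A quick way to check this from the head node (node01 and the install path are placeholders for your own hosts and prefix):

  # Does the non-interactive shell on a remote node see the Open MPI install?
  ssh node01 which orted
  ssh node01 'echo PATH=$PATH; echo LD_LIBRARY_PATH=$LD_LIBRARY_PATH'
  # Or confirm the files are actually there:
  ssh node01 ls /opt/openmpi-1.8.6/bin/orted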
Wow - that is one sick puppy! I see that some nodes are reporting not-bound
for their procs, and the rest are binding to socket (as they should). Some
of your nodes clearly do not have hyper threads enabled (or only have
single-thread cores on them), and have 2 cores/socket. Other nodes have 8
core
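One way to see what each node is actually reporting (host names and slot counts below are examples) is to run a trivial program with --report-bindings:

  # Prints the binding map for every rank; compare output across node types
  mpirun --host node01,node02 -np 4 --report-bindings --bind-to socket hostname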
Gilles,
The nodes do not all have the same configuration. There are probably six different
hardware configurations (differing in memory, number of populated sockets, and
CPU type).
Some of the systems are older dual-core Xeons (5160 and L5240 CPUs) installed
in a blade chassis (some
of these blades hav
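One way to inventory those differences, assuming a hostfile that lists the nodes, is a quick loop over lscpu:

  # Summarize sockets, cores per socket, and threads per core on every node
  for h in $(awk '{print $1}' hostfile); do
      echo "== $h =="
      ssh "$h" lscpu | egrep -i 'socket|core|thread'
  done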
Ralph,
There is something funny going on: the traces from the
runs w/the debug build aren't showing any differences from
what I got earlier. However, I did do a run w/the --bind-to core
switch and was surprised to see that hyperthreading cores were
sometimes being used.
Here are the traces that I ha
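For reference, the binding behavior can be checked without the application in the loop (the rank count is an example). Note that --bind-to core binds a rank to all hardware threads of its core, so on hyperthreaded nodes a rank may legitimately run on either sibling thread:

  # Show exactly which hwthreads each rank is bound to
  mpirun -np 16 --bind-to core --report-bindings hostname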
Jeff,
it sounds like Open MPI is not available on some nodes!
Another possibility is that it is installed, but in another directory,
or maybe it is not in your PATH and you did not configure with
--enable-mpirun-prefix-by-default.
Cheers,
Gilles
On Wednesday, June 24, 2015, Jeff Layton wrote:
> Go
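For the record, the configure option Gilles mentions is set at build time (the prefix below is just an example), or the same effect can be had per-run with --prefix:

  ./configure --prefix=/opt/openmpi-1.8.6 --enable-mpirun-prefix-by-default
  make -j8 && make install
  # Equivalent at launch time, without rebuilding (executable name is an example):
  mpirun --prefix /opt/openmpi-1.8.6 -np 128 ./ft.D.128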
You shouldn't need any special flags for mpicc or mpirun to replicate the
problem. This will just let us see the line numbers associated with the
crash so we can narrow down the problem. Once we get that, we may need to
rerun with specific params to narrow it down further.
BTW: when you get the ba
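Once the crash reproduces with the debug build, the line numbers come from an ordinary backtrace (core file and executable names below are examples):

  ulimit -c unlimited                        # allow core dumps before rerunning
  mpirun -np 128 ./ft.D.128                  # reproduce the crash
  gdb -batch -ex bt ./ft.D.128 core.12345    # print the backtrace with file:line info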
Ralph,
I've had Open MPI 1.8.6 installed on our cluster w/the --enable-debug
option. Here's what I think are the relevant flags returned from ompi_info:
Open MPI 1.8.6 build info
Fort MPI_SIZEOF: no
C profiling: yes
C++ profiling: yes
Fort mpif.h profiling: yes
Fort use mpi profiling: y
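Those lines can be pulled straight out of ompi_info, e.g.:

  ompi_info | egrep -i 'debug|profiling|MPI_SIZEOF'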
Good afternoon sports fans!
I'm trying to run the ft code of NPB, class D, on 128 processors. I built the
code with gfortran 4.4.7 (CentOS 6 platform) and Open MPI 1.8.1. I'm using
openlava as the resource manager. The error output is the following:
[ec2-user@ip-172-31-42-106 bin]$ more runit
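The contents of runit are cut off above. For anyone trying to reproduce the setup, a generic openlava submission script for a 128-rank run might look roughly like this (queue name, output file, and binary path are assumptions, not the actual script):

  #!/bin/bash
  #BSUB -n 128                 # request 128 slots
  #BSUB -o ft.D.128.%J.out     # job output file
  #BSUB -q normal              # queue name is an example
  # If Open MPI was built with LSF support it reads the allocation directly;
  # otherwise build a hostfile from $LSB_HOSTS and pass it to mpirun.
  mpirun -np 128 ./bin/ft.D.128

Submitted with "bsub < runit".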