Thank you for the help so far. Here is the information that the
debugging gives me. Looks like the daemon on the non-local node
never makes contact. If I step -np back by two, though, it does.
Dan
[root@compute-2-1 etc]# /home/apps/openmpi-1.6.3/bin/mpirun -host
compute-2-0,compute-2-1 -v -
Sorry - I forgot that you built from a tarball, and so debug isn't enabled by
default. You need to configure --enable-debug.
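For reference, a rebuild from the tarball with debug enabled looks roughly like the
following; the prefix simply mirrors the install path seen elsewhere in this thread,
so adjust it to your own layout:

  ./configure --prefix=/home/apps/openmpi-1.6.3 --enable-debug
  make all install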
On Dec 14, 2012, at 1:52 PM, Daniel Davidson wrote:
> Oddly enough, adding this debugging info lowered the number of processes
> that can be used down to 42 from 46. W
Oddly enough, adding this debugging info lowered the number of
processes that can be used down to 42 from 46. When I run the MPI job, it
fails, giving only the information that follows:
[root@compute-2-1 ssh]# /home/apps/openmpi-1.6.3/bin/mpirun -host
compute-2-0,compute-2-1 -v -np 44 --leave-se
Folks,
I'm trying to track down an instance of Open MPI writing to a freed block of
memory.
This occurs with the most recent release (1.6.3) as well as 1.6, on a 64-bit
Intel architecture running Fedora 14.
It occurs with a very simple reduction (allreduce minimum) over a single int
value.
Has anyone
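As a point of reference, the kind of reduction being described (an allreduce of a
single int with MPI_MIN) can be written as a minimal sketch like the one below; the
per-rank values and names are illustrative only, not the poster's actual code:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, local, global;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    local = rank + 1;   /* arbitrary per-rank value */

    /* reduce a single int with MPI_MIN and give every rank the result */
    MPI_Allreduce(&local, &global, 1, MPI_INT, MPI_MIN, MPI_COMM_WORLD);

    printf("rank %d: minimum is %d\n", rank, global);

    MPI_Finalize();
    return 0;
}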
It wouldn't be ssh - in both cases, only one ssh is being done to each node (to
start the local daemon). The only difference is the number of fork/exec's being
done on each node, and the number of file descriptors being opened to support
those fork/exec's.
It certainly looks like your limits ar
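(Not in the original message, but for reference: the limits that usually matter for
a daemon forking many local processes can be checked on each node with the shell
builtins below.)

  ulimit -n    # maximum open file descriptors
  ulimit -u    # maximum user processes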
I have had to cobble together two machines in our Rocks cluster without
using the standard installation; they have EFI-only BIOS on them and
Rocks doesn't like that, so this is the only workaround.
Everything works great now, except for one thing. MPI jobs (Open MPI or
MPICH) fail when started fr
Add -mca plm_base_verbose 5 --leave-session-attached to the cmd line - that
will show the ssh command being used to start each orted.
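Appended to an otherwise unchanged launch, that looks roughly like the line below;
the process count and program name are placeholders, not taken from the original
report:

  mpirun -np 4 -mca plm_base_verbose 5 --leave-session-attached ./your_app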
On Dec 14, 2012, at 12:17 PM, "Blosch, Edwin L" wrote:
> I am having a weird problem launching cases with OpenMPI 1.4.3. It is most
> likely a problem with a p
I am having a weird problem launching cases with OpenMPI 1.4.3. It is most
likely a problem with a particular node of our cluster, as the jobs will run
fine on some submissions but not others. It seems to depend on the
node list. I am just having trouble diagnosing which node, and
Hi Siegmar
On Dec 14, 2012, at 5:54 AM, Siegmar Gross wrote:
> Hi,
>
> some weeks ago (mostly in early October) I reported several problems,
> and I would be grateful if you could tell me whether, and roughly when,
> somebody will try to solve them.
>
> 1) I don't get the expected results,
Disturbing, but I don't know if/when someone will address it. The problem
really is that few, if any, of the developers have access to hetero systems. So
developing and testing hetero support is difficult, if not impossible.
I'll file a ticket about it and direct it to the attention of the person who
Hi,
some weeks ago (mostly in early October) I reported several problems,
and I would be grateful if you could tell me whether, and roughly when,
somebody will try to solve them.
1) I don't get the expected results when I try to send or scatter
the columns of a matrix in Java. The received
Hi,
some weeks ago I reported a problem with my matrix multiplication
program in a heterogeneous environment (little-endian and big-endian
machines). The problem occurs in openmpi-1.6.x, openmpi-1.7, and
openmpi-1.9. I have now implemented a small program which only scatters
the columns of an integer m
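For readers following the thread, the column-scatter pattern can be sketched in C
with a derived datatype as below. Siegmar's program is in Java; this C version is
only an illustration of the same idea, and the matrix size, fill values, and the
assumption of one column per rank are mine, not his:

#include <mpi.h>
#include <stdio.h>

#define N 4   /* matrix dimension; run with exactly N ranks */

int main(int argc, char **argv)
{
    int rank, size, i;
    int matrix[N][N];        /* significant only at the root */
    int column[N];
    MPI_Datatype col, coltype;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (size != N) {
        if (rank == 0) fprintf(stderr, "run with exactly %d ranks\n", N);
        MPI_Finalize();
        return 1;
    }

    if (rank == 0) {
        int *p = &matrix[0][0];
        for (i = 0; i < N * N; i++)
            p[i] = i;        /* fill 0..N*N-1 in row-major order */
    }

    /* one column = N ints spaced N ints apart in row-major storage */
    MPI_Type_vector(N, 1, N, MPI_INT, &col);
    /* resize so consecutive columns start sizeof(int) bytes apart */
    MPI_Type_create_resized(col, 0, sizeof(int), &coltype);
    MPI_Type_commit(&coltype);

    /* root sends one column to each rank; each rank receives N contiguous ints */
    MPI_Scatter(matrix, 1, coltype, column, N, MPI_INT, 0, MPI_COMM_WORLD);

    printf("rank %d got column:", rank);
    for (i = 0; i < N; i++)
        printf(" %d", column[i]);
    printf("\n");

    MPI_Type_free(&coltype);
    MPI_Type_free(&col);
    MPI_Finalize();
    return 0;
}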