Topics:
>
> 1. Re: RE : MPI hangs on multiple nodes (Gus Correa)
> 2. Typo in MPI_Cart_coords man page (Jeremiah Willcock)
> 3. Re: RE : MPI hangs on multiple nodes (Gus Correa)
> 4. How could OpenMPI (or MVAPICH) affect floating-point results?
> (Blosch, Edwin L)
MPI.
Cheers
Ole Nielsen
--
Here's the output which shows the freeze in the third iteration:
nielso@alamba:~/sandpit/pypar/source$ mpirun --hostfile /etc/mpihosts --host node5,node6 --npernode 2 a.out
Number of processes = 4
Test repeated 3 times for reliability
I am process 2 on nod
Thanks for your suggestion Gus, we need a way of debugging what is going on.
I am pretty sure the problem lies with our cluster configuration. I know MPI
simply relies on the underlying network. However, we can ping and ssh to all
nodes (and between any pair of them as well), so it is currently a mystery.
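One thing that may be worth ruling out: ping and ssh only exercise ICMP and the ssh port, whereas an MPI implementation running over TCP typically opens further connections on dynamically chosen ports. Below is a minimal sketch of a reachability check, written in Python. The helper name `can_connect` and any port number you pass to it are hypothetical, chosen for illustration only, and are not part of the test program. It only reports whether a plain TCP connection can be opened, so it cannot by itself confirm or exclude the cause of the hang.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) opens within `timeout` seconds.

    Hypothetical helper for illustration; it is not part of any MPI library.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


if __name__ == "__main__":
    # Loopback self-check: listen on an ephemeral port and connect to it.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
        server.listen(1)
        port = server.getsockname()[1]
        print("loopback reachable:", can_connect("127.0.0.1", port))
```

To use it between two nodes, start a listener on one node on a port of your choosing and call `can_connect` with that node's name and port from the other node. A result of False for ports other than the ssh port would point towards firewall or routing rules rather than MPI itself.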
The test program is available here:
http://code.google.com/p/pypar/source/browse/source/mpi_test.c
Hopefully, someone can help us troubleshoot why communications stop when
multiple nodes are involved and CPU usage goes to 100% for as long as we
leave the program running.
Many thanks
Ole Nielsen
Has anyone else seen this behavior, or can anyone give me a hint on how to
troubleshoot it?
Cheers and thanks
Ole Nielsen
Test output across two nodes (This one hangs)
--
nielso@alamba:~/sandpit/pypar/source$ mpirun --hostfile /etc/mpihosts --host node17,node18 --npernode 2 a.out
Number of processes = 4
Test repeated 3 times for reliability
I am