So, an update on how this all turned out.
Basically, everything is working correctly. The problem is the poor Ethernet
performance of the Pi (and my rather cheap 10 Mbit switch).
The main resolver application I use requires talking to all the other nodes
pretty much continuously. The Pi Eth
Can you be more specific about how you measure time?
Is this wall time (i.e. does it include mpirun, MPI_Init and MPI_Finalize)?
Or is it the elapsed time between MPI_Init() and MPI_Finalize()?
Assuming the extra time is spent in MPI, do you know in which subroutine it
is spent?
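To make the distinction above concrete, here is a minimal sketch of per-phase wall-clock timing. The names wall_seconds, phase_timer, phase_begin and phase_end are hypothetical helpers, not part of Open MPI. It assumes a POSIX system and uses clock_gettime() rather than MPI_Wtime(), because MPI_Wtime() is generally only usable once MPI_Init() has returned, so it cannot time MPI_Init() itself.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Wall-clock seconds from a monotonic POSIX clock (assumption: POSIX host).
   Inside the MPI part of a program, MPI_Wtime() could play the same role. */
static double wall_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Hypothetical accumulator for one phase (init, compute, finalize, ...). */
struct phase_timer {
    double start;  /* timestamp of the most recent phase_begin() */
    double total;  /* seconds accumulated over all begin/end pairs */
};

/* Mark the start of a phase. */
static void phase_begin(struct phase_timer *p)
{
    assert(p != NULL);
    p->start = wall_seconds();
}

/* Mark the end of a phase and add the elapsed time to the running total. */
static void phase_end(struct phase_timer *p)
{
    assert(p != NULL);
    p->total += wall_seconds() - p->start;
}

/* Intended placement in an MPI program (shown as a comment only):
 *
 *   struct phase_timer init = {0, 0}, work = {0, 0}, fini = {0, 0};
 *   phase_begin(&init); MPI_Init(&argc, &argv); phase_end(&init);
 *   phase_begin(&work); ...application code...  phase_end(&work);
 *   phase_begin(&fini); MPI_Finalize();         phase_end(&fini);
 */
```

Printing the three totals per rank for both builds would show whether the gap sits in startup, in the application phase, or in shutdown. Time spent in mpirun itself is outside the process and would still need an external measurement such as the shell's time command.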
OpenMPI 1.4.1
I compiled the same program with 1.4.1 and 1.10.2rc3 and then ran both builds
in the same environment. The 1.4.1 build is 8.89% faster than the 1.10.2rc3
build. Is there any official performance report for each version upgrade?
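A percentage like this depends on which run is taken as the baseline, so it helps to state that along with the raw times. The sketch below shows both readings. percent_gap and report_gap are hypothetical helpers, and the times used in the usage note are made-up placeholders, not measurements from this thread.

```c
#include <assert.h>
#include <stdio.h>

/* Gap between two run times as a percentage of `baseline`.
   t_fast and t_slow are elapsed times in the same unit. */
static double percent_gap(double t_fast, double t_slow, double baseline)
{
    assert(baseline > 0.0);
    return (t_slow - t_fast) / baseline * 100.0;
}

/* Print the gap under both common conventions. */
static void report_gap(double t_fast, double t_slow)
{
    printf("relative to the faster run: %.2f%%\n",
           percent_gap(t_fast, t_slow, t_fast));
    printf("relative to the slower run: %.2f%%\n",
           percent_gap(t_fast, t_slow, t_slow));
}
```

For example, with placeholder times of 100 s and 110 s, report_gap(100.0, 110.0) reports 10.00% relative to the faster run and 9.09% relative to the slower one.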