I'm hoping this is just user error...
I'm running a single-node job on a node that has two dual-core Opterons
(Open MPI 1.0.2).
compiler=gcc 4.1.0
arch=x86_64 (64-bit)
OS=Linux 2.6.16
****
My machinefile looked like this:
node1 slots=4
I have an HPL configuration for 4 processors (PxQ=2x2).
I started it with 'mpirun -np 4 -machinefile foo ./xhpl',
and the problem took 15 seconds to complete.
I then changed the machinefile to read:
node1 slots=2
-or, simply-
node1
It doesn't matter which of these two machinefiles I use; I still execute it with:
'mpirun -np 4 -machinefile foo ./xhpl'
Except now the problem takes only 0.1 seconds to complete.
It's perfectly repeatable...
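For reference, here is a minimal sketch of the two runs as a shell script. The file names foo_slots4 and foo_slots2 and the temporary directory are placeholders of mine (the runs above reuse a single file named foo), and it assumes xhpl and its HPL.dat sit in the current directory:

```shell
#!/bin/sh
# Sketch of the two runs described above.
# foo_slots4 / foo_slots2 are hypothetical names; the post uses one file, foo.
workdir=$(mktemp -d)

# Machinefile for the slow case: four slots on the single node.
printf 'node1 slots=4\n' > "$workdir/foo_slots4"
# Machinefile for the fast case: two slots on the same node.
printf 'node1 slots=2\n' > "$workdir/foo_slots2"

for mf in foo_slots4 foo_slots2; do
    echo "== $mf: $(cat "$workdir/$mf")"
    # Only attempt the run if mpirun and an xhpl binary are actually present.
    if command -v mpirun >/dev/null 2>&1 && [ -x ./xhpl ]; then
        # Same command line for both cases; only the machinefile differs.
        mpirun -np 4 -machinefile "$workdir/$mf" ./xhpl
    else
        echo "skipping run: mpirun or ./xhpl not found"
    fi
done
```

HPL reports its own run time in its output, so comparing the two outputs is enough to see the difference.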
Is there something about the machinefile format I'm not aware of (with
respect to dual-core CPUs)? IIRC, slots=(number of processes to run per
node), so two dual-cores should be slots=4. Except 'slots=4' makes it run
roughly two orders of magnitude slower (15 seconds vs. 0.1 seconds).
Thoughts?
--
Troy Telford