On Dec 12, 2008, at 3:46 PM, douglas.gupt...@dal.ca wrote:
This is probably because most users running MPI jobs tend to devote
the entire core/CPU/server to the MPI job and don't try to run other
jobs concurrently on the same resources.
Our situation is different. While our number-cruncher application is
running, we would like to be able to do some editing, compiling, and
post-processing.
I once ran three jobs (six processes) on our 4-CPU system, and was
unable to ssh into the machine. Or maybe I did not wait long
enough...
I can believe it. OMPI is very aggressive about using CPU cycles when
not yielding: it spins in very tight loops polling for progress,
sometimes without even invoking system calls, so there are few
"natural" opportunities for the OS to schedule a different process.
If you had used yield_when_idle (the mpi_yield_when_idle MCA
parameter), you should have been able to ssh into the machine.
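To make the spinning-vs-yielding distinction concrete, here is a small stand-alone sketch. It uses plain Python threads, not Open MPI internals, and the function and variable names are invented for illustration: a tight polling loop never blocks in the kernel, while adding a yield between polls at least gives the scheduler a natural point to run something else.

```python
import threading
import time

def wait_for(flag, yield_when_idle, result):
    """Poll `flag` until it is set; optionally yield the CPU between polls.

    The busy-wait variant is analogous in spirit to OMPI's default
    progress loop; the yield variant hints the scheduler between polls.
    """
    polls = 0
    while not flag.is_set():
        polls += 1
        if yield_when_idle:
            time.sleep(0)  # cooperative yield; os.sched_yield() also works on POSIX

    result["polls"] = polls

flag = threading.Event()
result = {}
waiter = threading.Thread(target=wait_for, args=(flag, True, result))
waiter.start()
time.sleep(0.01)  # let the waiter poll for a moment
flag.set()        # the "message" arrives; the waiter stops polling
waiter.join()
print("waiter polled", result["polls"], "times before the flag was set")
```

Even with the yield, the waiter still consumes cycles whenever it is scheduled; a truly blocking wait (which the thread above does not do) is what would free the CPU entirely.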
The number-cruncher has two processes, and each needs intermediate
results from the other, inside a

  do i = 1, 30000
     ...
  enddo
As I mentioned earlier, most of the time, only one process is
executing, and the other is waiting for results. My guess is that,
with the blocking feature you describe, I could double the number of
number-cruncher jobs running at one time, thus doubling throughput.
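The dependency Douglas describes can be sketched without MPI at all. The following is a thread-and-queue analogue (not his actual Fortran code): each worker computes a partial result, sends it to its peer, and cannot start the next iteration until the peer's result arrives, so at any instant roughly one of the two is doing useful work.

```python
import threading
import queue

def worker(my_value, inbox, peer_inbox, iterations, results):
    """Each iteration: compute, send the intermediate result to the peer,
    then block until the peer's result arrives -- the analogue of a
    blocking exchange inside the do i=1,30000 loop."""
    for _ in range(iterations):
        my_value += 1                # stand-in for the real number-crunching
        peer_inbox.put(my_value)     # send our intermediate result
        my_value += inbox.get()      # block until the peer's result arrives
    results.append(my_value)

q_a, q_b = queue.Queue(), queue.Queue()
results = []
t1 = threading.Thread(target=worker, args=(0, q_a, q_b, 3, results))
t2 = threading.Thread(target=worker, args=(100, q_b, q_a, 3, results))
t1.start(); t2.start(); t1.join(); t2.join()
print(sorted(results))  # → [414, 414]
```

Because each `inbox.get()` here truly blocks, the waiting worker consumes no CPU; that is exactly the behavior a spinning MPI progress loop lacks by default.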
Possibly. As Eugene noted, MPI is not the only issue at play here:
you only have so many CPU cores and so much memory bandwidth.
You might want to do a few back-of-the-envelope calculations and/or
non-MPI experiments to figure out what your actual speedup will be.
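As one such back-of-the-envelope calculation (the utilization figures below are guesses based on Douglas's description, not measurements from his machine): if every process busy-waits, each pins a full core, so a 4-core box fits two 2-process jobs; if only about one process per job computes at a time and the other truly blocks, roughly four such jobs fit.

```python
# Back-of-the-envelope throughput estimate; all numbers are
# illustrative assumptions, not measurements.
cores = 4
procs_per_job = 2

# Busy-waiting: every process spins, so each occupies a full core.
jobs_busy_wait = cores // procs_per_job                # 2 jobs

# Blocking/yielding: ~1 of the 2 processes computes at a time,
# so each job needs roughly one core's worth of CPU.
effective_cores_per_job = 1.0
jobs_blocking = int(cores // effective_cores_per_job)  # 4 jobs

speedup = jobs_blocking / jobs_busy_wait
print(f"estimated throughput gain: {speedup:.1f}x")
```

In practice, the memory-bandwidth and core contention mentioned above would pull the real number below that estimate, which is why the non-MPI experiment is worth running.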
--
Jeff Squyres
Cisco Systems