I very much doubt that either of those mappers has ever been tested against
comm_spawn. Just glancing through them, I don't see an immediate reason why
loadbalance wouldn't work, but the error indicates that the system wound up
mapping one or more processes to an unknown node.
We are revising the
I'm trying to use MPI_Comm_spawn_multiple and it doesn't seem to always work
like I'd expect.
The simple test code I have starts a couple of master processes and then tries
to spawn a couple of worker processes on each of the nodes running the master
processes.
I was using 1.5.1, but gave 1.5.2r
It's because you're waiting on the receive request to complete before the send
request. This likely works locally because the message transfer is through
shared memory and is fast, but it's still an inherently unsafe way to block
waiting for completion (i.e., the receive might not complete if t
Random thought: is there a check to ensure that the SL MCA param is not set in
a RoCE environment? If not, we should probably add a show_help warning if the
SL MCA param is set when using RoCE (i.e., that its value will be ignored).
On Feb 19, 2011, at 12:22 AM, Shamis, Pavel wrote:
> As far
There is no restriction on using the C/R functionality in Open MPI in a TM
environment (that I am aware of), if you use the ompi-checkpoint/ompi-restart
commands directly.
If you want TM to checkpoint/restart Open MPI processes for you as part of the
resource management role, then there is a bit
On Feb 21, 2011, at 12:50 AM, DOHERTY, Greg wrote:
> blcr needs cr_mpirun to start the job without torque support to be able
> to checkpoint the mpi job correctly.
Josh --
Do we have a restriction on BLCR support when used with TM?
--
Jeff Squyres
jsquy...@cisco.com
Simplest solution: add -bynode to your mpirun command line
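As a rough sketch of what that looks like (the process count and program name
here are placeholders, not taken from the original report):

```shell
# Hypothetical invocation: "-np 4" and "./my_mpi_app" are placeholders.
# -bynode asks mpirun to place ranks round-robin across the allocated nodes
# instead of filling all slots on one node before moving to the next.
mpirun -bynode -np 4 ./my_mpi_app
```

This assumes a 1.5-series mpirun, where -bynode is an accepted mapping option;
check mpirun --help on your install if in doubt.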
On Feb 20, 2011, at 10:50 PM, DOHERTY, Greg wrote:
> In order to be able to checkpoint openmpi jobs with blcr, we have
> configured openmpi as follows
>
> ./configure --prefix=/data1/packages/openmpi/1.5.1-blcr-without-tm
> --disable-openib-co
In order to be able to checkpoint openmpi jobs with blcr, we have
configured openmpi as follows
./configure --prefix=/data1/packages/openmpi/1.5.1-blcr-without-tm
--disable-openib-connectx-xrc --disable-openib-rdmacm --with-ft=cr
--enable-mpi-threads --enable-ft-thread --with-blcr=/usr
--with-blc