Hi Josh,
Thanks for helping. That solved the problem!!!
cheers,
Jonathan
Josh Hursey wrote:
So I tried to reproduce this problem today, and everything worked fine
for me using the trunk. I haven't tested v1.3/v1.4 yet.
I tried checkpointing with one hostfile then restarting with each of
the following:
- No hostfile
- a hostfile with completely different machines
- a hostfile wit
I did the same test using 1.3.4 and still see the same issue. I also
tried to use the tm interface instead of specifying the hostfile, with
the same result.
thanks,
Jonathan
Hi Josh,
In case it helps, I am running 1.3.3, compiled as follows:
../configure --enable-ft-thread --with-ft=cr --enable-mpi-threads
--with-blcr=... --with-blcr-libdir=... --disable-openib-rdmacm --prefix=
I ran my application like this:
mpirun -am ft-enable-cr --hostfile host -np 2 ./a.out
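For readers following the thread, the checkpoint/restart sequence under discussion can be sketched roughly as below. This is only a dry run that prints the commands rather than executing them. The function name print_cr_commands, the PID 12345, and the hostfile name host2 are hypothetical placeholders, and the snapshot name ompi_global_snapshot_<PID>.ckpt is assumed to be the default; none of these values come from the thread itself.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands for one checkpoint/restart cycle.
# Arguments (both hypothetical placeholders):
#   $1  PID of the running mpirun process
#   $2  hostfile to use on restart
print_cr_commands() {
    mpirun_pid=$1
    new_hostfile=$2

    # 1. Launch with checkpoint/restart enabled (command as given in the thread).
    printf '%s\n' "mpirun -am ft-enable-cr --hostfile host -np 2 ./a.out"

    # 2. Checkpoint the job by passing the PID of mpirun.
    printf '%s\n' "ompi-checkpoint ${mpirun_pid}"

    # 3. Restart from the global snapshot with a different hostfile.
    #    The snapshot name below is the assumed default, not confirmed here.
    printf '%s\n' "ompi-restart --hostfile ${new_hostfile} ompi_global_snapshot_${mpirun_pid}.ckpt"
}

print_cr_commands 12345 host2
```

Printing the commands keeps the sketch runnable on a machine without Open MPI or BLCR installed; on a real cluster the three lines would be run by hand, in order.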
Though I do not test this scenario (using hostfiles) very often, it
used to work. The ompi-restart command takes a --hostfile (or --
machinefile) argument that is passed directly to the mpirun command. I
wonder if something broke recently with this handoff. I can certainly
checkpoint with on
Hi,
I am trying to use BLCR checkpointing with Open MPI. I am currently able to run
my application using some hostfile, checkpoint the run, and then restart
the application using the same hostfile. What I would like to do is
restart the application with a different hostfile, but this leads to