[OMPI users] Fault tolerant ompi - Error: Unable to find a list of active MPIRUN processes on this machine.

2011-03-30 Thread Hellmüller Roman
Hi, I'm trying to get fault-tolerant ompi running on our cluster for my semester thesis. On the login node I was successful; checkpointing works. Since the compute nodes have different kernels, I had to compile blcr on the compute nodes again. blcr on the compute nodes works. After that I instal

Re: [OMPI users] Fault tolerant ompi - Error: Unable to find a list of active MPIRUN processes on this machine.

2011-03-31 Thread Hellmüller Roman
roman From: users-boun...@open-mpi.org [users-boun...@open-mpi.org] on behalf of Hellmüller Roman [hro...@student.ethz.ch] Sent: Wednesday, 30 March 2011 16:33 To: us...@open-mpi.org Subject: [OMPI users] Fault tolerant ompi - Error: Unable to find a list of active M

Re: [OMPI users] Fault tolerant ompi - Error: Unable to find a list of active MPIRUN processes on this machine.

2011-03-31 Thread Hellmüller Roman
Solved, though I don't know exactly how. I just kept working on it and set some other parameters/directories. Cheers, roman From: users-boun...@open-mpi.org [users-boun...@open-mpi.org] on behalf of Hellmüller Roman [hro...@student.ethz.ch] Sent: Donne

[OMPI users] openmpi self checkpointing - error while running example

2011-04-06 Thread Hellmüller Roman
Hi, I'm trying to get fault-tolerant ompi running on our cluster for my semester thesis. Build and compile were successful, and blcr checkpointing works (openmpi 1.5.3, blcr 0.8.2). Now I'm trying to set up the SELF checkpointing. The example from http://osl.iu.edu/research/ft/ompi-cr/examples.php does

Re: [OMPI users] openmpi self checkpointing - error while running example

2011-04-06 Thread Hellmüller Roman
] Sent: Wednesday, 6 April 2011 13:20 To: Open MPI Users Subject: Re: [OMPI users] openmpi self checkpointing - error while running example Hi Roman, Did you try to checkpoint and restart with the parameter "-machinefile"? It may work. Regards, Nguyen Toan On Wed, Apr 6, 2011 at 7:05

Re: [OMPI users] openmpi self checkpointing - error while running example

2011-04-06 Thread Hellmüller Roman
e MACHINES_FILE". Hope it works. On Wed, Apr 6, 2011 at 9:13 PM, Hellmüller Roman <hro...@student.ethz.ch> wrote: Hi Toan, thanks for your suggestion. It gives me the following result, which does not tell me anything more. hroman@cbl1 ~/checkpoints $ ompi-restart -v -machin