[OMPI users] How to restart a job twice

2008-04-18 Thread Tamer
Dear all, I installed the developer's version r14519 and was able to get it running. I successfully checkpointed a parallel job and restarted it. My question is how can I checkpoint the restarted job? The problem is once the original job is terminated and restarted later on, the mpirun does

Re: [OMPI users] How to restart a job twice

2008-04-18 Thread Tamer
n completely and I would have to go to r18208? Thank you in advance for your help. Tamer On Apr 18, 2008, at 6:03 AM, Josh Hursey wrote: When you use 'ompi-restart' to restart a job it fork/execs the completely new job using the restarted processes for the ranks. However instead o

Re: [OMPI users] How to restart a job twice

2008-04-18 Thread Tamer
ng without calling "finalize". This may have caused other processes in the application to be terminated by signals sent by mpirun (as reported here). Thank you in advance for your help. Tamer On Apr 18, 2008, at 7:07 AM, Josh Hursey wrote: This problem has come up in the past and

Re: [OMPI users] How to restart a job twice

2008-04-24 Thread Tamer
checkpoints and restarts as many times as I want to without any problems. This means that the issue above must be platform dependent and I must be missing some option in building the code. Cheers, Tamer On Apr 22, 2008, at 5:52 PM, Josh Hursey wrote: Tamer, This should now be fixed in

Re: [OMPI users] openmpi-1.3a1r18241 ompi-restart issue

2008-05-13 Thread Tamer
't give me an error message. Has this problem been reported before? All the required executables and libraries are in my path. Thanks, Tamer On Apr 29, 2008, at 1:37 PM, Sharon Brunett wrote: Thanks, I'll try the version you recommend below! Josh Hursey wrote: Your previous emai