Re: [OMPI users] Checkpointing hangs with OpenMPI-1.3.1

2009-04-28 Thread Josh Hursey
I still have not been able to reproduce the hang, but I'm still looking into it. I did commit a fix for the datatype copy error that I mentioned (r21080 i

Re: [OMPI users] Checkpointing hangs with OpenMPI-1.3.1

2009-04-28 Thread neeraj
Josh Hursey wrote (04/28/2009 12:34 AM): I still have not been able to reproduce the

Re: [OMPI users] Checkpointing hangs with OpenMPI-1.3.1

2009-04-27 Thread Josh Hursey
I still have not been able to reproduce the hang, but I'm still looking into it. I did commit a fix for the datatype copy error that I mentioned (r21080 in the Open MPI trunk, and it is in the pipeline for v1.3). Can you put in a print statement before MPI_Finalize, then try the program a

Re: [OMPI users] Checkpointing hangs with OpenMPI-1.3.1

2009-04-27 Thread Josh Hursey
Sorry for the long delay in responding. It is a bit odd that the hang does not occur when running on only one host. I suspect that is more due to timing than anything else. I am not able to reproduce the hang at the moment, but I do get an occasional datatype copy error which could be symptoma

[OMPI users] Checkpointing hangs with OpenMPI-1.3.1

2009-04-10 Thread neeraj
Dear All, I am trying to checkpoint a test application using openmpi-1.3.1, but it fails when multiple processes run on different nodes. Checkpointing works fine if the processes run on the same node as the mpirun process. But the moment I launch an MPI process from a different node,