I am trying to test orte-checkpoint with an MPI job. However, it hangs for all jobs. This is how the job is started:

mpirun -np 8 -mca ft-enable cr /apps/nwchem-5.1.1/bin/LINUX64/nwchem siosi6.nw

From another terminal I try orte-checkpoint:

ompi-checkpoint -v --term 9338
[compute-19-12.local:09377] orte_checkpoint: Checkpointing...
[compute-19-12.local:09377] PID 9338
[compute-19-12.local:09377] Connected to Mpirun [[5009,0],0]
[compute-19-12.local:09377] Terminating after checkpoint
[compute-19-12.local:09377] orte_checkpoint: notify_hnp: Contact Head Node Process PID 9338
[compute-19-12.local:09377] orte_checkpoint: notify_hnp: Requested a checkpoint of jobid [INVALID]
[compute-19-12.local:09377] orte_checkpoint: hnp_receiver: Receive a command message.
[compute-19-12.local:09377] orte_checkpoint: hnp_receiver: Status Update.

Is there any way to debug the issue and get more information or log messages?

Rangam