I'd also like to re-emphasize something Andreas said earlier: SIGTERM *usually* means that some external entity is killing your application. It *could* be coming from within the application itself, but that's not too common.

You might want to look into that to find out where the SIGTERM is coming from. The Microtar maintainers might have some better ideas.
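
One way to narrow this down, as a minimal sketch only: on POSIX systems a handler installed with SA_SIGINFO receives the PID and UID of the process that sent the signal. The function and variable names below are made up for illustration and are not part of Open MPI or Microtar. It also assumes the application can be rebuilt with the extra call and that nothing else replaces the SIGTERM handler afterwards.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Written by the handler, read later from normal program flow.
 * Hypothetical names; sig_atomic_t is assumed wide enough for a PID. */
static volatile sig_atomic_t g_got_sigterm = 0;
static volatile sig_atomic_t g_sender_pid = 0;
static volatile sig_atomic_t g_sender_uid = 0;

/* Handler: only records who sent the signal (no stdio in here,
 * since printf is not async-signal-safe). */
static void on_sigterm(int signo, siginfo_t *info, void *context)
{
    (void)signo;
    (void)context;
    g_sender_pid = (sig_atomic_t)info->si_pid;
    g_sender_uid = (sig_atomic_t)info->si_uid;
    g_got_sigterm = 1;
}

/* Install the handler; returns 0 on success, -1 on failure. */
static int install_sigterm_tracer(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_sigterm;
    sa.sa_flags = SA_SIGINFO;   /* ask for the siginfo_t argument */
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGTERM, &sa, NULL);
}

/* Call from normal code (e.g. before exiting) to print the sender. */
static void report_sigterm_sender(void)
{
    if (g_got_sigterm) {
        fprintf(stderr, "SIGTERM sent by pid %ld (uid %ld); my pid is %ld\n",
                (long)g_sender_pid, (long)g_sender_uid, (long)getpid());
    }
}
```

Call install_sigterm_tracer() early in main() and report_sigterm_sender() at a point where the program checks g_got_sigterm and shuts down. If the reported PID belongs to the MPI launcher rather than to some unrelated process, the signal is probably a consequence of another failure in the job, not its cause.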


On May 30, 2008, at 9:17 AM, Andreas Schäfer wrote:

On 12:28 Fri 30 May     , Lee Amy wrote:
2008/5/29 Andreas Schäfer <gent...@gmx.de>:
Thank you very much. A shorter job seems to run well. The job doesn't repeatedly fail at the same time, but when it fails, it fails with these error messages. Anyway, I'm not using a scheduling system. Any suggestions?

At least no easy ones, sorry. ;-) You could ask the Microtar guys if
they know anything about that problem. And of course you could use a
debugger to dig into Microtar and find the problem yourself. ^^ Open
MPI has some documentation on how to attach gdb to a parallel job
(and how to use valgrind, etc.):

http://www.open-mpi.org/faq/?category=debugging
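
A common way to get a debugger onto one rank of a parallel job is to make the process print its PID and wait until someone attaches. The following is only a sketch of that idea: the environment variable name WAIT_FOR_DEBUGGER and the function name are invented for this example and are not something Open MPI or Microtar provides.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: if the (made-up) environment variable
 * WAIT_FOR_DEBUGGER is set, print host and PID and spin until a
 * debugger sets `attached` to a non-zero value.
 * Returns 0 if it did not wait, 1 if it waited and was released. */
static int wait_for_debugger_if_requested(void)
{
    volatile int attached = 0;  /* volatile so the loop re-reads it */
    char host[256];

    if (getenv("WAIT_FOR_DEBUGGER") == NULL)
        return 0;

    if (gethostname(host, sizeof host) != 0)
        host[0] = '\0';
    host[sizeof host - 1] = '\0';   /* ensure termination if truncated */

    fprintf(stderr, "PID %ld on %s is waiting for a debugger to attach\n",
            (long)getpid(), host);
    fflush(stderr);

    while (!attached)
        sleep(5);   /* a debugger changes `attached` to leave the loop */

    return 1;
}
```

Call it near the start of main(), log in to the node that printed the message, run `gdb -p <pid>`, move up the stack to this function, then `set var attached = 1` and `continue`.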

Good luck!
-Andi


--
============================================
Andreas Schäfer
Cluster and Metacomputing Working Group
Friedrich-Schiller-Universität Jena, Germany
PGP/GPG key via keyserver
I'm a bright... http://www.the-brights.net
============================================

(\___/)
(+'.'+)
(")_(")
This is Bunny. Copy and paste Bunny into your
signature to help him gain world domination!


--
Jeff Squyres
Cisco Systems

