Hi Amy,

On 16:10 Thu 29 May, Lee Amy wrote:
> MicroTar parallel version was terminated after 463 minutes with following
> error messages:
> ================================================
> [gnode5:31982] [ 0] /lib64/tls/libpthread.so.0 [0x345460c430]
> [gnode5:31982] [ 1] microtar(LocateNuclei+0x137) [0x403037]
> [gnode5:31982] [ 2] microtar(main+0x4ac) [0x40431c]
> [gnode5:31982] [ 3] /lib64/tls/libc.so.6(__libc_start_main+0xdb) [0x3453b1c3fb]
> [gnode5:31982] [ 4] microtar [0x402e6a]
> [gnode5:31982] *** End of error message ***
> mpirun noticed that job rank 0 with PID 18710 on node gnode1 exited on
> signal 15 (Terminated).
> 19 additional processes aborted (not shown)
> ================================================

If I'm not mistaken, signal 15 is SIGTERM, which is sent to a process to ask
it to terminate. To me this sounds like your application was terminated by
an external instance, perhaps because your job exceeded the wall clock time
limit of your scheduling system. Does the job repeatedly fail after the same
runtime? Do shorter jobs finish successfully?

Just my 0.02 Euros (-8

Cheers
-Andreas

-- 
============================================
Andreas Schäfer
Cluster and Metacomputing Working Group
Friedrich-Schiller-Universität Jena, Germany
PGP/GPG key via keyserver
I'm a bright... http://www.the-brights.net
============================================

(\___/)
(+'.'+)
(")_(")
This is Bunny. Copy and paste Bunny into your
signature to help him gain world domination!
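
P.S. To double-check which signal a number in an mpirun message refers to, something like the following may help. It is only a minimal sketch, assuming Python 3 is available on the machine where the job ran (signal numbers can differ between platforms); the helper name signal_name is made up for this illustration and is not part of MicroTar or Open MPI.

```python
import signal


def signal_name(number: int) -> str:
    """Return the symbolic name of a signal number on this platform.

    signal_name is a hypothetical helper written for this example.
    """
    try:
        # signal.Signals is the standard-library enum of known signals.
        return signal.Signals(number).name
    except ValueError:
        # The number does not correspond to a signal on this platform.
        return f"unknown signal {number}"


if __name__ == "__main__":
    # 15 is the number reported by mpirun in the quoted log.
    print(signal_name(15))
```

If this prints SIGTERM, the process was asked to terminate from outside rather than crashing on its own, which would be consistent with a scheduler enforcing a time limit.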