Sorry if I didn't answer this: Have you checked to ensure that the job manager is not killing your job?
I am not quite sure what you mean by job manager, but this is my personal computer. Much to my surprise, I also have openSUSE on my laptop, followed the same procedure there, and the same message appeared!

Is there a local system administrator that you can talk to about this?

Not a very good one, but I asked someone who had seen this message in his own work, and this was his answer:

It means that the program corresponding to process identifier 2407 (the PID is in the second column of the "ps aux" output), running on one of your cluster's nodes (named linux-4pel), has stopped because it received the signal SIGTERM (termination signal 15). Sorry if this is a long explanation of things you already know :-).

Let's say that you have a program running on your system; you can find its process ID number nnnnn by running "ps aux". Now if you want to stop it, for example because it is out of control, a convenient way is to send a termination request to the process by issuing "kill -s SIGTERM nnnnn".

Here, Open MPI notified you that one of the spawned processes has been terminated because it received the SIGTERM signal and, as a consequence, it has stopped all the other distributed processes running on the nodes: once the PID 2407 process acknowledged SIGTERM, Open MPI sent SIGTERM to all the processes associated with your parallel run.

Now ... how to avoid this? I am afraid there is no easy answer. The 2407 process has probably received a SIGTERM from another application; I mean it has not died by accident (a hanging or faulting process exits without calling MPI_FINALIZE and produces a different error message). The difficulty is that you have to investigate which application issued the SIGTERM, that is, which application told your 2407 process to terminate.
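
The "kill -s SIGTERM nnnnn" step described above can be reproduced in isolation to see what a SIGTERM-terminated process looks like from the outside. This is only a minimal sketch, assuming a Linux or other POSIX system with Python 3; the function name terminate_child and the sleeping child process are made up for illustration and have nothing to do with the actual MPI job.

```python
import signal
import subprocess
import sys
import time


def terminate_child():
    """Start a long-running child, send it SIGTERM, return (pid, exit status)."""
    # The child only sleeps; it stands in for a program that is "out of control".
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    time.sleep(0.5)  # give the child a moment to start

    # Equivalent to running: kill -s SIGTERM <pid>
    child.send_signal(signal.SIGTERM)
    child.wait(timeout=10)

    # On POSIX, a negative return code -N means "killed by signal N".
    return child.pid, child.returncode


if __name__ == "__main__":
    pid, status = terminate_child()
    print(f"PID {pid} ended with status {status}")
    if status == -signal.SIGTERM:
        print("The process was terminated by SIGTERM (signal 15).")
```

The status only says that a SIGTERM arrived, not who sent it, which is the same difficulty as with PID 2407 in the message above.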
If you are working on a cluster that distributes the MPI processes to the nodes with a resource manager (like SLURM, PBS or Torque), I would check whether the manager is limiting the memory footprint or the CPU time of the jobs accepted by the linux-4pel computer. It is tricky for me to figure out what could have asked your program to stop ... does it stop immediately or during a long run (CPU time?), with small jobs or large ones (memory?); is MPI running on a personal computer or a huge cluster (resource manager?); do you have sufficient privileges to look at /var/log/messages on linux-4pel?

1. The code stops running immediately.
2. The computers are my personal ones and no administrator has limited the 7.9 GiB of memory I have.
3. Run sequentially, the job takes 500-700 MiB of memory.
4. Looking at /var/log/messages after I executed the run, this is what it contained:

Jan 23 16:24:32 linux-jzqs gdm[2566]: GLib-CRITICAL: g_key_file_get_string: assertion `key_file != NULL' failed
Jan 23 16:24:32 linux-jzqs gdm[2566]: GLib-CRITICAL: g_key_file_get_string: assertion `key_file != NULL' failed
Jan 23 16:24:32 linux-jzqs gdm[2566]: GLib-CRITICAL: g_key_file_free: assertion `key_file != NULL' failed
Jan 23 16:24:33 linux-jzqs seahorse-agent[24718]: Failed to send buffer
Jan 23 16:24:33 linux-jzqs seahorse-agent[24718]: Failed to send buffer
Jan 23 16:24:35 linux-jzqs pulseaudio[24742]: main.c: This program is not intended to be run as root (unless --system is specified).
Jan 23 16:24:35 linux-jzqs pulseaudio[24742]: pid.c: Stale PID file, overwriting.
Jan 23 16:24:35 linux-jzqs pulseaudio[24743]: main.c: This program is not intended to be run as root (unless --system is specified).
Jan 23 16:24:35 linux-jzqs pulseaudio[24743]: pid.c: Daemon already running.
Jan 23 16:24:35 linux-jzqs pulseaudio[24743]: main.c: pa_pid_file_create() failed.
Jan 23 16:24:35 linux-jzqs pulseaudio[24745]: main.c: This program is not intended to be run as root (unless --system is specified).
Jan 23 16:24:35 linux-jzqs pulseaudio[24745]: pid.c: Daemon already running.
Jan 23 16:24:35 linux-jzqs pulseaudio[24745]: main.c: pa_pid_file_create() failed.
Jan 23 16:24:37 linux-jzqs gconfd (root-24630): Resolved address "xml:readwrite:/root/.gconf" to a writable configuration source at position 0
Jan 23 16:24:39 linux-jzqs kernel: CPU0 attaching NULL sched-domain.
Jan 23 16:24:39 linux-jzqs kernel: CPU1 attaching NULL sched-domain.
Jan 23 16:24:39 linux-jzqs kernel: CPU0 attaching sched-domain:
Jan 23 16:24:39 linux-jzqs kernel: domain 0: span 00000000,00000000,00000000,00000003
Jan 23 16:24:39 linux-jzqs kernel: groups: 00000000,00000000,00000000,00000001 00000000,00000000,00000000,00000002
Jan 23 16:24:39 linux-jzqs kernel: CPU1 attaching sched-domain:
Jan 23 16:24:39 linux-jzqs kernel: domain 0: span 00000000,00000000,00000000,00000003
Jan 23 16:24:39 linux-jzqs kernel: groups: 00000000,00000000,00000000,00000002 00000000,00000000,00000000,00000001
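
None of the entries in this excerpt mention the MPI processes, so one more thing that can be checked on a personal machine is the earlier question of whether a memory or CPU-time limit applies to the session that launches the job. The following is only a sketch, assuming Linux and Python's standard resource module; the function name describe_limits and the choice of which limits to print are my own, and a limit showing up here would be a hint, not proof of what sent the SIGTERM.

```python
import resource


def describe_limits():
    """Return the soft and hard limits of the current process as readable values."""
    # Limits most relevant to the "memory?" and "CPU time?" questions.
    wanted = {
        "address space (bytes)": resource.RLIMIT_AS,
        "data segment (bytes)": resource.RLIMIT_DATA,
        "CPU time (seconds)": resource.RLIMIT_CPU,
    }

    def readable(value):
        # RLIM_INFINITY means that no limit is set.
        return "unlimited" if value == resource.RLIM_INFINITY else value

    limits = {}
    for label, which in wanted.items():
        soft, hard = resource.getrlimit(which)
        limits[label] = (readable(soft), readable(hard))
    return limits


if __name__ == "__main__":
    for label, (soft, hard) in describe_limits().items():
        print(f"{label}: soft={soft}, hard={hard}")
```

These are the same per-process limits that "ulimit -a" reports in the shell, so the script should be run from the same terminal that starts the MPI job.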