I'm getting the following error with a new version of R, using Rmpi and
a few other modules. I've already had a couple of good suggestions from
this group about how to diagnose the cause of the fork error using
"strace" but we don't have it on our LSF Linux cluster. This is my
first use of R/mpi/parallel etc., so I am a bit naive. Also the code I'm
running involves random number generation so will always give slightly
different answers.
My normal routine is to:
a) try the code with a small number of iterations on my own
Linux/R/open-mpi pc using 8 cores, then
b) make the job bigger and run it on the cluster.
I only get the warning on the cluster, which suggests that it is caused by
something related to R, Rmpi, LSF and/or Open MPI.
Could someone suggest some rigorous R test-code that I could run on my
pc (ok if it takes some time), and then rerun on the cluster to confirm
that I get the same results, and thus that the warning is inconsequential?
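To make the request concrete, here is a minimal sketch of the kind of check
I have in mind. It assumes the "parallel" package that ships with recent R
(for nextRNGStream), and the seed value and the run_task function are made-up
stand-ins, not my real code. The idea is to give each task its own fixed
random stream so the answers do not depend on which worker runs which task.

```r
# Hypothetical reproducibility check: run on the pc and on the cluster,
# then compare the printed values.
library(parallel)            # ships with R; provides nextRNGStream()

RNGkind("L'Ecuyer-CMRG")     # generator that supports independent streams
set.seed(12345)              # arbitrary fixed seed (assumption)

n_tasks <- 8                 # fix the number of tasks, not the number of workers
streams <- vector("list", n_tasks)
s <- .Random.seed
for (i in seq_len(n_tasks)) {
  streams[[i]] <- s          # one seed per task
  s <- nextRNGStream(s)      # jump ahead to the next independent stream
}

# Hypothetical stand-in for the real simulation step
run_task <- function(seed) {
  assign(".Random.seed", seed, envir = .GlobalEnv)  # install this task's stream
  mean(rnorm(1e5))
}

# Serial here; the parallel apply call actually in use would replace vapply
results <- vapply(streams, run_task, numeric(1))
print(format(results, digits = 15))
```

If the values printed on the pc and on the cluster agree (to within
floating-point rounding), the random streams at least are reproducible.
Swapping vapply for the parallel apply call I normally use would then
exercise the MPI path as well.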
Thanks
Jim
=========================
An MPI process has executed an operation involving a call to the
"fork()" system call to create a child process. Open MPI is currently
operating in a condition that could result in memory corruption or
other system errors; your MPI job may hang, crash, or produce silent
data corruption. The use of fork() (or system() or other calls that
create child processes) is strongly discouraged.
The process that invoked fork was:
Local host: cn159.private.dns.zone (PID 12792)
MPI_COMM_WORLD rank: 7
If you are *absolutely sure* that your application will successfully
and correctly survive a call to fork(), you may disable this warning
by setting the mpi_warn_on_fork MCA parameter to 0.
--
Dr. Jim Maas
University of East Anglia