Hi Rizwan,

If you need to rewrite your fork system calls, you may want to check out
MPI's spawn functionality (MPI_Comm_spawn). I only recently found out about
it, and it is really useful if you haven't come across it already. I am
using it through Python's mpi4py and it seems to be working well.
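
In case it helps, here is roughly what that looks like with mpi4py (a
minimal, untested sketch; the file names manager.py and worker.py, the
worker count, and the data being passed around are placeholders, not
anything from your actual code):

# manager.py - spawn workers instead of calling fork()/system()
import sys
from mpi4py import MPI

# Spawn 4 extra Python processes running worker.py. Spawn() returns an
# intercommunicator connecting this process to the new workers.
comm = MPI.COMM_SELF.Spawn(sys.executable, args=["worker.py"], maxprocs=4)

comm.bcast(100, root=MPI.ROOT)             # send the same value to every worker
results = comm.gather(None, root=MPI.ROOT) # collect one result per worker
comm.Disconnect()
print(results)

# worker.py - started by Spawn(), talks back to the parent
from mpi4py import MPI

parent = MPI.Comm.Get_parent()
value = parent.bcast(None, root=0)                # receive the broadcast value
parent.gather(value + parent.Get_rank(), root=0)  # return one result each
parent.Disconnect()

You launch only the manager (e.g. mpiexec -n 1 python manager.py) and let
Spawn() start the workers.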

Best,
Jason

Jason Maldonis
Research Assistant of Professor Paul Voyles
Materials Science Grad Student
University of Wisconsin, Madison
1509 University Ave, Rm M142
Madison, WI 53706
maldo...@wisc.edu
608-295-5532

On Mon, Jun 20, 2016 at 8:38 AM, Gilles Gouaillardet <
gilles.gouaillar...@gmail.com> wrote:

> There is no guarantee that this will work on a multi-node job:
> TCP should be fine, but InfiniBand might not work.
>
> The best way to be on the safe side is to rewrite your MPI app so that it
> does not invoke the fork() system call. fork() is generally invoked either
> directly or via the "system" subroutine.
>
> Cheers,
>
> Gilles
>
> On Monday, June 20, 2016, Ahmed Rizwan <rizwan.ah...@aalto.fi> wrote:
>
>> Hi Gilles,
>>
>> Thanks for the support. :)
>>
>> This is a test which I am running on a single node, but I intend to run
>> calculations on multiple nodes. Do you mean it wouldn't work on multiple
>> nodes? If so, how can I avoid these errors when running on multiple
>> nodes? I will test it on multiple nodes as well.
>>
>> Regards,
>> Rizwan
>> ------------------------------
>> *From:* users [users-boun...@open-mpi.org] on behalf of Gilles
>> Gouaillardet [gilles.gouaillar...@gmail.com]
>> *Sent:* Monday, June 20, 2016 3:10 PM
>> *To:* Open MPI Users
>> *Subject:* [OMPI users] memory cg '(null)'
>>
>> There are two points here:
>> 1. slurmstepd is unable to put the processes into the '(null)' memory cgroup.
>>    At first glance, this looks more like a SLURM configuration issue.
>> 2. The MPI process is forking. Although forking now has much better support
>> than in the past, it might not always work, especially with fast interconnects.
>> Since you are running on a single node, you should be fine. Simply
>> export OMPI_MCA_mpi_warn_on_fork=0
>> before invoking srun in order to silence this message.
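>>
>> In your job script that would look something like this (a minimal sketch;
>> only the export line is added, the srun line is the one from your script):
>>
>> export OMPI_MCA_mpi_warn_on_fork=0
>> srun --mpi=pmi2 gpaw-python layers.py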
>>
>> Cheers,
>>
>> Gilles
>>
>> On Monday, June 20, 2016, Ahmed Rizwan <rizwan.ah...@aalto.fi> wrote:
>>
>>> Dear MPI users,
>>>
>>> I am getting the errors below when submitting/executing the following
>>> script:
>>>
>>> #!/bin/sh
>>> #SBATCH -p short
>>> #SBATCH -J layers
>>> #SBATCH -n 12
>>> #SBATCH -N 1
>>> #SBATCH -t 01:30:00
>>> #SBATCH --mem-per-cpu=2500
>>> #SBATCH --exclusive
>>> #SBATCH --mail-type=END
>>> #SBATCH --mail-user=rizwan.ah...@aalto.fi
>>> #SBATCH -o output_%j.out
>>> #SBATCH -e errors_%j.err
>>>
>>> srun --mpi=pmi2 gpaw-python layers.py
>>>
>>>
>>> --------------------------------------------------------------------------
>>> slurmstepd: error: task/cgroup: unable to add task[pid=126453] to memory
>>> cg '(null)'
>>> slurmstepd: error: task/cgroup: unable to add task[pid=80379] to memory
>>> cg '(null)'
>>> slurmstepd: error: task/cgroup: unable to add task[pid=124258] to memory
>>> cg '(null)'
>>> slurmstepd: error: task/cgroup: unable to add task[pid=124259] to memory
>>> cg '(null)'
>>> slurmstepd: error: task/cgroup: unable to add task[pid=124261] to memory
>>> cg '(null)'
>>> slurmstepd: error: task/cgroup: unable to add task[pid=124266] to memory
>>> cg '(null)'
>>> slurmstepd: error: task/cgroup: unable to add task[pid=124264] to memory
>>> cg '(null)'
>>> slurmstepd: error: task/cgroup: unable to add task[pid=124262] to memory
>>> cg '(null)'
>>> slurmstepd: error: task/cgroup: unable to add task[pid=124260] to memory
>>> cg '(null)'
>>> slurmstepd: error: task/cgroup: unable to add task[pid=124265] to memory
>>> cg '(null)'
>>> slurmstepd: error: task/cgroup: unable to add task[pid=124263] to memory
>>> cg '(null)'
>>>
>>> --------------------------------------------------------------------------
>>> An MPI process has executed an operation involving a call to the
>>> "fork()" system call to create a child process.  Open MPI is currently
>>> operating in a condition that could result in memory corruption or
>>> other system errors; your MPI job may hang, crash, or produce silent
>>> data corruption.  The use of fork() (or system() or other calls that
>>> create child processes) is strongly discouraged.
>>>
>>> The process that invoked fork was:
>>>
>>>   Local host:          pe38 (PID 80379)
>>>   MPI_COMM_WORLD rank: 1
>>>
>>> If you are *absolutely sure* that your application will successfully
>>> and correctly survive a call to fork(), you may disable this warning
>>> by setting the mpi_warn_on_fork MCA parameter to 0.
>>>
>>> --------------------------------------------------------------------------
>>>
>>> Is this error fatal, or can it be ignored? Thanks.
>>> Regards,
>>> Rizwan
>>>
>>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2016/06/29488.php
>
