Hmmm...tell you what. I'll add the ability for OMPI to set the limit to a 
user-specified level upon launch of each process. This will give you some 
protection and flexibility.
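
In the meantime, one possible workaround (just a sketch, assuming bash on the 
compute nodes; the script name and the 512 MB value are made up for 
illustration) is to launch each rank through a tiny wrapper that sets the limit 
before exec'ing the real binary:

  #!/bin/bash
  # set_stack.sh: raise the soft stack limit for this process and its children,
  # then replace the shell with the real application
  ulimit -s 524288   # value in kbytes, i.e. 512 MB
  exec "$@"

and then run something like:

  $ mpirun -np 4 ./set_stack.sh /opt/apps/abinit/bin/abinit < input.files >& output.log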

I forget, so please forgive the old man's fading memory - what version of OMPI 
are you using? I'll backport a patch for you.

On Apr 2, 2013, at 8:40 AM, Duke Nguyen <duke.li...@gmx.com> wrote:

> On 3/30/13 8:46 PM, Patrick Bégou wrote:
>> Ok, so your problem is identified as a stack size problem. I ran into these 
>> limitations using Intel Fortran compilers on large data problems.
>> 
>> First, it seems you can increase your stack size, since "ulimit -s unlimited" 
>> works (the system hard limit is not enforced). The best way is to put this 
>> setting in your .bashrc file so it will work on every node.
>> But setting it to unlimited may not be really safe. For example, if you hit a 
>> badly coded recursive function that calls itself without a stop condition, it 
>> can request all the system memory and crash the node. So it is safer to set a 
>> large but bounded value.
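>> 
>> For example (just a sketch; the 512 MB value is only illustrative, pick what 
>> fits your codes and your node memory), put something like this in ~/.bashrc:
>> 
>>   # cap the stack at 512 MB instead of unlimited (value in kbytes)
>>   ulimit -s 524288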
>> 
> 
> Now I feel the pain you mentioned :). With -s unlimited, some of our nodes now 
> go down completely and need a hard reset!!! (We never had a node go down like 
> that before, even with killed or badly coded jobs.)
> 
> Now looking for a safer ulimit -s value other than "unlimited"... :(
> 
>> I'm managing a cluster and I always set a maximum value for the stack size. I 
>> also limit the memory available per core for system stability. If a user 
>> requests only one of the 12 cores of a node, he can only access 1/12 of the 
>> node's memory. If he needs more memory he has to request 2 cores, even if he 
>> runs a sequential code. This avoids jobs with large memory requirements 
>> crashing other users' jobs on the same node. But this is not configured on 
>> your node.
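>> 
>> (If you have no batch system enforcing this for you, a rough approximation is 
>> a per-process address-space limit in /etc/security/limits.conf; the value 
>> below is only illustrative, roughly 2 GB per process on a 16 GB, 8-core node:
>> 
>>   * soft as 2000000
>>   * hard as 2000000
>> 
>> where "as" is the address-space limit in kbytes.)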
>> 
>> Duke Nguyen wrote:
>>> On 3/30/13 3:13 PM, Patrick Bégou wrote:
>>>> I do not know about your code but:
>>>> 
>>>> 1) did you check stack limits? Typically Intel Fortran codes need a large 
>>>> amount of stack when the problem size increases.
>>>> Check ulimit -a
>>> 
>>> This is the first time I've heard of stack limitations. Anyway, ulimit -a gives
>>> 
>>> $ ulimit -a
>>> core file size          (blocks, -c) 0
>>> data seg size           (kbytes, -d) unlimited
>>> scheduling priority             (-e) 0
>>> file size               (blocks, -f) unlimited
>>> pending signals                 (-i) 127368
>>> max locked memory       (kbytes, -l) unlimited
>>> max memory size         (kbytes, -m) unlimited
>>> open files                      (-n) 1024
>>> pipe size            (512 bytes, -p) 8
>>> POSIX message queues     (bytes, -q) 819200
>>> real-time priority              (-r) 0
>>> stack size              (kbytes, -s) 10240
>>> cpu time               (seconds, -t) unlimited
>>> max user processes              (-u) 1024
>>> virtual memory          (kbytes, -v) unlimited
>>> file locks                      (-x) unlimited
>>> 
>>> So the stack size is 10MB??? Could that be the problem? How do I change it?
>>> 
>>>> 
>>>> 2) does your node use cpusets or memory limits (e.g. fake NUMA) to cap the 
>>>> maximum amount of memory available to a job?
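>>>> 
>>>> (If you are not sure, a quick check, assuming a standard Linux node, is to 
>>>> look for a numa=fake option on the kernel command line and for mounted 
>>>> cpusets, e.g.
>>>> 
>>>>   $ cat /proc/cmdline
>>>>   $ grep cpuset /proc/mounts
>>>> 
>>>> If neither shows anything, such limits are probably not in place.)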
>>> 
>>> I don't really understand (this is also the first time I've heard of fake 
>>> NUMA), but I am pretty sure we do not have such things. The server I tried is 
>>> a dedicated server with two X5420 CPUs and 16GB of physical memory.
>>> 
>>>> 
>>>> Patrick
>>>> 
>>>> Duke Nguyen wrote:
>>>>> Hi folks,
>>>>> 
>>>>> I am sorry if this question has been asked before, but after ten days of 
>>>>> searching and working on the system, I surrender :(. We are trying to use 
>>>>> mpirun to run abinit (abinit.org), which in turn reads an input file to run 
>>>>> a simulation. The command is pretty simple:
>>>>> 
>>>>> $ mpirun -np 4 /opt/apps/abinit/bin/abinit < input.files >& output.log
>>>>> 
>>>>> We ran this command on a server with two quad-core X5420s and 16GB of 
>>>>> memory. I used only 4 cores, so I guess in theory each core should be able 
>>>>> to take up to 2GB.
>>>>> 
>>>>> In the output of the log, there is something about memory:
>>>>> 
>>>>> P This job should need less than                 717.175 Mbytes of memory.
>>>>>  Rough estimation (10% accuracy) of disk space for files :
>>>>>  WF disk file :     69.524 Mbytes ; DEN or POT disk file : 14.240 Mbytes.
>>>>> 
>>>>> So basically it reported that the above job should not take more than 
>>>>> 718MB per core.
>>>>> 
>>>>> But I still get a segmentation fault:
>>>>> 
>>>>> mpirun noticed that process rank 0 with PID 16099 on node biobos exited 
>>>>> on signal 11 (Segmentation fault).
>>>>> 
>>>>> The system limits are already set to unlimited:
>>>>> 
>>>>> $ cat /etc/security/limits.conf | grep -v '#'
>>>>> * soft memlock unlimited
>>>>> * hard memlock unlimited
>>>>> 
>>>>> I also tried to run
>>>>> 
>>>>> $ ulimit -l unlimited
>>>>> 
>>>>> before the mpirun command above, but it did not help at all.
>>>>> 
>>>>> If we adjust the parameters in input.files so that the reported memory per 
>>>>> core is less than 512MB, then the job runs fine.
>>>>> 
>>>>> Please help,
>>>>> 
>>>>> Thanks,
>>>>> 
>>>>> D.