Note that you were successful in changing the value on the right side of that error message, so you may just need to keep increasing it to a number large enough for the calculation, while, of course, checking that the total memory available on a node is enough. Sometimes I have done a representative test run with `sbatch --exclusive --mem=0 job.sh` and followed that job's memory usage closely: logging in and using ps or top while it runs, and/or using sacct afterwards to find the RSS (MaxRSS) value for the completed job, then rounding that up to use next time. I believe the --exclusive option would typically allocate an entire node to just that one job, and --mem=0 would effectively disable Slurm's memory limit. That depends on the Slurm setup, though...
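For example, the workflow might look something like this (the job script name and <JOBID> are placeholders; sstat/sacct output depends on how accounting is configured on your cluster):

    # Representative test run on a whole node, with the per-job memory
    # limit effectively disabled so peak usage is not cut short:
    sbatch --exclusive --mem=0 job.sh

    # While it runs (<JOBID> is whatever job ID sbatch printed;
    # .batch selects the batch step):
    sstat -j <JOBID>.batch --format=JobID,MaxRSS

    # After it completes, from the accounting records:
    sacct -j <JOBID> --format=JobID,JobName,State,Elapsed,MaxRSS

    # Round MaxRSS up with some headroom and use that for the real runs, e.g.:
    sbatch --mem=6G job.sh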
On Wed, Sep 6, 2017, 03:38 Sema Atasever <[email protected]> wrote:

> Dear Batsirai,
>
> I tried the line of code that you recommended, but it still generates an
> error, unfortunately.
>
> On Thu, Aug 24, 2017 at 5:19 PM, Batsirai Mabvakure <[email protected]> wrote:
>
>> Try:
>>
>> sbatch -J jobname --mem=18000 -D $(pwd) submit_job.sh
>>
>> From: Sema Atasever <[email protected]>
>> Reply-To: slurm-dev <[email protected]>
>> Date: Thursday 24 August 2017 at 15:58
>> To: slurm-dev <[email protected]>
>> Subject: [slurm-dev] Re: Exceeded job memory limit problem
>>
>> Dear Lev,
>>
>> I have already tried the --mem parameter with different values.
>>
>> For example:
>>
>> sbatch --mem=5GB submit_job.
>> sbatch --mem=18000 submit_job.
>>
>> but every time it gave the same error again, unfortunately.
>>
>> On Thu, Aug 24, 2017 at 2:32 AM, Lev Lafayette <[email protected]> wrote:
>>
>> On Wed, 2017-08-23 at 01:26 -0600, Sema Atasever wrote:
>>
>> > Computing predictions by SVM...
>> > slurmstepd: Job 3469 exceeded memory limit (4235584 > 2048000), being
>> > killed
>> > slurmstepd: Exceeded job memory limit
>> >
>> > How can I fix this problem?
>>
>> Error messages often give useful information. In this case you haven't
>> requested enough memory in your Slurm script.
>>
>> Memory can be set with the `#SBATCH --mem=[mem][M|G|T]` directive (entire
>> job) or `#SBATCH --mem-per-cpu=[mem][M|G|T]` (per core).
>>
>> As a rule of thumb, the maximum request per node should be based around
>> total cores -1 (for system processes).
>>
>> All the best,
>>
>> --
>> Lev Lafayette, BA (Hons), GradCertTerAdEd (Murdoch), GradCertPM, MBA
>> (Tech Mngmnt) (Chifley)
>> HPC Support and Training Officer +61383444193 +61432255208
>> Department of Infrastructure Services, University of Melbourne

--
Chris Harwell
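For reference, a minimal submission script using the directives Lev describes above might look like the following sketch (the job name, memory value, and final command are placeholders to adapt to the actual workload):

    #!/bin/bash
    #SBATCH --job-name=svm_predict      # placeholder job name
    #SBATCH --mem=8G                    # memory for the whole job; raise until it fits
    ##SBATCH --mem-per-cpu=2G           # alternative: per-core request (use one or the other)
    #SBATCH --cpus-per-task=1

    # Placeholder for the actual SVM prediction command.
    ./run_svm_predictions.sh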
