>alternatively, you can ask slurm not to limit VSZ: in cgroup.conf, have
>ConstrainSwapSpace=no
>this does not actually permit arbitrary VSZ, since there are mechanisms
>outside the cgroup limit that affect max VSZ (overcommit sysctls, swap space)

Hi Mark,
ConstrainSwapSpace=no or ConstrainSwapSpace=yes has no effect...

[root@hpc ~]$ cat /etc/slurm/cgroup.conf
CgroupAutomount=yes
CgroupReleaseAgentDir="/etc/slurm/cgroup"
ConstrainCores=no
ConstrainRAMSpace=no
ConstrainSwapSpace=no
[root@hpc ~]# systemctl restart slurmd
[root@hpc ~]# systemctl restart slurmdbd
[root@hpc ~]# systemctl restart slurmctld
[root@hpc ~]# su - shams
[shams@hpc ~]$ cat slurm_blast.sh
#!/bin/bash
#SBATCH --job-name=blast1
#SBATCH --output=my_blast.log
#SBATCH --partition=SEA
#SBATCH --account=fish
#SBATCH --mem=38GB
#SBATCH --nodelist=hpc
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=2
export PATH=~/ncbi-blast-2.9.0+/bin:$PATH
blastx -db ~/ncbi-blast-2.9.0+/bin/nr -query ~/khTrinityfilterless1.fasta -max_target_seqs 5 -outfmt 6 -evalue 1e-5 -num_threads 2
[shams@hpc ~]$ sbatch slurm_blast.sh
Submitted batch job 299
[shams@hpc ~]$ squeue
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
[shams@hpc ~]$ cat my_blast.log
Error memory mapping:/home/shams/ncbi-blast-2.9.0+/bin/nr.52.psq openedFilesCount=151 threadID=0
Error: NCBI C++ Exception:
    T0 "/home/coremake/release_build/build/PrepareRelease_Linux64-Centos_JSID_01_560232_130.14.18.6_9008__PrepareRelease_Linux64-Centos_1552331742/c++/compilers/unix/../../src/corelib/ncbiobj.cpp", line 981: Critical: ncbi::CObject::ThrowNullPointerException() - Attempt to access NULL pointer.
Stack trace:
  blastx ???:0 ncbi::CStackTraceImpl::CStackTraceImpl() offset=0x77 addr=0x1d95da7
  blastx ???:0 ncbi::CStackTrace::CStackTrace(std::string const&) offset=0x25 addr=0x1d98465
  blastx ???:0 ncbi::CException::x_GetStackTrace() offset=0xA0 addr=0x1ec7330
  blastx ???:0 ncbi::CException::SetSeverity(ncbi::EDiagSev) offset=0x49 addr=0x1ec2169
  blastx ???:0 ncbi::CObject::ThrowNullPointerException() offset=0x2D2 addr=0x1f42582
  blastx ???:0 ncbi::blast::CBlastTracebackSearch::Run() offset=0x61C addr=0xf2929c
  blastx ???:0 ncbi::blast::CLocalBlast::Run() offset=0x404 addr=0xed4684
  blastx ???:0 CBlastxApp::Run() offset=0xC9C addr=0x9cbf7c
  blastx ???:0 ncbi::CNcbiApplication::x_TryMain(ncbi::EAppDiagStream, char const*, int*, bool*) offset=0x8E3 addr=0x1da0e13
  blastx ???:0 ncbi::CNcbiApplication::AppMain(int, char const* const*, char const* const*, ncbi::EAppDiagStream, char const*, std::string const&) offset=0x782 addr=0x1d9f6b2
  blastx ???:0 main offset=0x5E5 addr=0x9caa05
  /lib64/libc.so.6 ???:0 __libc_start_main offset=0xF5 addr=0x7f6119e0e505
  blastx ???:0 blastx() [0x9ca345] offset=0x0 addr=0x9ca345

[shams@hpc ~]$ free -mh
              total        used        free      shared  buff/cache   available
Mem:            62G        9.2G        289M        277M         53G         52G
Swap:            9G        1.0M          9G

Please note that when I run the command directly, outside Slurm:

[shams@hpc ~]$ blastx -db ~/ncbi-blast-2.9.0+/bin/nr -query ~/khTrinityfilterless1.fasta -max_target_seqs 5 -outfmt 6 -evalue 1e-5 -num_threads 2

the top command shows the following memory values:

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
24449 shams     20   0   80.1g   1.3g   1.3g S 199.7  2.1   7:20.67 blastx

The RES value increases slightly over time, but VIRT is already 80GB at the beginning of the run.
Even if the --mem I specified (now 38GB) were too small, I would expect the error to be thrown in the middle of the run. However, when I use sbatch, the program terminates almost immediately.

Regards,
Mahmood