Dear Kilian,

thanks for pointing this out. I should have mentioned that I had already
browsed the cgroup.conf man page up and down but did not find any specific
hints on how to achieve the desired behavior. Maybe I am still missing
something obvious?
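For completeness, this is essentially the relevant part of my cgroup.conf
as it stands, trimmed to the lines that matter here. The commented lines
are the *Kmem* parameters from the man page that I could not obviously
relate to the page cache behavior (placeholders only, not values I have
actually tried):

--- snip ---
ConstrainCores=yes
ConstrainRAMSpace=yes
# Candidates from the man page I am unsure about:
#ConstrainKmemSpace=yes
#AllowedKmemSpace=...
#MinKmemSpace=...
--- snip ---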
Also, the kernel cgroups documentation indicates that page cache and
anonymous memory are both tied to userland memory [1]:

--- snip ---
While not completely water-tight, all major memory usages by a given
cgroup are tracked so that the total memory consumption can be accounted
and controlled to a reasonable extent. Currently, the following types of
memory usages are tracked.

- Userland memory - page cache and anonymous memory.
- Kernel data structures such as dentries and inodes.
- TCP socket buffers.
--- snip ---

That's why I'm somewhat unsure whether the *KmemSpace options in
cgroup.conf can address this issue. I guess my question simply boils down
to whether there is a Slurm-ish way to prevent active page caches from
being counted against the memory constraints when ConstrainRAMSpace=yes
is set?
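For what it's worth, the effect can also be watched directly in the job's
memory cgroup. The following is only a rough sketch, run from within a
job, and assumes the default cgroup v1 memory hierarchy that Slurm sets
up (the path layout may differ on other setups):

--- snip ---
# Find the memory cgroup this job step has been placed in
# (path layout is an assumption for a default cgroup v1 setup).
CG=/sys/fs/cgroup/memory$(awk -F: '$2=="memory" {print $3}' /proc/self/cgroup)

# Hard limit enforced by ConstrainRAMSpace=yes (in bytes)
cat $CG/memory.limit_in_bytes

# Breakdown of the charged memory into page cache and anonymous memory (in bytes)
grep -E '^(cache|rss) ' $CG/memory.stat
--- snip ---

If the page cache is charged in the way the documentation above suggests,
the "cache" value should grow right up to the configured limit.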
Best regards
Jürgen

[1] https://www.kernel.org/doc/html/v4.18/admin-guide/cgroup-v2.html

--
Jürgen Salk
Scientific Software & Compute Services (SSCS)
Kommunikations- und Informationszentrum (kiz)
Universität Ulm
Telefon: +49 (0)731 50-22478
Telefax: +49 (0)731 50-22471

* Kilian Cavalotti <kilian.cavalotti.w...@gmail.com> [190613 17:27]:
> Hi Jürgen,
>
> I would take a look at the various *KmemSpace options in
> cgroups.conf, they can certainly help with this.
>
> Cheers,
> --
> Kilian
>
> On Thu, Jun 13, 2019 at 2:41 PM Juergen Salk <juergen.s...@uni-ulm.de> wrote:
> >
> > Dear all,
> >
> > I'm just starting to get used to Slurm and play around with it in a
> > small test environment within our old cluster.
> >
> > For our next system we will probably have to abandon our current
> > exclusive user node access policy in favor of a shared user policy,
> > i.e. jobs from different users will then run side by side on the
> > same node at the same time. In order to prevent the jobs from
> > interfering with each other, I have set both ConstrainCores=yes and
> > ConstrainRAMSpace=yes in cgroups.conf, which works as expected for
> > limiting the memory of the processes to the value requested at job
> > submission (e.g. by the --mem=... option).
> >
> > However, I've noticed that ConstrainRAMSpace=yes also caps the
> > available page cache, for which the Linux kernel normally exploits
> > any unused areas of memory in a flexible way. This may result in a
> > significant performance impact, as we do have quite a number of
> > I/O-demanding applications (dominated by read operations) that are
> > known to benefit a lot from page caching.
> >
> > Here comes a small example to illustrate this issue. The job writes
> > a 16 GB file to a local scratch file system, measures the amount of
> > data cached in memory and then reads the file previously written.
> >
> > $ cat job.slurm
> > #!/bin/bash
> > #SBATCH --partition=standard
> > #SBATCH --nodes=1
> > #SBATCH --ntasks-per-node=1
> > #SBATCH --time=00:10:00
> >
> > # Get amount of data cached in memory before writing the file
> > cached1=`awk '$1=="Cached:" {print $2}' /proc/meminfo`
> >
> > # Write 16 GB file to local scratch SSD
> > dd if=/dev/zero of=$SCRATCH/testfile count=16 bs=1024M
> >
> > # Get amount of data cached in memory after writing the file
> > cached2=`awk '$1=="Cached:" {print $2}' /proc/meminfo`
> >
> > # Print difference of data cached in memory
> > echo -e "\nIncreased cached data by $(((cached2-cached1)/1000000)) GB\n"
> >
> > # Read the file previously written
> > dd if=$SCRATCH/testfile of=/dev/null count=16 bs=1024M
> > $
> >
> > For reference, this is the result *without* ConstrainRAMSpace=yes
> > set in cgroups.conf and submitted with `sbatch --mem=2G
> > --gres=scratch:16 job.slurm´:
> >
> > --- snip ---
> > 16+0 records in
> > 16+0 records out
> > 17179869184 bytes (17 GB) copied, 10.9839 s, 1.6 GB/s
> >
> > Increased cached data by 16 GB
> >
> > 16+0 records in
> > 16+0 records out
> > 17179869184 bytes (17 GB) copied, 5.03225 s, 3.4 GB/s
> > --- snip ---
> >
> > Note that 16 GB of data are cached and the read performance is
> > 3.4 GB/s, as the data is actually read from the page cache.
> >
> > And this is the result *with* ConstrainRAMSpace=yes set in
> > cgroups.conf and the job submitted with the very same command:
> >
> > --- snip ---
> > 16+0 records in
> > 16+0 records out
> > 17179869184 bytes (17 GB) copied, 13.3163 s, 1.3 GB/s
> >
> > Increased cached data by 1 GB
> >
> > 16+0 records in
> > 16+0 records out
> > 17179869184 bytes (17 GB) copied, 11.1098 s, 1.5 GB/s
> > --- snip ---
> >
> > Now only 1 GB of data has been cached (roughly the 2 GB requested
> > for the job minus the 1 GB allocated by the dd buffer), resulting in
> > a read performance degradation to 1.5 GB/s (compared to 3.4 GB/s as
> > above).
> >
> > Finally, this is the result *with* ConstrainRAMSpace=yes set in
> > cgroups.conf and the job submitted with `sbatch --mem=18G
> > --gres=scratch:16 job.slurm´:
> >
> > --- snip ---
> > 16+0 records in
> > 16+0 records out
> > 17179869184 bytes (17 GB) copied, 11.0601 s, 1.6 GB/s
> >
> > Increased cached data by 16 GB
> >
> > 16+0 records in
> > 16+0 records out
> > 17179869184 bytes (17 GB) copied, 5.01643 s, 3.4 GB/s
> > --- snip ---
> >
> > This is almost the same result as in the unconstrained case (i.e.
> > without ConstrainRAMSpace=yes set in cgroups.conf), as the amount of
> > memory requested for the job (18 GB) is large enough to allow the
> > file to be fully cached in memory.
> >
> > I do not think this is an issue with Slurm itself but rather how
> > cgroups are supposed to work. However, I wonder how others cope with
> > this.
> >
> > Maybe we have to teach our users to also consider the page cache
> > when requesting a certain amount of memory for their jobs?
> >
> > Any comment or idea would be highly appreciated.
> >
> > Thank you in advance.
> >
> > Best regards
> > Jürgen
> >
> > --
> > Jürgen Salk
> > Scientific Software & Compute Services (SSCS)
> > Kommunikations- und Informationszentrum (kiz)
> > Universität Ulm
> > Telefon: +49 (0)731 50-22478
> > Telefax: +49 (0)731 50-22471
>
> --
> Kilian

--
GPG A997BA7A | 87FC DA31 5F00 C885 0DC3 E28F BD0D 4B33 A997 BA7A