Thanks for the reply... I will look into how to configure it.

Sid Young
Translational Research Institute
On Wed, Jun 23, 2021 at 7:06 AM Prentice Bisbal <pbis...@pppl.gov> wrote:

> Yes,
>
> You need to use the cgroups plugin.
>
>
> On Fri, Jun 18, 2021, 12:29 AM Sid Young <sid.yo...@gmail.com> wrote:
>
>> G'Day all,
>>
>> I've had a question from a user of our new HPC, the following should
>> explain it:
>>
>> ➜ srun -N 1 --cpus-per-task 8 --time 01:00:00 --mem 2g --pty python3
>> Python 3.6.8 (default, Nov 16 2020, 16:55:22)
>> [GCC 4.8.5 20150623 (Red Hat 4.8.5-44)] on linux
>> Type "help", "copyright", "credits" or "license" for more information.
>> >>> import os
>> >>> os.cpu_count()
>> 256
>> >>> len(os.sched_getaffinity(0))
>> 256
>> >>>
>>
>> The output of os.cpu_count() is correct: there are 256 CPUs on the
>> server, but the output of len(os.sched_getaffinity(0)) is still 256 when I
>> was expecting it to be 8 - the number of CPUs this process is restricted
>> to. Is my slurm command incorrect? When I run a similar test on XXXXXX I
>> get the expected behaviour:
>>
>> ➜ qsub -I -l select=1:ncpus=4:mem=1gb
>> qsub: waiting for job 9616042.pbs to start
>> qsub: job 9616042.pbs ready
>> ➜ python3
>> Python 3.4.10 (default, Dec 13 2019, 16:20:47) [GCC] on linux
>> Type "help", "copyright", "credits" or "license" for more information.
>> >>> import os
>> >>> os.cpu_count()
>> 72
>> >>> len(os.sched_getaffinity(0))
>> 4
>> >>>
>>
>> This seems to be a problem for me as I have a program provided by a
>> third-party company that keeps trying to run with 256 threads and crashes.
>> The program is a compiled binary so I don't know if they're just grabbing
>> the number of CPUs or correctly getting the scheduler affinity, but it
>> seems as though TRI's HPC will return the total number of CPUs in any case.
>> There aren't any options with the program to set the number of threads
>> manually.
>>
>> My question to the group is what's causing this? Do I need a cgroups
>> plugin?
>>
>> I think these are the relevant lines from the slurm.conf file:
>>
>> SelectType=select/cons_res
>> SelectTypeParameters=CR_CPU_Memory
>> ReturnToService=1
>> CpuFreqGovernors=OnDemand,Performance,UserSpace
>> CpuFreqDef=Performance
>>
>>
>>
>>
>> Sid Young
>> Translational Research Institute
>>
>
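
For anyone wanting to check whether a job step is actually confined to its allocation, here is a minimal Python sketch based on the two calls shown in the session above. It is not taken from the thread: the helper names `is_confined` and `usable_cpu_count` are hypothetical, and it assumes Slurm exports `SLURM_CPUS_PER_TASK` when `--cpus-per-task` is given. It only reports what the process can see; it does nothing for a compiled binary that ignores affinity.

```python
import os


def affinity_size():
    """Number of CPUs this process is allowed to run on."""
    try:
        # Linux only: the set of CPUs the scheduler may place us on.
        return len(os.sched_getaffinity(0))
    except AttributeError:
        # Platforms without sched_getaffinity: fall back to the total.
        return os.cpu_count() or 1


def is_confined():
    """True if the affinity mask is smaller than the machine's CPU count."""
    total = os.cpu_count() or 1
    return affinity_size() < total


def usable_cpu_count(environ=None):
    """Best-effort thread count: the smaller of the Slurm request and the mask.

    Assumption: SLURM_CPUS_PER_TASK holds the --cpus-per-task value.
    """
    env = os.environ if environ is None else environ
    requested = env.get("SLURM_CPUS_PER_TASK", "")
    allowed = affinity_size()
    if requested.isdigit() and int(requested) > 0:
        return min(int(requested), allowed)
    return allowed


if __name__ == "__main__":
    print("total CPUs      :", os.cpu_count())
    print("affinity size   :", affinity_size())
    print("confined by mask:", is_confined())
    print("usable CPUs     :", usable_cpu_count())
```

Run under the same `srun ... --cpus-per-task 8` command, an unconfined step would report the same number for the total and the affinity size, as in the 256/256 output above; with confinement working, the affinity size should match the request.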