Re: [slurm-users] Disable --no-allocate support for a node/SlurmD
Hi,

> Ah okay, so your requirements include completely insulating (some) jobs from outside access, including root?

Correct.

> I've seen this kind of requirement e.g. when working with non-defaced medical data - generally a tough problem imo, because this level of data security seems more or less incompatible with the idea of a multi-user HPC system. I remember that this year's ZKI-AK Supercomputing spring meeting had Sebastian Krey from GWDG presenting the KISSKI ("KI-Servicezentrum für Sensible und Kritische Infrastrukturen", https://kisski.gwdg.de/ ) project, which works in this problem domain - are you involved in that? The setup with containerization and 'node hardening' sounds very similar to me.

Indeed. We (ZIH TU Dresden) are working together with Hendrik Nolte from GWDG to implement their concept of a "secure workflow on HPC" on our system. In short, the idea is to have nodes with additional (cryptographic) authentication of jobs. I'm just double-checking alternatives for some details which may allow an easier implementation of the concept.

> Re "preventing the scripts from running": I'd say it's about as easy as otherwise manipulating any job submission that goes through slurmctld (e.g. by editing slurm.conf), so without knowing your exact use case and requirements, I can't think of a simple solution.

The resource manager, i.e. slurmctld, and slurmd run on different machines. There is a local copy of slurm.conf for the slurmctld and for the node(s), i.e. slurmd, each using only the relevant parts. So slurmd doesn't care about the submit plugins, and slurmctld doesn't (need to) know about the Prolog, correct?

The idea in the workflow is that only the node itself needs to be considered secure, and access to the node is only possible via the slurmd running on it. That slurmd can be configured to always execute the Prolog (a local script) prior to each job and to deny execution if authentication fails. Circumventing this authentication would then require modifying the slurm.conf on that node, which has to be considered impossible anyway, since an attacker with that capability could also modify anything else (e.g. the Prolog itself, to remove the checks).

But the possibility of slurmd handling a --no-allocate job introduces a new way to circumvent the authentication. Using the slurm.conf of the slurmctld effectively only disables requests to the slurmd to skip the Prolog (i.e. the -Z flag), but if the slurmd somehow receives such a request, it will still handle it. So the security now additionally relies on the security of the resource manager. It would be more secure if the slurmd on those nodes could be configured to never skip the Prolog, even if the requesting user appears to be privileged. As the node can be rebooted from a read-only image prior to each job, the security of each job can then be ensured without any influence on the rest of the cluster.

So in summary: we don't want to trust the slurmctld (running somewhere else), but only the slurmd (running on the node), to always execute the Prolog. I hope that explains it well enough.

Kind regards,
Alex
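For readers who want to picture the setup described above, a minimal sketch of such a node-local configuration could look as follows. The slurm.conf options (Prolog, PrologFlags=Alloc) and the documented behaviour that a non-zero Prolog exit code keeps the job from running and drains the node are real Slurm features; the token path, public key location, and signing scheme are purely illustrative assumptions, not the actual GWDG/TU Dresden implementation.

slurm.conf fragment on the compute node:

    Prolog=/etc/slurm/prolog.d/verify_job.sh
    PrologFlags=Alloc        # run the Prolog at job allocation, before any step launches

/etc/slurm/prolog.d/verify_job.sh (hypothetical sketch):

    #!/bin/bash
    # Node-local job authentication: refuse any job that does not carry a valid
    # signature. If this script exits non-zero, slurmd does not run the job
    # (the job is requeued held and the node is drained).
    TOKEN=/secure/job-tokens/${SLURM_JOB_ID}.sig    # assumed drop location for the workflow's signature
    if ! echo -n "$SLURM_JOB_ID" | \
         openssl dgst -sha256 -verify /etc/slurm/workflow_pub.pem -signature "$TOKEN" \
         >/dev/null 2>&1; then
        logger -t job-auth "job $SLURM_JOB_ID failed authentication, refusing to run"
        exit 1
    fi
    exit 0

As Alex notes, a -Z/--no-allocate request that slurmd is still willing to honour would sidestep exactly this check, hence the wish for a node-side way to refuse such requests outright.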
[slurm-users] Slurm version 23.02.3 is now available
We are pleased to announce the availability of Slurm version 23.02.3.

The 23.02.3 release includes a number of fixes to Slurm stability, including potential slurmctld crashes when the backup slurmctld takes over. It also fixes some issues when using older versions of the command line tools with a 23.02 controller.

Slurm can be downloaded from https://www.schedmd.com/downloads.php .

-Tim

--
Tim McMullan
Release Management, Support, and Development
SchedMD LLC - Commercial Slurm Development and Support

* Changes in Slurm 23.02.3
==========================
 -- Fix regression in 23.02.2 that ignored the partition DefCpuPerGPU setting on the first pass of scheduling a job requesting --gpus --ntasks.
 -- openapi/dbv0.0.39/users - If a default account update failed, resulting in a no-op, the query returned success without any warning. Now a warning is sent back to the client that the default account wasn't modified.
 -- srun - fix issue creating regular and interactive steps because *_PACK_GROUP* environment variables were incorrectly set on non-HetSteps.
 -- Fix dynamic nodes getting stuck in allocated states when reconfiguring.
 -- Avoid job write lock when nodes are dynamically added/removed.
 -- burst_buffer/lua - allow jobs to get scheduled sooner after slurm_bb_data_in completes.
 -- mpi/pmix - fix regression introduced in 23.02.2 which caused PMIx shmem backed files permissions to be incorrect.
 -- api/submit - fix memory leaks when submission of batch regular jobs or batch HetJobs fails (response data is a return code).
 -- openapi/v0.0.39 - fix memory leak in _job_post_het_submit().
 -- Fix regression in 23.02.2 that set the SLURM_NTASKS environment variable in sbatch jobs from --ntasks-per-node when --ntasks was not requested.
 -- Fix regression in 23.02 that caused sbatch jobs to set the wrong number of tasks when requesting --ntasks-per-node without --ntasks, and also requesting one of the following options: --sockets-per-node, --cores-per-socket, --threads-per-core (or --hint=nomultithread), or -B,--extra-node-info.
 -- Fix double counting suspended job counts on nodes when reconfiguring, which prevented nodes with suspended jobs from being powered down or rebooted once the jobs completed.
 -- Fix backfill not scheduling jobs submitted with --prefer and --constraint properly.
 -- Avoid possible slurmctld segfault caused by race condition with already completed slurmdbd_conn connections.
 -- slurmdbd.conf - check that included conf files have 0600 permissions.
 -- slurmrestd - fix regression where "oversubscribe" fields were removed from job descriptions and submissions from v0.0.39 end points.
 -- accounting_storage/mysql - Query for individual QOS correctly when you have more than 10.
 -- Add warning message about ignoring --tres-per-tasks=license when used on a step.
 -- sshare - Fix command to work when using priority/basic.
 -- Avoid loading cli_filter plugins outside of salloc/sbatch/scron/srun. This fixes a number of missing symbol problems that can manifest for executables linked against libslurm (and not libslurmfull).
 -- Allow cloud_reg_addrs to update dynamically registered node's addrs on subsequent registrations.
 -- switch/hpe_slingshot - Fix hetjob components being assigned different vnis.
 -- Revert a change in 22.05.5 that prevented tasks from sharing a core if --cpus-per-task > threads per core, but caused incorrect accounting and cpu binding. Instead, --ntasks-per-core=1 may be requested to prevent tasks from sharing a core.
 -- Correctly send assoc_mgr lock to mcs plugin.
 -- Fix regression in 23.02 leading to error() messages being sent at INFO instead of ERR in syslog.
 -- switch/hpe_slingshot - Fix bad instant-on data due to incorrect parsing of data from jackaloped.
 -- Fix TresUsageIn[Tot|Ave] calculation for gres/gpumem and gres/gpuutil.
 -- Avoid unnecessary gres/gpumem and gres/gpuutil TRES position lookups.
 -- Fix issue in the gpu plugins where gpu frequencies would only be set if both gpu memory and gpu frequencies were set, while one or the other suffices.
 -- Fix reservations group ACLs not working with the root group.
 -- slurmctld - Fix backup slurmctld crash when it takes control multiple times.
 -- Fix updating a job with a ReqNodeList greater than the job's node count.
 -- Fix inadvertent permission denied error for --task-prolog and --task-epilog with filesystems mounted with root_squash.
 -- switch/hpe_slingshot - remove the unused vni_pids option.
 -- Fix missing detailed cpu and gres information in json/yaml output from scontrol, squeue and sinfo.
 -- Fix regression in 23.02 that causes a failure to allocate job steps that request --cpus-per-gpu and gpus with types.
 -- sacct - when printing PLANNED time, use end time instead of start time for jobs cancelled before they started.
 -- Fix potentially waiting indefinitely for a defunct proc
[slurm-users] Fwd: task/cgroup plugin causes "srun: error: task 0 launch failed: Plugin initialization failed" error on Ubuntu 22.04
Hi,

I am maintaining the SLURM cluster of my research group. Recently I updated to Ubuntu 22.04 and Slurm 21.08.5, and ever since, I am unable to launch jobs. When launching a job, I receive the following error:

$ srun --nodes=1 --ntasks-per-node=1 -c 1 --mem-per-cpu 1G --time=01:00:00 --pty -p amd -w cn02 --pty bash -i
srun: error: task 0 launch failed: Plugin initialization failed

Strangely, I cannot find any indication of this problem in the logs (find the logs attached). The problem must be related to the task/cgroup plugin, as it does not occur when I disable it.

After reading the documentation, I tried adding the cgroup_enable=memory swapaccount=1 kernel parameters, but the problem persisted.

I would be very grateful for any advice on where to look, since I have no idea how to investigate this issue further.

Thanks a lot in advance.

Best,
Tim

###
# Slurm cgroup support configuration file
###
CgroupAutomount=yes
CgroupMountpoint=/sys/fs/cgroup
ConstrainKmemSpace=no
ConstrainCores=yes
ConstrainRAMSpace=yes
ConstrainSwapSpace=yes
# This will be necessary for controlling GPU access
ConstrainDevices=yes

# slurmd -D -vv --conf-server nas:6817
slurmd: debug: Log file re-opened
slurmd: debug2: hwloc_topology_init
slurmd: debug2: hwloc_topology_load
slurmd: debug2: hwloc_topology_export_xml
slurmd: debug: CPUs:16 Boards:1 Sockets:1 CoresPerSocket:16 ThreadsPerCore:1
slurmd: debug4: CPU map[0]=>0 S:C:T 0:0:0
slurmd: debug4: CPU map[1]=>1 S:C:T 0:1:0
slurmd: debug4: CPU map[2]=>2 S:C:T 0:2:0
slurmd: debug4: CPU map[3]=>3 S:C:T 0:3:0
slurmd: debug4: CPU map[4]=>4 S:C:T 0:4:0
slurmd: debug4: CPU map[5]=>5 S:C:T 0:5:0
slurmd: debug4: CPU map[6]=>6 S:C:T 0:6:0
slurmd: debug4: CPU map[7]=>7 S:C:T 0:7:0
slurmd: debug4: CPU map[8]=>8 S:C:T 0:8:0
slurmd: debug4: CPU map[9]=>9 S:C:T 0:9:0
slurmd: debug4: CPU map[10]=>10 S:C:T 0:10:0
slurmd: debug4: CPU map[11]=>11 S:C:T 0:11:0
slurmd: debug4: CPU map[12]=>12 S:C:T 0:12:0
slurmd: debug4: CPU map[13]=>13 S:C:T 0:13:0
slurmd: debug4: CPU map[14]=>14 S:C:T 0:14:0
slurmd: debug4: CPU map[15]=>15 S:C:T 0:15:0
slurmd: debug3: _set_slurmd_spooldir: initializing slurmd spool directory `/var/spool/slurmd`
slurmd: debug2: hwloc_topology_init
slurmd: debug2: xcpuinfo_hwloc_topo_load: xml file (/var/spool/slurmd/hwloc_topo_whole.xml) found
slurmd: debug: CPUs:16 Boards:1 Sockets:1 CoresPerSocket:16 ThreadsPerCore:1
slurmd: debug4: CPU map[0]=>0 S:C:T 0:0:0
slurmd: debug4: CPU map[1]=>1 S:C:T 0:1:0
slurmd: debug4: CPU map[2]=>2 S:C:T 0:2:0
slurmd: debug4: CPU map[3]=>3 S:C:T 0:3:0
slurmd: debug4: CPU map[4]=>4 S:C:T 0:4:0
slurmd: debug4: CPU map[5]=>5 S:C:T 0:5:0
slurmd: debug4: CPU map[6]=>6 S:C:T 0:6:0
slurmd: debug4: CPU map[7]=>7 S:C:T 0:7:0
slurmd: debug4: CPU map[8]=>8 S:C:T 0:8:0
slurmd: debug4: CPU map[9]=>9 S:C:T 0:9:0
slurmd: debug4: CPU map[10]=>10 S:C:T 0:10:0
slurmd: debug4: CPU map[11]=>11 S:C:T 0:11:0
slurmd: debug4: CPU map[12]=>12 S:C:T 0:12:0
slurmd: debug4: CPU map[13]=>13 S:C:T 0:13:0
slurmd: debug4: CPU map[14]=>14 S:C:T 0:14:0
slurmd: debug4: CPU map[15]=>15 S:C:T 0:15:0
slurmd: debug3: Trying to load plugin /usr/lib/x86_64-linux-gnu/slurm-wlm/gres_gpu.so
slurmd: debug: gres/gpu: init: loaded
slurmd: debug3: Success.
slurmd: debug3: _merge_gres2: From gres.conf, using gpu:rtx2080:1:/dev/nvidia0
slurmd: debug3: Trying to load plugin /usr/lib/x86_64-linux-gnu/slurm-wlm/gpu_generic.so
slurmd: debug: gpu/generic: init: init: GPU Generic plugin loaded
slurmd: debug3: Success.
slurmd: debug3: gres_device_major : /dev/nvidia0 major 195, minor 0
slurmd: Gres Name=gpu Type=rtx2080 Count=1
slurmd: debug3: Trying to load plugin /usr/lib/x86_64-linux-gnu/slurm-wlm/topology_none.so
slurmd: topology/none: init: topology NONE plugin loaded
slurmd: debug3: Success.
slurmd: debug3: Trying to load plugin /usr/lib/x86_64-linux-gnu/slurm-wlm/route_default.so
slurmd: route/default: init: route default plugin loaded
slurmd: debug3: Success.
slurmd: debug2: Gathering cpu frequency information for 16 cpus
slurmd: debug: Resource spec: No specialized cores configured by default on this node
slurmd: debug: Resource spec: Reserved system memory limit not configured for this node
slurmd: debug3: NodeName    = cn02
slurmd: debug3: TopoAddr    = cn02
slurmd: debug3: TopoPattern = node
slurmd: debug3: ClusterName = iascluster
slurmd: debug3: Confile     = `/var/spool/slurmd/conf-cache/slurm.conf'
slurmd: debug3: Debug       = 5
slurmd: debug3: CPUs        = 16 (CF: 16, HW: 16)
slurmd: debug3: Boards      = 1 (CF: 1, HW: 1)
slurmd: debug3: Sockets     = 1 (CF: 1, HW: 1)
slurmd: debug3: Cores       = 16 (CF: 16, HW: 16)
slurmd: debug3: Threads     = 1 (CF: 1, HW: 1)
slurmd: debug3: UpTime      = 2377 = 00:39:37
slurmd: debug3: Block Map   = 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15
slurmd: debug3: Inverse Map = 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15
slurmd: debug3: RealMemory  = 64216
slurmd: debug3: TmpDisk     = 32108
slurmd: debug3: Epilog      =
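As a side note on the cgroup_enable=memory swapaccount=1 attempt mentioned above: it can be worth confirming on the node what actually ended up on the kernel command line and which cgroup controllers are exposed. These are generic checks (not taken from the original message), assuming a stock Ubuntu 22.04 node:

    # Did the added parameters make it onto the kernel command line?
    cat /proc/cmdline

    # On a cgroup v2 (unified) mount, the available controllers are listed here:
    cat /sys/fs/cgroup/cgroup.controllers

    # On a legacy/hybrid v1 layout you would instead see per-controller directories:
    ls -d /sys/fs/cgroup/memory 2>/dev/null

Neither parameter switches the node between cgroup v1 and v2; they only control whether the memory controller and its swap accounting are enabled, which would be consistent with them not helping here.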
[slurm-users] SLUG Early Bird Ends Tomorrow!
Early Bird registration for Slurm User Group 2023 ends tomorrow, Friday, June 16th!

This year's SLUG event will take place September 12th - 13th at Brigham Young University, with a Welcome Reception at the Provo Marriott Hotel and Conference Center on the evening of Monday, September 11th. Registration includes the Monday evening reception and both days of main conference activity.

SLUG Early Bird registration ends Friday, June 16th. Register now: https://www.eventbrite.com/e/631240546467

The SLUG 2023 Call for Papers also ends tomorrow. All interested parties should send an abstract to sl...@schedmd.com by EOD, Friday, June 16th.

SchedMD has secured a room block at the Provo Marriott and Conference Center at a discounted rate of 139 USD/night. This rate is good for check-in on September 11th and checkout on September 13th. The discount is available until Monday, August 14th on a first-come, first-served basis. Book your Provo Marriott stay now: https://www.marriott.com/events/start.mi?id=1685745988857&key=GRP

For more information on other hotels and travel, check out the SLUG registration page.

--
Victoria Hobson
Vice President of Marketing
SchedMD LLC
Re: [slurm-users] task/cgroup plugin causes "srun: error: task 0 launch failed: Plugin initialization failed" error on Ubuntu 22.04
I don’t have any direct advice off-hand, but I figure I will try to help steer the conversation in the right direction for figuring it out.

I’m going to assume that since you mention 21.08.5, you are using the slurm-wlm packages from the Ubuntu repos and not building Slurm yourself? And have all the components (slurmctld(s), slurmdbd, slurmd(s)) been upgraded as well?

The only thing that immediately comes to mind is that I remember reading a good bit about Ubuntu 22.04’s use of cgroups v2, which as I understand it is very different from cgroups v1, and plenty of people have had issues with v1/v2 mismatches with Slurm and other applications.

https://www.reddit.com/r/SLURM/comments/vjquih/error_cannot_find_cgroup_plugin_for_cgroupv2/
https://groups.google.com/g/slurm-users/c/0dJhe5r6_2Q?pli=1
https://discuss.linuxcontainers.org/t/after-updated-to-more-recent-ubuntu-version-with-cgroups-v2-ubuntu-16-04-container-is-not-working-properly/14022

Hope that at least steers the conversation in a good direction.

Reed

> On Jun 15, 2023, at 5:04 PM, Tim Schneider wrote:
>
> Hi,
> I am maintaining the SLURM cluster of my research group. Recently I updated to Ubuntu 22.04 and Slurm 21.08.5, and ever since, I am unable to launch jobs. When launching a job, I receive the following error:
>
> $ srun --nodes=1 --ntasks-per-node=1 -c 1 --mem-per-cpu 1G --time=01:00:00 --pty -p amd -w cn02 --pty bash -i
> srun: error: task 0 launch failed: Plugin initialization failed
>
> Strangely, I cannot find any indication of this problem in the logs (find the logs attached). The problem must be related to the task/cgroup plugin, as it does not occur when I disable it.
>
> After reading the documentation, I tried adding the cgroup_enable=memory swapaccount=1 kernel parameters, but the problem persisted.
>
> I would be very grateful for any advice on where to look, since I have no idea how to investigate this issue further.
>
> Thanks a lot in advance.
>
> Best,
> Tim
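A quick way to confirm which cgroup hierarchy a node actually boots with (a generic check, not from the original posts):

    stat -fc %T /sys/fs/cgroup
    # cgroup2fs -> unified cgroup v2 hierarchy (the Ubuntu 22.04 default)
    # tmpfs     -> legacy/hybrid cgroup v1 layout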
Re: [slurm-users] task/cgroup plugin causes "srun: error: task 0 launch failed: Plugin initialization failed" error on Ubuntu 22.04
Indeed, the issue seems to be that Ubuntu 22.04 no longer uses cgroups v1 by default. Does SLURM support cgroups v2? It seems so: https://slurm.schedmd.com/cgroup_v2.html

Abel

> On Jun 15, 2023, at 20:20, Reed Dier wrote:
>
> I don’t have any direct advice off-hand, but I figure I will try to help steer the conversation in the right direction for figuring it out.
>
> I’m going to assume that since you mention 21.08.5, you are using the slurm-wlm packages from the Ubuntu repos and not building Slurm yourself?
>
> And have all the components (slurmctld(s), slurmdbd, slurmd(s)) been upgraded as well?
>
> The only thing that immediately comes to mind is that I remember reading a good bit about Ubuntu 22.04’s use of cgroups v2, which as I understand it is very different from cgroups v1, and plenty of people have had issues with v1/v2 mismatches with Slurm and other applications.
>
> https://www.reddit.com/r/SLURM/comments/vjquih/error_cannot_find_cgroup_plugin_for_cgroupv2/
> https://groups.google.com/g/slurm-users/c/0dJhe5r6_2Q?pli=1
> https://discuss.linuxcontainers.org/t/after-updated-to-more-recent-ubuntu-version-with-cgroups-v2-ubuntu-16-04-container-is-not-working-properly/14022
>
> Hope that at least steers the conversation in a good direction.
>
> Reed
>
>> On Jun 15, 2023, at 5:04 PM, Tim Schneider wrote:
>>
>> Hi,
>> I am maintaining the SLURM cluster of my research group. Recently I updated to Ubuntu 22.04 and Slurm 21.08.5, and ever since, I am unable to launch jobs. When launching a job, I receive the following error:
>>
>> $ srun --nodes=1 --ntasks-per-node=1 -c 1 --mem-per-cpu 1G --time=01:00:00 --pty -p amd -w cn02 --pty bash -i
>> srun: error: task 0 launch failed: Plugin initialization failed
>>
>> Strangely, I cannot find any indication of this problem in the logs (find the logs attached). The problem must be related to the task/cgroup plugin, as it does not occur when I disable it.
>>
>> After reading the documentation, I tried adding the cgroup_enable=memory swapaccount=1 kernel parameters, but the problem persisted.
>>
>> I would be very grateful for any advice on where to look, since I have no idea how to investigate this issue further.
>>
>> Thanks a lot in advance.
>>
>> Best,
>> Tim
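For completeness, the two usual ways out of a v1/v2 mismatch like this are either to boot the node back into the legacy hierarchy or to move to a Slurm release that ships the cgroup/v2 plugin. A rough sketch only; the exact version requirements should be checked against the Slurm release notes and the cgroup_v2 page linked above:

    # Option 1: keep Slurm 21.08 and boot Ubuntu 22.04 back into the legacy cgroup v1
    # hierarchy, e.g. via /etc/default/grub followed by update-grub and a reboot:
    #   GRUB_CMDLINE_LINUX_DEFAULT="... systemd.unified_cgroup_hierarchy=0"

    # Option 2: upgrade to a Slurm release with cgroup v2 support (22.05 or newer)
    # and select the plugin explicitly in cgroup.conf:
    #   CgroupPlugin=cgroup/v2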