[slurm-users] Re: srun weirdness

2024-05-15 Thread Dj Merrill via slurm-users
I completely missed that, thank you! -Dj Laura Hild via slurm-users wrote: PropagateResourceLimitsExcept won't do it? Sarlo, Jeffrey S wrote: You might look at the PropagateResourceLimits and PropagateResourceLimitsExcept settings in slurm.conf -- slurm-users mailing list -- slurm-users@l

[slurm-users] Re: srun weirdness

2024-05-15 Thread Laura Hild via slurm-users
PropagateResourceLimitsExcept won't do it? From: Dj Merrill via slurm-users Sent: Wednesday, 15 May 2024 09:43 To: slurm-users@lists.schedmd.com Subject: [EXTERNAL] [slurm-users] Re: srun weirdness Thank you Hermann and Tom! That was it. The new cluster ha

[slurm-users] Re: Location of Slurm source packages?

2024-05-15 Thread Jeffrey Layton via slurm-users
Chris, Good to hear from you too (I need to post more often so I can see everyone). Thanks for the tip. I forgot about looking on the web. This is perfect. Thanks! Jeff On Wed, May 15, 2024 at 11:05 AM Christopher Samuel via slurm-users < slurm-users@lists.schedmd.com> wrote: > Hi Jeff! > >

[slurm-users] Re: Location of Slurm source packages?

2024-05-15 Thread Renfro, Michael via slurm-users
Forgot to add that Debian/Ubuntu packages are pretty much whatever version was stable at the time of the Debian/Ubuntu .0 release. They’ll backport security fixes to those older versions as needed, but they never change versions unless absolutely required. The backports repositories may have lo

[slurm-users] Re: Location of Slurm source packages?

2024-05-15 Thread Renfro, Michael via slurm-users
Debian/Ubuntu sources can always be found in at least two ways: 1. Pages like https://packages.ubuntu.com/jammy/slurm-wlm (see the .dsc, .orig.tar.gz, and .debian.tar.xz links there). 2. Commands like ‘apt-get source slurm-wlm’ (may require ‘dpkg-dev’ or other packages – probably easiest

[slurm-users] Re: Location of Slurm source packages?

2024-05-15 Thread Lloyd Brown via slurm-users
Jeff, Dang.  That's really old.  I'm not sure I would run one that old, to be honest.  Too many missing security fixes and added features.  It's never been that hard to do a 'git clone' and the normal configure/make/make install process with slurm. Someone else made me aware of this, in case

[slurm-users] Re: Location of Slurm source packages?

2024-05-15 Thread Christopher Samuel via slurm-users
Hi Jeff! On 5/15/24 10:35 am, Jeffrey Layton via slurm-users wrote: I have an Ubuntu 22.04 server where I installed Slurm from the Ubuntu packages. I now want to install pyxis but it says I need the Slurm sources. In Ubuntu 22.04, is there a package that has the source code? How to download t

[slurm-users] Re: Location of Slurm source packages?

2024-05-15 Thread Jeffrey Layton via slurm-users
Lloyd, Good to hear from you! I was hoping to avoid the use of git but that may be the only way. The version is 21.08.5. I checked the "old" packages from SchedMD and they begin part way through 2024 so that won't work. I'm very surprised Ubuntu let a package through without a source package for

[slurm-users] Re: Location of Slurm source packages?

2024-05-15 Thread Lloyd Brown via slurm-users
Jeff, I'm not sure what version is in the Ubuntu packages, as I don't think they're provided by SchedMD, and I'm having trouble finding the right one on packages.ubuntu.com.  Having said that, SchedMD is pretty good about using tags in their github repo (https://github.com/schedmd/slurm), to

[slurm-users] Location of Slurm source packages?

2024-05-15 Thread Jeffrey Layton via slurm-users
Good morning, I have an Ubuntu 22.04 server where I installed Slurm from the Ubuntu packages. I now want to install pyxis but it says I need the Slurm sources. In Ubuntu 22.04, is there a package that has the source code? How to download the sources I need from github? Thanks! Jeff -- slurm-us

[slurm-users] Re: srun weirdness

2024-05-15 Thread Dj Merrill via slurm-users
Thank you Hermann and Tom!  That was it. The new cluster has a virtual memory limit on the login host, and the old cluster did not. It doesn't look like there is any way to set a default to override the srun behaviour of passing those resource limits to the shell, so I may consider removing t

[slurm-users] Re: Job Invalid Account

2024-05-15 Thread joao.damas--- via slurm-users
Hey, I've just created a thread on something similar (https://lists.schedmd.com/mailman3/hyperkitty/list/slurm-users@lists.schedmd.com/message/MGV6YUIIIPFVUSZPBBXS3YG6BW5K553M/), but we have an extra "error" line. Maybe it's related? -- slurm-users mailing list -- slurm-users@lists.schedmd.com

[slurm-users] _refresh_assoc_mgr_qos_list: no new list given back keeping cached one

2024-05-15 Thread joao.damas--- via slurm-users
Hi all, We are doing a simple setup for a Slurm cluster (version 23.11.6). We follow the documentation and we are trying a setup still without accounting or slurmdbd. The slurm.conf is really simple: ``` ClusterName=Develop SlurmctldHost=head # Slurm configuration AuthType=auth/munge CryptoType

[slurm-users] Re: Slurm Cleaning Up $XDG_RUNTIME_DIR Before It Should?

2024-05-15 Thread Arnuld via slurm-users
Hi Ward, Thanks for replying. I tried these but the error is exactly the same (everything under "/shared" has permissions 777 and owned by "nobody:nogroup"): /etc/slurm/slurm.conf JobContainerType=job_container/tmpfs Prolog=/shared/SlurmScripts/prejob PrologFlags=contain /etc/slurm/job_container

[slurm-users] Re: srun weirdness

2024-05-15 Thread greent10--- via slurm-users
Hi, When we first migrated to Slurm from PBS, one of the strangest issues we hit was that ulimit settings are inherited from the submission host, which could explain the difference between ssh'ing into the machine (where the default ulimit is applied) and running a job via srun. You could u
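
The inheritance described above can be seen on a single machine without Slurm. The following is a minimal sketch, assuming a Linux host with Python 3; the helper name `child_core_limit` is hypothetical, and the core-file limit is used only because lowering it is harmless. It shows ordinary process inheritance of resource limits, not Slurm's own propagation mechanism, which carries the submission host's limits to the compute node.

```python
import resource
import subprocess
import sys


def child_core_limit(soft_limit):
    """Start a child process with a lowered soft core-file limit and
    return the soft limit that the child itself reports.

    Hypothetical helper for illustration; it does not involve Slurm.
    """
    def lower_limit():
        # Runs in the child after fork() and before exec(), so the
        # parent's own limits are left untouched.
        _, hard = resource.getrlimit(resource.RLIMIT_CORE)
        resource.setrlimit(resource.RLIMIT_CORE, (soft_limit, hard))

    result = subprocess.run(
        [sys.executable, "-c",
         "import resource; "
         "print(resource.getrlimit(resource.RLIMIT_CORE)[0])"],
        preexec_fn=lower_limit,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())


if __name__ == "__main__":
    # The child sees the limit that was in force where it was launched.
    print("child soft core limit:", child_core_limit(0))
```

A shell started by srun is in a similar position to the child here: it reports whatever limits were handed to it, which need not match the defaults a fresh ssh login on the same node would get.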

[slurm-users] Re: Slurm Cleaning Up $XDG_RUNTIME_DIR Before It Should?

2024-05-15 Thread Ward Poelmans via slurm-users
Hi, This is systemd, not slurm. We've also seen it being created and removed. As far as I understand, it is tied to the session, which systemd cleans up. We've worked around it by adding this to the prolog: MY_XDG_RUNTIME_DIR=/dev/shm/${USER} mkdir -p $MY_XDG_RUNTIME_DIR echo "export XDG_RUNTIME_DI

[slurm-users] Re: srun weirdness

2024-05-15 Thread Hermann Schwärzler via slurm-users
Hi Dj, could be a memory-limits related problem. What is the output of ulimit -l -m -v -s in both interactive job-shells? You are using cgroups-v1 now, right? In that case what is the respective content of /sys/fs/cgroup/memory/slurm_*/uid_$(id -u)/job_*/memory.limit_in_bytes in both shell
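
To compare the two shells side by side, the four limits named above can also be read programmatically. This is a rough sketch, assuming Python 3 on Linux; the names `LIMITS`, `fmt` and `report_limits` are made up for this example. Note that Python's `resource` module reports these values in bytes, while bash's `ulimit` typically prints them in KiB, so the numbers will differ by a factor of 1024.

```python
import resource

# ulimit flag -> (resource constant, description)
LIMITS = {
    "-l": (resource.RLIMIT_MEMLOCK, "max locked memory"),
    "-m": (resource.RLIMIT_RSS, "max resident set size"),
    "-v": (resource.RLIMIT_AS, "virtual memory"),
    "-s": (resource.RLIMIT_STACK, "stack size"),
}


def fmt(value):
    """Render a limit value the way ulimit does for 'no limit'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def report_limits():
    """Return one line per limit with its soft and hard values (bytes)."""
    lines = []
    for flag, (res, name) in LIMITS.items():
        soft, hard = resource.getrlimit(res)
        lines.append(f"{flag} {name}: soft={fmt(soft)} hard={fmt(hard)}")
    return lines


if __name__ == "__main__":
    print("\n".join(report_limits()))
```

Running the script once in a shell obtained through ssh and once in a shell started by srun, then diffing the two outputs, should show which limit differs between them.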

[slurm-users] Slurm Cleaning Up $XDG_RUNTIME_DIR Before It Should?

2024-05-15 Thread Arnuld via slurm-users
I am using the latest slurm. It runs fine for scripts. But if I give it a container then it kills it as soon as I submit the job. Is slurm cleaning up the $XDG_RUNTIME_DIR before it should? This is the log: [2024-05-15T08:00:35.143] [90.0] debug2: _generate_patterns: StepId=90.0 TaskId=-1 [2024-

[slurm-users] Best practice for jobs resuming from suspended state

2024-05-15 Thread Paul Jones via slurm-users
Hi, We use PreemptMode and PriorityTier within Slurm to suspend low priority jobs when more urgent work needs to be done. This generally works well, but on occasion resumed jobs fail to restart - which is to say Slurm sets the job status to running but the actual code doesn't recover from being