Re: [slurm-users] Slurmd enabled crash with CgroupV2

2023-05-23 Thread Alan Orth
ontrol > > > (And i'm not sure if this is something good to do ?) > > > If you have an idea how to correct this situation > > Have a nice day > > Thank you > > Tristan LEFEBVRE > > -- Alan Orth alan.o...@gmail.com https://picturingjordan.com https://englishbulgaria.net https://mjanja.ch

Re: [slurm-users] cgroups issue

2023-05-23 Thread Alan Orth
unable to set hierarchical accounting for > /slurm/uid_1000 > slurmstepd: error: Could not find task_cpuacct_cg, this should never happen > slurmstepd: error: Cannot get cgroup accounting data for 0 > > This happens for both batch and interactive jobs. > > any pointers will be most

Re: [slurm-users] Problems with cgroupsv2

2022-09-04 Thread Alan Orth
For what it's worth I've rolled back to cgroups v1 on CentOS Stream 8. I will be watching future SLURM release notes carefully to see if anything changes here, as well as to see people's experiences here on the list. Regards, On Wed, Aug 17, 2022 at 12:36 AM Alan Orth wrote:

Re: [slurm-users] Problems with cgroupsv2

2022-08-16 Thread Alan Orth
rors: > - /var/log/munge/munged.log > - sudo systemctl status munge > > If it's a munge error, usually restarting munge does the trick: > > sudo systemctl restart munge > > Regards > --Mick > -- > *From:* slurm-users on beh

Re: [slurm-users] Problems with cgroupsv2

2022-08-16 Thread Alan Orth
's going on... anyways, at least it's working now! Regards, On Tue, Aug 16, 2022 at 12:53 PM Alan Orth wrote: > Dear list, > > I've been using cgroupsv2 with SLURM 22.05 on CentOS Stream 8 successfully > for a few months now. Recently a few of my nodes have started h

[slurm-users] Problems with cgroupsv2

2022-08-16 Thread Alan Orth
And my slurm.conf has:
    ProctrackType=proctrack/cgroup
    TaskPlugin=task/affinity,task/cgroup
And cgroup.conf:
    CgroupAutomount=yes
    CgroupPlugin=autodetect
What else should I look for before giving up and reverting to cgroupsv1? My current version is 22.05.3, but it was happening in 22.05.2 as w
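With CgroupPlugin=autodetect, the plugin Slurm picks depends on what the node has mounted, so it can be worth confirming that first. A minimal sketch, assuming the usual mount point /sys/fs/cgroup: a unified (v2) hierarchy exposes a cgroup.controllers file at its root, while a legacy or hybrid layout does not. The function name cgroup_mode and its root parameter are illustrative only and are not part of Slurm.

```python
import os


def cgroup_mode(root="/sys/fs/cgroup"):
    """Guess which cgroup layout is mounted at root.

    Assumption: root is the cgroup mount point (commonly /sys/fs/cgroup).
    """
    if not os.path.isdir(root):
        # Nothing mounted at this path (or the path is wrong).
        return "unknown"
    # The unified v2 hierarchy has cgroup.controllers at its root.
    if os.path.isfile(os.path.join(root, "cgroup.controllers")):
        return "v2"
    # No such file at the root: legacy v1, or a hybrid layout.
    return "v1-or-hybrid"


if __name__ == "__main__":
    print(cgroup_mode())
```

Running this on a node that misbehaves and on one that works is a quick way to see whether the two actually share the same cgroup layout.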

[slurm-users] "Incompatible plugin version" after upgrade

2022-08-16 Thread Alan Orth
h context for K12 I'm running SLURM 22.05.3. The slurmctld is running on CentOS 7, and compute nodes are on CentOS Stream 8 (not sure if this matters?). Thanks for any advice, -- Alan Orth alan.o...@gmail.com https://picturingjordan.com https://englishbulgaria.net https://mjanja.ch

Re: [slurm-users] Slurm version 22.05 is now available

2022-05-31 Thread Alan Orth
php . > > - Tim > > -- > Tim Wickberg > Chief Technology Officer, SchedMD LLC > Commercial Slurm Development and Support > > -- Alan Orth alan.o...@gmail.com https://picturingjordan.com https://englishbulgaria.net https://mjanja.ch

[slurm-users] CentOS Stream 9

2022-03-07 Thread Alan Orth
e jump yet. My gut feeling is that there will be significantly more technical debt to pay off if we go directly to CentOS Stream 9. Thank you, ¹ https://indico.cern.ch/event/1070475/contributions/4511844/attachments/2309304/3929738/lfc03-20210915-NoNDA.pdf -- Alan Orth alan.o...@gmail

Re: [slurm-users] Issue with AccountingStoreFlags after SLURM 21.08.4 upgrade

2021-11-28 Thread Alan Orth
configuration option. Sorry for the confusion. A painful lesson to learn! Regards, On Sun, Nov 28, 2021 at 2:32 PM Alan Orth wrote: > Dear list, > > I just upgraded my cluster from SLURM 20.11.8 to 21.08.4. Before the > upgrade I updated my configuration based on this comment from

[slurm-users] Issue with AccountingStoreFlags after SLURM 21.08.4 upgrade

2021-11-28 Thread Alan Orth
uplicate jobid". *sigh*. What happened here? Is this a bug? This is the messiest SLURM upgrade I've had in years... thank you for any advice, ¹ https://github.com/SchedMD/slurm/blob/slurm-21.08/RELEASE_NOTES#L135 -- Alan Orth alan.o...@gmail.com https://picturingjordan.com https://englishbulgaria.net https://mjanja.ch

Re: [slurm-users] Suspending jobs for file system maintenance

2021-10-25 Thread Alan Orth
resume > > > > > > Finally bring back the partitions: > > > > > > # for p in foo bar baz; do scontrol update PartitionName=$p State=UP; > done > > > > > > Does that make sense? Is that common practice? Are there any caveats > that > > > we must think about? > > > > > > Thank you in advance for your thoughts. > > > > > > Best regards > > > Jürgen > > > > > -- Alan Orth alan.o...@gmail.com https://picturingjordan.com https://englishbulgaria.net https://mjanja.ch
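Only the tail of the quoted procedure is visible above (a resume step, then setting the partitions back to UP). A dry-run sketch of the whole sequence, assuming the elided earlier steps set the partitions DOWN and suspended the running jobs with scontrol suspend; that part is an assumption, not something shown in the snippet. The function name, the partition names, and the job IDs are hypothetical, and the code only builds command strings without executing anything.

```python
def maintenance_commands(partitions, job_ids):
    """Build (but do not run) scontrol commands for a maintenance window.

    Returns two lists: commands to issue before the maintenance,
    and commands to issue after it.
    """
    # Before: stop new jobs from starting, then suspend what is running.
    before = [f"scontrol update PartitionName={p} State=DOWN" for p in partitions]
    before += [f"scontrol suspend {j}" for j in job_ids]
    # After: resume the suspended jobs, then reopen the partitions.
    after = [f"scontrol resume {j}" for j in job_ids]
    after += [f"scontrol update PartitionName={p} State=UP" for p in partitions]
    return before, after


if __name__ == "__main__":
    # Hypothetical partition names and job IDs, for illustration only.
    before, after = maintenance_commands(["foo", "bar", "baz"], [1001, 1002])
    print("\n".join(before))
    print("# ... file system maintenance happens here ...")
    print("\n".join(after))
```

Printing the commands first and reviewing them by hand keeps the sketch safe to try on a production cluster.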

Re: [slurm-users] Cluster usage, filtered by partition

2021-05-26 Thread Alan Orth
officially > > abandonware. Too many things that can go wrong and can't be patched, > IMVHO. > > > > -- > > Diego Zuccato > > DIFA - Dip. di Fisica e Astronomia > > Servizi Informatici > > Alma Mater Studiorum - Università di Bologna > > V.le Berti

Re: [slurm-users] What is an easy way to prevent users run programs on the master/login node.

2021-05-19 Thread Alan Orth
> * hard rss 5000 > * hard data 5000 > * soft stack 4000 > * hard stack 5000 > * hard nproc 250 > > /Ole > > -- Alan Orth alan.o...@gmail.com https://picturingjordan.com https://englishbulgaria.net https://mjanja.ch

Re: [slurm-users] Compute node process monitoring tools updated

2021-01-19 Thread Alan Orth
> Ole Holm Nielsen > PhD, Senior HPC Officer > Department of Physics, Technical University of Denmark > > -- Alan Orth alan.o...@gmail.com https://picturingjordan.com https://englishbulgaria.net https://mjanja.ch

Re: [slurm-users] Do not upgrade mysql to 5.7.30!

2020-06-26 Thread Alan Orth
At the height of Oracle's hostility towards the open source community in 2014 Ubuntu CEO Mark Shuttleworth announced that Ubuntu 14.04 would keep using MySQL, even after Debian itself (and other distros) switched to MariaDB. https://www.zdnet.com/article/shuttleworth-says-ubuntu-is-sticking-with-m

Re: [slurm-users] Issue with x11

2019-05-17 Thread Alan Orth
s to understand this error. There they mention it possibly being related to the new X11 code in SLURM 18.08. Regards, ¹ https://bugs.schedmd.com/show_bug.cgi?id=6307 On Thu, May 16, 2019 at 7:02 PM Christopher Samuel wrote: > On 5/16/19 1:04 AM, Alan Orth wrote: > > > but now we get

Re: [slurm-users] Issue with x11

2019-05-16 Thread Alan Orth
All the best, > Chris > -- > Chris Samuel : http://www.csamuel.org/ : Berkeley, CA, USA > > -- Alan Orth alan.o...@gmail.com https://picturingjordan.com https://englishbulgaria.net https://mjanja.ch "In heaven all the interesting people are missing." ―Friedrich Nietzsche

Re: [slurm-users] ntpd or chrony?

2018-02-19 Thread Alan Orth
our ecosystem is collapsing, yet we are still here — and we > > are creative agents who can shape our destinies. Apocalyptic civics is > > the conviction that the only way out is through, and the only way > > through is together. " > > > > /Greg Bloom/ @greggish > > https://twitter.com/greggish/status/873177525903609857 > > -- Alan Orth alan.o...@gmail.com https://picturingjordan.com https://englishbulgaria.net https://mjanja.ch

Re: [slurm-users] Strange problem with Slurm 17.11.0: "batch job complete failure"

2018-02-04 Thread Alan Orth
> execve job > > drain_nodes: node node048 state set to DRAIN > > If anyone can shine some light on where I should start looking, I shall be > most obliged! > > Andy > > -- > Andy Riebs andy.ri...@hpe.com > Hewlett-Packard Enterprise > High Performance Computing Software Engineering +1 404 648 9024 > My opinions are not necessarily those of HPE > May the source be with you! > > > > -- Alan Orth alan.o...@gmail.com https://picturingjordan.com https://englishbulgaria.net https://mjanja.ch

Re: [slurm-users] Missing systemd unit files in SLURM 17.11.0 RPMs

2017-11-30 Thread Alan Orth
Dear Ole, You are absolutely right! Thank you for pointing this out. I hadn't noticed the RPMs were re-arranged so much as of 17.11. Thanks again, On Thu, Nov 30, 2017 at 4:04 PM Ole Holm Nielsen wrote: > On 11/30/2017 01:40 PM, Alan Orth wrote: > > I just built SLURM 17.11.0

[slurm-users] Missing systemd unit files in SLURM 17.11.0 RPMs

2017-11-30 Thread Alan Orth
86_64.rpm | egrep "init.d|service$"
/etc/init.d/slurm
/usr/lib/systemd/system/slurmctld.service
/usr/lib/systemd/system/slurmd.service
I see the service files in the source tarball. Is this a problem with the spec file in 17.11.0 or perhaps something wrong in my build environment