Re: [slurm-users] slurm-wlm package OpenMPI PMIx implementation

2020-12-10 Thread Avery Grieve
Hey Luke, Just wanted to let you know that your tips helped a lot and I now have srun able to call openmpi as required. Took a bit of finagling with the Unit files but it does indeed work. I've got some warning messages about "undefined symbol: pmix_cb_t_class" but they are ignored so they're mo

Re: [slurm-users] Extremely sluggish squeue -p partition

2020-12-10 Thread Williams, Jenny Avis
We do have one partition that uses AllowGroups instead of AllowAccounts. Testing with that partition closed did not change things. This started Dec 2nd or 3rd - I noticed it on the 3rd. From: slurm-users On Behalf Of Williams, Jenny Avis Sent: Monday, December 7, 2020 11:43 PM To: Slurm User

[slurm-users] Slurm versions 20.11.1 is now available

2020-12-10 Thread Tim Wickberg
We are pleased to announce the availability of Slurm version 20.11.1. This includes a number of fixes made in the month since 20.11 was initially released, including critical fixes to nss_slurm and the Perl API when used with the newer configless mode of operation. Slurm can be downloaded fro

[slurm-users] Database backup best practices

2020-12-10 Thread Ole Holm Nielsen
Backing up the Slurm database is important for disaster recovery or migration of the database, but this is one of those Slurm tasks which isn't well documented. I would like to find out some best practices, and I'm sure there's a lot of experience on this in the Slurm community. I've been wor

Re: [slurm-users] slurm-wlm package OpenMPI PMIx implementation

2020-12-10 Thread Gennaro Oliva
Hi Avery, On Thu, Dec 10, 2020 at 10:51:34AM -0500, Avery Grieve wrote: > Not the case, unfortunately as slurm doesn't have any idea what pmix_v3 > means without being compiled against it I guess. I have also attempted to > compile openmpi from source with the --with-pmi option but the slurm-wlm >

Re: [slurm-users] slurm-wlm package OpenMPI PMIx implementation

2020-12-10 Thread Avery Grieve
Hey Chris, No code to test -- mpi works just not with slurm to call it. Thanks for your help, I think I'm going to be recompiling from source. ~Avery Grieve They/Them/Theirs please! University of Michigan On Thu, Dec 10, 2020 at 12:58 PM Christopher J Cawley wrote: > Hi Avery - > > No worries

Re: [slurm-users] slurm-wlm package OpenMPI PMIx implementation

2020-12-10 Thread Avery Grieve
Luke, Thanks again, I'll have a look at your links here. I think I'm going to have to go and compile slurm from source. Hopefully it goes better than it did last time! I'm just trying to get myself a functional test cluster, really. Technically I already have that by logging into my compute node

Re: [slurm-users] slurm-wlm package OpenMPI PMIx implementation

2020-12-10 Thread Luke Yeager
The ubuntu package is here: https://packages.ubuntu.com/focal/libpmix-dev Yes, we rewrote the service files (see here) and we let debhelper install them to the appropriate location. It seems

Re: [slurm-users] slurm-wlm package OpenMPI PMIx implementation

2020-12-10 Thread Avery Grieve
Hey Luke, Thanks for the response. I should have mentioned I'm on debian. What's the name of the ubuntu package for pmix? I'll see if I can track down the debian equivalent. When you build slurm from scratch you have to place the .service files into /etc/init.d and the daemon files in /etc/system

Re: [slurm-users] slurm-wlm package OpenMPI PMIx implementation

2020-12-10 Thread Luke Yeager
Hi Avery, * pmix: we just use the standard Ubuntu packages on 20.04. Unfortunately the standard packages on 18.04 are too out of date for us. * openmpi: we build our own, using ./configure --with-pmix=internal … * slurm: we build our own, using ./configure --with-pmix=PATH … (see he

Re: [slurm-users] How to assign temporary priority bonuses or penalties?

2020-12-10 Thread Alex Chekholko
Hi Luke, Yes, I think your request is unusual. I believe in the past there have been a number of middle-wares that helped with this kind of bureaucracy, things like http://docs.adaptivecomputing.com/gold/ Regards, Alex On Thu, Dec 10, 2020 at 9:23 AM Luke Yeager wrote: > (originally posted at

Re: [slurm-users] slurm-wlm package OpenMPI PMIx implementation

2020-12-10 Thread Christopher J Cawley
Hi Avery - No worries, I took all of the defaults for software, etc. Jetsons are locked into ubuntu 18.04.5, slurm 19.05.7, etc. Do you have code to test? I'm more of a python person. Thanks Chris Christopher J. Cawley Systems Engineer/Linux Engineer, Information Technology Services 223 A

[slurm-users] How to assign temporary priority bonuses or penalties?

2020-12-10 Thread Luke Yeager
(originally posted at https://bugs.schedmd.com/show_bug.cgi?id=10322) There are some great tools for assigning discounts or penalties to jobs before they are allocated resources (QOS.UsageFactor, Partition.TRESBillingWeights, etc.). But what if I want to change the cost of a job after the fact?

Re: [slurm-users] slurm-wlm package OpenMPI PMIx implementation

2020-12-10 Thread Avery Grieve
Oop, sorry I meant to also include the following: # srun --mpi=list srun: MPI types are... srun: none srun: pmi2 srun: openmpi running srun with --mpi=openmpi gives the same errors as with MpiDefault=none. ~Avery Grieve They/Them/Theirs please! University of Michigan On Thu, Dec 10, 2020 at 11
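A quick way to script the check shown in the message above is to scan the `srun --mpi=list` output for a given plugin name. This is only a sketch: the helper name `has_mpi_type` is hypothetical, and it assumes the list is printed one type per line, optionally prefixed with `srun:` as in the quoted output.

```shell
# has_mpi_type NAME (hypothetical helper): reads `srun --mpi=list` output on
# stdin and succeeds only if NAME is one of the listed MPI plugin types.
has_mpi_type() {
    # Strip an optional "srun:" prefix and leading whitespace from each line,
    # then require an exact whole-line match on the plugin name.
    sed -e 's/^srun:[[:space:]]*//' -e 's/^[[:space:]]*//' | grep -qx -- "$1"
}

# Usage on a live system. The list may be written to stderr, so both
# streams are merged before piping:
#   srun --mpi=list 2>&1 | has_mpi_type pmix || echo "no pmix plugin available"
```

With the output quoted above (none, pmi2, openmpi), a check for `pmix` would fail, which matches the plugin-not-found error reported with MpiDefault=pmix.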

Re: [slurm-users] slurm-wlm package OpenMPI PMIx implementation

2020-12-10 Thread Avery Grieve
Hi Chris, Thank you for the offer. Here's some quick information on my system: All nodes on Debian 10 (armbian buster converted to DietPi v6.33.3). sinfo --version: slurm-wlm 18.08.5-2 With MpiDefault=pmix I get the following srun errors: srun: error: Couldn't find the specified plugin name for

Re: [slurm-users] slurm-wlm package OpenMPI PMIx implementation

2020-12-10 Thread Christopher J Cawley
I have a 7 node jetson nano cluster running at home. Send me what you want me to take a look at. If it's not a big deal, then I can let you know. Ubuntu 18 / slurm Thanks Chris Christopher J. Cawley Systems Engineer/Linux Engineer, Information Technology Services 223 Aquia Building, Ffx,

[slurm-users] slurm-wlm package OpenMPI PMIx implementation

2020-12-10 Thread Avery Grieve
Hi Forum, I've been putting together an ARM cluster for fun/learning and I've been a bit lost about how to get OpenMPI and slurm to behave together. I have installed the slurm-wlm package from the Debian apt search and compiled OpenMPI from source on

Re: [slurm-users] Backfill pushing jobs back

2020-12-10 Thread David Baker
Hi Chris, Thank you for your reply. It isn't long since we upgraded to Slurm v19; however, it sounds like we should start to actively look at v20 since this issue is causing significant problems on our cluster. We'll download and install v20 on our dev cluster, and experiment. Best regards, Dav