[slurm-users] slurmdbd archive format

2024-05-28 Thread O'Neal, Doug (NIH/NCI) [C] via slurm-users
to a secondary MariaDB instance, but that train has passed. The format of the archive files is not well documented. Does anyone have a program (python/C/whatever) that will read a job_table_archive file and decode it into a parsable structure? Douglas O'Neal, Ph.D. (contractor) Manager

[slurm-users] Couldn't find the specified plugin name for auth/munge looking at all files

2023-10-29 Thread C
Hello, I need to use SLURM for a project. I installed it following this quick start guide (https://ibmimaster.cs.uni-tuebingen.de/quickstart_admin.html). First I just want to run it on one cluster. - I did steps 1 to 7, created the slurm user with my slurm binaries as home dir - created the necessary d

Re: [slurm-users] slurm array with non-numeric index values

2020-07-15 Thread c b
Script, > Multiple Datasets). We eventually wrote an abstract utility to try to help > them with the process: > > > https://github.com/jtfrey/job-templating-tool > > > > May be of some use to you. > > > > > On Jul 15, 2020, at 16:13 , c b wrote: > > I'

[slurm-users] slurm array with non-numeric index values

2020-07-15 Thread c b
I'm trying to run an embarrassingly parallel experiment, with 500+ tasks that all differ in one parameter. e.g.: job 1 - script.py foo job 2 - script.py bar job 3 - script.py baz and so on. This seems like a case where having a slurm array hold all of these jobs would help, so I could just submi
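
One common way to handle this kind of case is to keep the array index numeric and look the real parameter up from a list inside the job. The following is a minimal sketch, not taken from the thread: the `PARAMS` list and the `param_for_task` helper are hypothetical names for illustration, and it assumes the array is submitted with zero-based indices. `SLURM_ARRAY_TASK_ID` is the environment variable Slurm sets for each array task.

```python
import os

# Hypothetical parameter list for illustration; in practice this could be
# read from a text file with one parameter per line.
PARAMS = ["foo", "bar", "baz"]


def param_for_task(task_id, params=PARAMS):
    """Return the parameter that belongs to a zero-based array task index."""
    if not 0 <= task_id < len(params):
        raise IndexError(f"task id {task_id} has no matching parameter")
    return params[task_id]


if __name__ == "__main__":
    # Slurm sets SLURM_ARRAY_TASK_ID for each task of an array job.
    # Falling back to 0 lets the script be tried outside of Slurm.
    task_id = int(os.environ.get("SLURM_ARRAY_TASK_ID", "0"))
    print(param_for_task(task_id))
```

Under these assumptions the array would be submitted with something like `sbatch --array=0-2` around a batch script that calls this lookup and passes the result to `script.py`, so a single submission covers every parameter.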

[slurm-users] slurm status says jobs are running but they aren't

2020-03-02 Thread c b
Hi, I have a bunch of jobs that according to the slurm status have been running for 30+ minutes, but in reality aren't running. When I go to the node where the job is supposed to be, the processes aren't there (not showing up in top or ps) and the job's stdout/stderr logs are empty. I know it's

Re: [slurm-users] job priority keeping resources from being used?

2019-11-05 Thread c b
running simultaneously on each machine. thanks > Best regards > Jürgen > > -- > Jürgen Salk > Scientific Software & Compute Services (SSCS) > Kommunikations- und Informationszentrum (kiz) > Universität Ulm > Telefon: +49 (0)731 50-22478 > Telefax: +49 (0)731 50

Re: [slurm-users] job priority keeping resources from being used?

2019-11-01 Thread c b
ailable as far as slurm is concerned. > > Brian > On 11/1/2019 10:52 AM, c b wrote: > > yes, there is enough memory for each of these jobs, and there is enough > memory to run the high resource and low resource jobs at the same time. > > On Fri, Nov 1, 2019 at 1:37 PM Brian Andrus

Re: [slurm-users] job priority keeping resources from being used?

2019-11-01 Thread c b
e isn't enough memory available for it. > > Brian Andrus > On 11/1/2019 7:42 AM, c b wrote: > > I have: > SelectType=select/cons_res > SelectTypeParameters=CR_CPU_Memory > > On Fri, Nov 1, 2019 at 10:39 AM Mark Hahn wrote: > >> > In theory, these sm

Re: [slurm-users] job priority keeping resources from being used?

2019-11-01 Thread c b
I tried setting a 5 minute time limit on some low resource jobs, and one hour on high resource jobs, but my 5 minute jobs are still waiting behind the hourlong jobs. Can you suggest some combination of time limits that would work here? On Fri, Nov 1, 2019 at 11:08 AM c b wrote: > On my

Re: [slurm-users] job priority keeping resources from being used?

2019-11-01 Thread c b
rm knows, > the low priority jobs will take longer to finish than just waiting for the > current running jobs to finish. > > > > John > > > > > > *From: *slurm-users on behalf of > c b > *Reply-To: *Slurm User Community List > *Date: *Friday, November 1,

Re: [slurm-users] job priority keeping resources from being used?

2019-11-01 Thread c b
I have: SelectType=select/cons_res SelectTypeParameters=CR_CPU_Memory On Fri, Nov 1, 2019 at 10:39 AM Mark Hahn wrote: > > In theory, these small jobs could slip in and run alongside the large > jobs, > > what are your SelectType and SelectTypeParameters settings? > ExclusiveUser=YES on partitio

[slurm-users] job priority keeping resources from being used?

2019-11-01 Thread c b
Hi, Apologies for the weird subject line...I don't know how else to describe what I'm seeing. Suppose my cluster has machines with 8 cores each. I have many large high priority jobs that each require 6 cores, so each machine in my cluster runs one of each of these jobs at a time. However, I als

[slurm-users] understanding resource reservations

2019-10-21 Thread c b
he cluster, and on some other machines just restrict the cores allocated to slurm. For example, I want machine A to be unavailable to slurm from 9am-5pm Monday-Friday, machine B to only have 50% of its cores available during this time, but machine C to be 100% available at all times. It sounds lik

[slurm-users] GPU + no_consume

2018-07-10 Thread Félix C. Morency
son, SLURM doesn't allow access to the devices Jul 10 13:54:23 imk-dl-01 slurmstepd[2232]: debug: Not allowing access to device c 195:0 rwm(/dev/nvidia0) for job Jul 10 13:54:23 imk-dl-01 slurmstepd[2232]: debug: Not allowing access to device c 195:1 rwm(/dev/nvidia1) for job Jul 10 13:

Re: [slurm-users] Queue size, slow/unresponsive head node

2018-01-11 Thread Nicholas C Santucci
Why do you have? SchedulerParameters = (null) Is that even allowed? https://slurm.schedmd.com/sched_config.html On Thu, Jan 11, 2018 at 1:39 PM, Colas Rivière wrote: > Hello, > > I'm managing a small cluster (one head node, 24 workers, 1160 total worker > threads). The head node has t