Re: [slurm-users] [EXTERNAL] Re: slurmdbd database usage

2023-08-02 Thread Greg Wickham
Yup – Slurm is specifically tied to MySQL/MariaDB. To get around this I wrote a C++ application that will extract job records from Slurm using “sacct” and write them into a PostgreSQL database. https://gitlab.com/greg.wickham/sminer The schema used in PostgreSQL is more conduci

Re: [slurm-users] Unconfigured GPUs being allocated

2023-08-02 Thread Christopher Samuel
On 7/14/23 1:10 pm, Wilson, Steven M wrote: It's not so much whether a job may or may not access the GPU but rather which GPU(s) is(are) included in $CUDA_VISIBLE_DEVICES. That is what controls what our CUDA jobs can see and therefore use (within any cgroups constraints, of course). In my case

Re: [slurm-users] slurmdbd database usage

2023-08-02 Thread Christopher Samuel
On 8/2/23 2:30 pm, Sandor wrote: I am looking to track accounting and job data. Slurm requires the use of MySQL or MariaDB. Has anyone created the needed tables within PostgreSQL and then had slurmdbd write to it? Any problems? From memory (and confirmed by git) support for Postgres was removed

Re: [slurm-users] slurmdbd database usage

2023-08-02 Thread Michael Gutteridge
Pretty sure that dog won't hunt. There's not _just_ the tables, but I believe there's a bunch of queries and other database magic in slurmdbd that is specific to MySQL/MariaDB. - Michael On Wed, Aug 2, 2023 at 2:33 PM Sandor wrote: > I am looking to track accounting and job data. Slurm requi

[slurm-users] slurmdbd database usage

2023-08-02 Thread Sandor
I am looking to track accounting and job data. Slurm requires the use of MySQL or MariaDB. Has anyone created the needed tables within PostgreSQL and then had slurmdbd write to it? Any problems? Thank you in advance! Sandor Felho

Re: [slurm-users] [EXTERNAL] Re: Job in "priority" status - resources available

2023-08-02 Thread Greg Wickham
Following on from what Michael said, the default Slurm configuration is to allocate only one job per node. If the GRES a100_1g.10gb is on the same node, make sure to enable “SelectType=select/cons_res” (info at https://slurm.schedmd.com/cons_res.html) to permit multiple jobs to use the same node. Also

[slurm-users] SLUG - Last Chance to Register!

2023-08-02 Thread Victoria Hobson
Final Call for Standard Registration! Slurm User Group 2023 (SLUG) standard pricing ends this Friday, August 4th! Register now, before it's too late. Join us September 12th-13th at Brigham Young University in Provo, Utah. We'll kick off the week with a Welcome Reception at the Provo Marriott Hot

Re: [slurm-users] stopping job array after N failed jobs in row

2023-08-02 Thread Michael DiDomenico
On Tue, Aug 1, 2023 at 3:27 PM Daniel Letai wrote: > The other OTHER approach might be to use some epilog (or possibly > epilogslurmctld) to log exit codes for first 20 tasks in each array, and > cancel the array if non-zero. This is a global approach which will affect all > job arrays, so migh

Re: [slurm-users] Job in "priority" status - resources available

2023-08-02 Thread Michael Gutteridge
I'm not sure there's enough information in your message; Slurm version and configs are often necessary to make a more confident diagnosis. However, the behaviour you are looking for (lower priority jobs skipping the line) is called "backfill". There are docs here: https://slurm.schedmd.com/sched_co

[slurm-users] Job in "priority" status - resources available

2023-08-02 Thread Cumer Cristiano
Hello, I'm quite a newbie regarding Slurm. I recently created a small Slurm instance to manage our GPU resources. I have this situation: JOBID STATE TIME ACCOUNT PARTITION PRIORITY REASON CPU MIN_MEM TRES_PER_NODE 1739 PENDING 0:0