On 01/07/2022 15:05, Chris Samuel wrote:
> On 29/6/22 09:01, Jean-Christophe HAESSIG wrote:
>
>> No, the job is placed through the DRMAA API, which enables programs to place
>> jobs in a cluster-agnostic way. The program doesn't know it is talking to
>> Slurm. The DRMAA
On 29/06/2022 15:01, Jean-Christophe HAESSIG wrote:
Hi,
Turns out I had libslurm36_20.11.7+really20.11.4-2+deb11u1 but
slurm-wlm-basic-plugins was only at version 20.11.7+really20.11.4
because libslurm36 was installed some time after the last Slurm upgrade.
The libraries were incompatible but
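
A minimal sketch of how such a mismatch could be spotted, assuming the
installed versions have been listed as "package version" lines (for
example with dpkg-query -W -f='${Package} ${Version}\n' on a Debian
system). The function names are hypothetical and the two sample lines
only repeat the versions quoted above; this is an illustration, not the
check that was actually run.

```python
from collections import defaultdict


def parse_package_lines(text):
    """Parse 'package version' lines into a {package: version} dict."""
    versions = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:  # ignore blank or malformed lines
            versions[parts[0]] = parts[1]
    return versions


def group_by_version(versions):
    """Group package names by their exact version string."""
    groups = defaultdict(list)
    for package, version in sorted(versions.items()):
        groups[version].append(package)
    return dict(groups)


def is_consistent(versions):
    """True when every listed package carries the same version string."""
    return len(set(versions.values())) <= 1


# Sample input reusing the two versions mentioned in the message above.
SAMPLE = """\
libslurm36 20.11.7+really20.11.4-2+deb11u1
slurm-wlm-basic-plugins 20.11.7+really20.11.4
"""

if not is_consistent(parse_package_lines(SAMPLE)):
    for version, packages in group_by_version(parse_package_lines(SAMPLE)).items():
        print(version, "->", ", ".join(packages))
```

Running the same comparison on the login node and on a compute node
would show whether only one of them carries the mixed versions.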
On 28/06/2022 23:14, Chris Samuel wrote:
> On 28/6/22 12:19 pm, Jean-Christophe HAESSIG wrote:
Hi,
> I suspect this is where your error is happening:
>
> https://github.com/SchedMD/slurm/blob/1ce55318222f89fbc862ce559edfd17e911fee38/src/common/plugin.c#L284
>
>
Yes I
Hi,
I'm facing a weird issue where launching a job through drmaa
(https://github.com/natefoo/slurm-drmaa) aborts with the message "Plugin
is corrupted", but only when that job is placed from one of my compute
nodes. Running the command from the login node seems to work.
My cluster runs Slurm 2
On 09/03/2022 14:46, Loris Bennett wrote:
> Hi Jean-Christophe,
Hi,
>scontrol show runawayjobs
Thank you, I didn't know about that functionality. So, I undid the
fiddling I had done on the database and ran sacctmgr show runawayjobs.
It found the jobs and I 'fixed' them. Apparently it didn't
Hi,
I recently noticed impossible usage values returned by sreport: my
cluster was reportedly used at 100%.
Upon further investigation, I found about 6000 jobs launched on
2020-08-31 that were 'COMPLETED' but had their CPUTime still increasing,
amounting to about 500 days. The root cause for t
On Wednesday, 04 November 2020 at 21:41 +, Sebastian T Smith wrote:
> Hi,
Hi,
> We have Hyper-threading/SMT enabled on our cluster. It's challenging
> to fully utilize threads, as Brian suggests. We have a few workloads
> that benefit from it being enabled
Our cluster services tasks in the
Hi,
I would like to make good use of hyperthreaded processors and I have already
skimmed through a number of posts and documentation.
It is pretty clear that Slurm likes to allocate processing units up to
the core level, and to be able to allocate threads one has to either:
- not declare Sockets/Co