I think that the main reason is the lack of access to some /dev "files" in
your docker container. For Singularity the nvidia plugin is required; maybe
there is something similar for docker...
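If it helps to test that theory with plain docker, you can try passing the
NVIDIA device nodes through explicitly - a rough sketch, the device list and
image are examples; the driver libraries still have to be visible inside the
container, which is what nvidia-docker / the NVIDIA container runtime takes
care of:

# expose the NVIDIA /dev nodes to the container (example devices/image)
docker run --rm \
  --device=/dev/nvidiactl \
  --device=/dev/nvidia-uvm \
  --device=/dev/nvidia0 \
  nvidia/cuda nvidia-smi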
Cheers,
Marcin
-
https://funinit.wordpress.com
On Wed, 2 Jan 2019, 05:53 허웅 wrote:
> Hi Chris.
>
>
>
> T
There is an option for that in slurm.conf, check the man page, but it's
something like FirstJobId ; )
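For reference, the slurm.conf line would be something like this (the value
is just an example):

# start numbering new jobs from 1000 (example value)
FirstJobId=1000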
Cheers,
Marcin
funinit.wordpress.com
On Mon, 24 Dec 2018, 06:13 Hanby, Mike wrote:
> Howdy,
>
>
>
> We installed a new server to take over the duties of the Slurm master. I
> imported our accounting database int
I have had a very similar issue for quite some time and I was unable to find
its root cause. Are you using sssd and AD as a data source with only a
subtree of entries searched? This is my case.
Did you disable user enumeration? That is also what I have. I didn't find any
evidence that it's related but... may
I had exactly the same requirement - you can find my notes from it here:
https://funinit.wordpress.com/2018/06/07/how-to-use-job_submit_lua-with-slurm/
cheers,
Marcin
Tue, 6 Nov 2018 at 20:48 Sam Hawarden wrote:
> Hi Yair,
>
>
> You can set maxsubmitjob=0 on an account.
>
>
> The error mes
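(For reference, applying that limit with sacctmgr looks roughly like this;
the account name is just a placeholder:)

# block submissions for an account by setting its submit limit to 0
sacctmgr modify account where name=students set MaxSubmitJobs=0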
As far as I remember, sprio does the calculation on its own when executed,
while the priority in the job structure stored by slurmctld is only updated
periodically... maybe this is the answer?
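One way to see the difference is to compare what sprio computes on the fly
with the priority value currently stored by slurmctld (the job id is just an
example):

sprio -j 12345              # recomputed when sprio runs
squeue -j 12345 -o "%i %Q"  # priority as stored by slurmctld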
cheers,
Marcin
Wed, 17 Oct 2018 at 00:42 Glen MacLachlan wrote:
>
> Hi all,
>
> I'm using slurm 17.02.8 and when I
This should be quite easy:
if job_desc.min_cpus ~= nil and job_desc.min_cpus < YOUR_NUMBER then
    job_desc.partition = "YourPartition"
end
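To have the script picked up at all, the plugin also has to be enabled in
slurm.conf (the script itself lives as job_submit.lua next to slurm.conf):

JobSubmitPlugins=lua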
Check the slurm.h definition of the job descriptor and (small self-advert,
but maybe helpful..) you can also check my blog post on job_submit/lua (
https://funinit.wordpress.
From my experience it's a question about your future sacct queries. If you
are not going to list a lot of jobs that were executed a long time ago, a VM
with a few GB of RAM should be fine. It depends on the number of jobs you
expect: a lot of small ones or a few big ones. Nevertheless, if you think
that you'll b
I spent some time debugging issues I had while working on a lua script for
job_submit_lua. I ended up with notes in the form of a blog post. Sharing for
those who may have similar issues:
https://funinit.wordpress.com/2018/06/07/how-to-use-job_submit_lua-with-slurm/
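For completeness, a minimal job_submit.lua skeleton along the lines of that
post (the "short" partition and the 4-CPU threshold are made-up examples):

-- minimal job_submit.lua sketch
function slurm_job_submit(job_desc, part_list, submit_uid)
    -- route small jobs to a hypothetical "short" partition
    if job_desc.min_cpus ~= nil and job_desc.min_cpus < 4 then
        job_desc.partition = "short"
    end
    return slurm.SUCCESS
end

function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
    return slurm.SUCCESS
end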
cheers,
Marcin
For me it is a shortcut for the description.
On Thu, 26 Apr 2018 at 15:29, Loris Bennett
wrote:
> Hi,
>
> I'm currently looking at ironing out a few crinkles in my account
> hierarchy and was looking at the attributes 'Parent' and 'Organisation'
> again. I use 'Parent' to set up the account hierarchy,
Check config.log: is pkg-config aware of the paths to your Lua shared
libraries?
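A quick way to check what pkg-config reports (the module name varies by
distribution - lua, lua5.3, lua-5.3, ...):

pkg-config --exists --print-errors lua
pkg-config --cflags --libs lua
grep -i lua config.log | head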
cheers,
Marcin
In our environment we're sending various statistics to Grafana, where we
have dashboards designed for the IT team (either to be displayed on a TV or
something we use from time to time to foresee future limitations or unused
resources), but we also have dashboards for our management to help
th
r
him.
cheers,
Marcin
2018-03-15 11:00 GMT+01:00 Marcin Stolarek :
> I'm working on a priority multifactor plugin configuration and I'm not
> sure if I'm missing something or the behaviour I see is the result of bug.
> Basically
>
> # sshare | grep XX
> X
I'm working on a priority multifactor plugin configuration and I'm not sure
if I'm missing something or whether the behaviour I see is the result of a
bug. Basically:
# sshare | grep XX
XX10.0714294367
0.031536 0.736368
which I read as fairshare factor = 0.
Check the ReturnToService parameter in slurm.conf.
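For reference (2 brings a DOWN node back automatically once slurmd registers
again with a valid configuration; pick the value that matches your policy):

# 0 = stay DOWN until an admin clears it, 1/2 = return on registration
ReturnToService=2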
On Mon, 5 Feb 2018 at 20:30, Guy - wrote:
> Hi,
> I've compiled and installed slurm on ubuntu. it works great but if I take
> a node down by running slurmd stop and start, it keeps appearing as DOWN
> (Not responding)
> The only fix is restarting slu
We're using icinga2, storing accounting data in influxdb for grafana
dashboards. In terms of monitoring I prefer end-user functionality, so
apart from services we also have a plugin that submits a job to the cluster
(to idle nodes, with a few minutes of deadline); the job simply creates
files on shared
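A canary job along those lines might look roughly like this (the time limit,
deadline and path are made-up examples):

# tiny test job that must complete within a few minutes
sbatch --deadline=now+10minutes --time=2 \
  --wrap 'touch /shared/monitoring/canary.$SLURM_JOB_ID'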
I think that it depends on your kernel and the way the cluster is booted
(for instance the initrd size). You can check the memory used by the kernel
in the dmesg output - search for the line starting with "Memory:". This is
fixed. It may also be a good idea to "reserve" some space for cache and
buffers - check h
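For example (the exact wording of that line differs between kernel
versions):

# how much memory the kernel reports as used/reserved at boot
dmesg | grep -i 'Memory:'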
If nothing has changed recently, a shared filesystem like NFS/GPFS/Lustre is
a requirement for a normal cluster configuration. You can work around it
with prolog/SPANK plugins, but honestly I haven't seen a real HPC cluster
without a shared filesystem.
cheers,
Marcin
2018-01-05 23:25 GMT+01:00 Andrew Mel
You can use the slurmctld prolog to save it the way you want.
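In slurm.conf that is the PrologSlurmctld parameter; the script path below
is just an example:

PrologSlurmctld=/etc/slurm/save_job_info.sh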
cheers,
Marcin
2017-11-30 0:25 GMT+01:00 Chris Samuel :
> On 30/11/17 8:57 am, Jacob Chappell wrote:
>
> Using "scontrol show jobid X" I can see info about running jobs, including
>> the command used to launch the job, the user's worki
I don't see any reason.
You can try the attached lines; I've also sent it to SchedMD to check if
there is any reason someone should not do that:
https://bugs.schedmd.com/show_bug.cgi?id=4496
cheers,
Marcin
2017-12-08 23:08 GMT+01:00 E.M. Dragowsky :
> Greetings --
>
> According to the documenta
I think it's more related to your configuration than to general Slurm
capabilities. For example, if you have quite long prolog/epilog scripts it
may be a good idea to discourage users from submitting huge job arrays (with
very short tasks?).
In my case it's quite common to see users submitting arrays wi
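One knob for limiting that is MaxArraySize in slurm.conf - the value below
is just an example; the highest usable array index is MaxArraySize - 1:

MaxArraySize=1001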