Actually I hit send too quickly; what I meant (assuming bash) is
for a in $(scontrol show hostname whatever_list); do touch $a; done
with whatever_list again being $SLURM_JOB_NODELIST
On Fri, Feb 14, 2025 at 1:18 PM Davide DelVento
wrote:
> Not sure I completely understand what you n
Not sure I completely understand what you need, but if I do... How about
touch whatever_prefix_$(scontrol show hostname whatever_list)
where whatever_list could be your $SLURM_JOB_NODELIST ?
On Fri, Feb 14, 2025 at 9:42 AM John Hearns via slurm-users <
slurm-users@lists.schedmd.com> wrote:
> I
I believe that, in the absence of other reasons, Slurm assigns nodes to jobs in the
order they are listed in the partition definitions of slurm.conf -- perhaps
for whatever reason node 41 appears there before node 01?
On Thu, Jan 9, 2025 at 7:24 AM Dan Healy via slurm-users <
slurm-users@lists.sc
Wonderful. Thanks Ole for the reminder! I had bookmarked your wiki (of
course!) but forgot to check it out in this case. I'll add a more prominent
reminder to self in my notes to always check it!
Happy new year everybody once again
On Tue, Jan 7, 2025 at 1:58 AM Ole Holm Nielsen via slurm-users <
Found it, I should have asked my puppet, as it's mandatory in some places
:-D
It is simply
scontrol show hostname gpu[01-02],node[03-04,12-22,27-32,36]
Sorry for the noise
On Mon, Jan 6, 2025 at 12:55 PM Davide DelVento
wrote:
> Hi all,
> I remember seeing on this list a slurm
Hi all,
I remember seeing on this list a slurm command to change a slurm-friendly
list such as
gpu[01-02],node[03-04,12-22,27-32,36]
into a bash-friendly list such as
gpu01
gpu02
node03
node04
node12
etc
I made a note about it but I can't find my note anymore, nor the relevant
message. Can some
> Diego
>
> Il 07/12/2024 10:03, Diego Zuccato via slurm-users ha scritto:
> > Ciao Davide.
> >
> > Il 06/12/2024 16:42, Davide DelVento ha scritto:
> >
> >> I find it extremely hard to understand situations like this. I wish
> >> Slurm were more
Mmmm, from https://slurm.schedmd.com/sbatch.html
> By default both standard output and standard error are directed to a file
of the name "slurm-%j.out", where the "%j" is replaced with the job
allocation number.
Perhaps at your site there's a configuration which uses separate error
files? See the
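In case it helps, explicitly splitting them in the job script looks something
like this (just a sketch, the file names are placeholders):
#SBATCH --output=myjob-%j.out
#SBATCH --error=myjob-%j.err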
Ciao Diego,
I find it extremely hard to understand situations like this. I wish Slurm
were clearer about how it reports what it is doing, but I digress...
I suspect that there are other job(s) which have higher priority than this
one which are supposed to run on that node but cannot start because
Another possible use case of this is a regular MPI job where the
first/controller task often uses more memory than the workers and may need
to be scheduled on a higher-memory node than them. I think I saw this
happening in the past, but I'm not 100% sure it was in Slurm or some other
scheduling sys
Not sure if I understand your use case, but if I do I am not sure if Slurm
provides that functionality.
If it doesn't (and if my understanding is correct), you can still achieve
your goal by:
1) removing sbatch and salloc from the users' path
2) writing your own custom scripts named sbatch (and hard/s
Slurm 18? Isn't that a bit outdated?
On Fri, Sep 27, 2024 at 9:41 AM Robert Kudyba via slurm-users <
slurm-users@lists.schedmd.com> wrote:
> We're in the process of upgrading but first we're moving to RHEL 9. My
> attempt to compile using rpmbuild -v -ta --define "_lto_cflags %{nil}"
> slurm-18.
Thanks everybody once again and especially Paul: your job_summary script
was exactly what I needed, served on a golden plate. I just had to
modify/customize the date range and change the following line (I can make a
PR if you want, but it's such a small change that it'd take more time to
deal with
Ciao Fabio,
That for sure is syntactically incorrect, because of the way sbatch parsing
works: as soon as it finds a non-empty, non-comment line (your first srun) it
will stop parsing for #SBATCH directives. So assuming this is a single file
as it looks from the formatting, the second hetjob and the cl
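To illustrate with a minimal made-up example, in a script like
#!/bin/bash
#SBATCH --job-name=first
srun hostname
#SBATCH --job-name=ignored
the second --job-name is silently ignored, because parsing of #SBATCH
directives stops at the first command (the srun).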
owing that the problem won't happen again in the future.
Thanks and have a great weekend
On Fri, Aug 23, 2024 at 8:00 AM Ole Holm Nielsen via slurm-users <
slurm-users@lists.schedmd.com> wrote:
> Hi Davide,
>
> On 8/22/24 21:30, Davide DelVento via slurm-users wrote:
> >
I am confused by the amount of Down and PLND Down time reported by sreport.
According to it, our cluster would have had a significant amount of
downtime, which I know didn't happen (or, according to the documentation
"time that slurmctld was not responding", see
https://slurm.schedmd.com/sreport.html)
Hi Ole,
On Wed, Aug 21, 2024 at 1:06 PM Ole Holm Nielsen via slurm-users <
slurm-users@lists.schedmd.com> wrote:
> The slurmacct script can actually break down statistics by partition,
> which I guess is what you're asking for? The usage of the command is:
>
Yes, this is almost what I was asking
> inside jobs to emulate a login session, causing a heavy load on your
> servers.
>
> /Ole
>
> On 8/21/24 01:13, Davide DelVento via slurm-users wrote:
> > Thanks Kevin and Simon,
> >
> > The full thing that you do is indeed overkill, however I was able to
> l
Thanks Kevin and Simon,
The full thing that you do is indeed overkill, however I was able to learn
how to collect/parse some of the information I need.
What I am still unable to get is:
- utilization by queue (or list of node names), to track actual use of
expensive resources such as GPUs, high
Since each instance of the program is independent and you are using one
core for each, it'd be better to let Slurm deal with that and schedule
them concurrently as it sees fit. Maybe you simply need to add some
directive to allow shared jobs on the same node.
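For instance, something along these lines (just a sketch; the program name,
input naming and memory request are made up) lets Slurm pack the independent
single-core runs wherever it finds room:
#!/bin/bash
#SBATCH --array=1-10
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem=2G
srun ./my_program input_${SLURM_ARRAY_TASK_ID}.dat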
Alternatively (if at your site jobs m
g text output of squeue command)
>
> cheers
>
> josef
>
> --
> *From:* Davide DelVento via slurm-users
> *Sent:* Wednesday, 14 August 2024 01:52
> *To:* Paul Edmon
> *Cc:* Reid, Andrew C.E. (Fed) ; Jeffrey T Frey <
> f...@udel.edu>; slurm-users@lists.schedm
I too would be interested in some lightweight scripts. XDMOD in my
experience has been quite labor-intensive to install, maintain and
learn. It's great if one needs that level of interactivity, granularity and
detail, but for some "quick and dirty" summary in a small dept it's not
only overkill,
How about SchedMD itself? They are the ones doing most (if not all) of the
development, and they are great.
In my experience, the best options are either SchedMD or the vendor of your
hardware.
On Mon, Aug 12, 2024 at 11:17 PM John Joseph via slurm-users <
slurm-users@lists.schedmd.com> wrote:
>
I am pretty sure this is impossible with vanilla Slurm.
What might be possible (maybe) is submitting 5-core jobs and using some
pre/post scripts which, immediately before the job starts, change the
requested number of cores to "however many are currently available on the node
where it is scheduled to run".
In part, it depends on how it's been configured, but have you tried
--exclusive?
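For example, a rough sketch assuming an OpenMP-style program (the executable
name is made up):
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --exclusive
export OMP_NUM_THREADS=$SLURM_CPUS_ON_NODE
./my_threaded_program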
On Thu, Aug 1, 2024 at 7:39 AM Henrique Almeida via slurm-users <
slurm-users@lists.schedmd.com> wrote:
> Hello, everyone, with slurm, how to allocate a whole node for a
> single multi-threaded process?
>
>
> https:
I think the best way to do it would be to schedule the 10 things as a
single Slurm job and then use one of the various MPMD approaches (the nitty
gritty details depend on whether each executable is serial, OpenMP, MPI or hybrid).
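For example, if all 10 executables were serial, a sketch (program names are
made up) could be a config file like
# mpmd.conf: task rank(s), then the program for those ranks
0 ./prog_a
1 ./prog_b
2-9 ./prog_c
and then in the job script:
srun --ntasks=10 --multi-prog mpmd.conf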
On Mon, Jul 8, 2024 at 2:20 PM Dan Healy via slurm-users <
slurm-users@lists.s
I don't really have an answer for you, just responding to make your message
pop out in the "flood" of other topics we've got since you posted.
On our cluster we configured jobs to be cancelled because it makes more sense
for our situation, so I have no experience with resuming from being
suspended
Not exactly the answer to your question (which I don't know), but if you can
manage to prefix whatever is executed with
https://github.com/NCAR/peak_memusage (which also uses getrusage), or a
variant of it, you will be able to do that.
On Thu, May 16, 2024 at 4:10 PM Emyr James via slurm-users <
slurm-us
Are you seeking something simple rather than sophisticated? If so, you can
use the controller's local disk for StateSaveLocation and place a cron job
(on the same node or somewhere else) to take that data out via e.g. rsync
and put it where you need it (NFS?) for the backup control node to use
if/whe
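Something along these lines, as a sketch (paths, host name and interval are
all made up, adjust to your setup):
# crontab entry on the primary controller
*/5 * * * * rsync -a --delete /var/spool/slurmctld/ backup-ctl:/var/spool/slurmctld/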
Hi Jason,
I wanted exactly the same and was confused exactly like you. For a while it
did not work, regardless of what I tried, but eventually (with some help) I
figured it out.
What I set up, and it is working fine, is this globally:
PreemptType = preempt/partition_prio
PreemptMode=REQUEUE
and th
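The per-partition side of such a setup typically looks something like this
(only a sketch, not necessarily my exact config; partition names, node list
and tier values are made up):
PartitionName=normal    Nodes=node[01-10] PriorityTier=100 PreemptMode=OFF
PartitionName=scavenger Nodes=node[01-10] PriorityTier=10  PreemptMode=REQUEUE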
Yes, that is what we are also doing and it works well.
Note that when requesting the batch script of another user, one sees nothing
(rather than an error message saying that one does not have permission)
On Fri, Feb 16, 2024 at 12:48 PM Paul Edmon via slurm-users <
slurm-users@lists.schedmd.com> wrote:
The simple answer is to just add a line such as
Licenses=whatever:20
and then request your users to use the -L option as described at
https://slurm.schedmd.com/licenses.html
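For example, with the line above in slurm.conf, a user would request two of
those licenses with something like (just a sketch, job.sh is a placeholder):
sbatch -L whatever:2 job.sh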
This works very well, however it does not do enforcement like Slurm does
with other resources. You will find posts in this
Hi Sylvain,
In the spirit of better late than never: is this still a problem?
If so, is this a new install or an update?
What environment/compiler are you using? The error
undefined reference to `__nv_init_env'
seems to indicate that you are doing something CUDA-related, which I think
you should not
If you would like the high watermark memory utilization after the job
completes, https://github.com/NCAR/peak_memusage is a great tool. Of course
it has the limitation that you need to know that you want that information
*before* starting the job, which might or might not be a problem for your use
cas
I think it would be useful, yes, and mostly for the epilog script.
In the job script itself, you are creating such files, so some of the
proposed use cases are a bit tricky to get right in the way you described
them. For example, if you scp these files, you are scp'ing them to their
status before
ight try setting that default of PreemptMode=CANCEL and then set
> specific PreemptModes for all your partitions. That's what we do and it
> works for us.
>
> -Paul Edmon-
> On 1/12/2024 10:33 AM, Davide DelVento wrote:
>
> Thanks Paul,
>
> I don't understand
; work, I don't see anything in the documentation that indicates it
> wouldn't. So I suspect you have a typo somewhere in your conf.
>
> -Paul Edmon-
> On 1/11/2024 6:01 PM, Davide DelVento wrote:
>
> I would like to add a preemptable queue to our cluster. Actually I already
I would like to add a preemptable queue to our cluster. Actually I already
have. We simply want jobs submitted to that queue to be preempted if there are
no resources available for jobs in other (high priority) queues.
Conceptually very simple, no conditionals, no choices, just what I wrote.
However i
Not an answer to your question, but if the jobs need to be subdivided, why
not submit smaller jobs?
Also, this does not sound like a slurm problem, but rather a code or
infrastructure issue.
Finally, are you typically able to ssh into the main node of each subtask?
In many places that is not allo
the other thing was true: I had
two lines, one specifying job_script and the other job_comment, and only the
last one was honored, until I noticed and consolidated them into one line,
comma-separating the arguments...
On Mon, Dec 11, 2023 at 9:52 AM Davide DelVento
wrote:
> Forgot to mention: this
uler’s decisions on a pending job by running “qstat
> -j jobid”. But there doesn’t seem to be any functional equivalent with
> SLURM?
>
>
>
> Regards,
>
> Mike
>
>
>
>
>
> *From:* slurm-users *On Behalf Of
> *Davide DelVento
> *Sent:* Monday, Dec
Forgot to mention: this is with slurm 23.02.6 (apologies for the double
message)
On Mon, Dec 11, 2023 at 9:49 AM Davide DelVento
wrote:
> Following the example from https://slurm.schedmd.com/power_save.html
> regarding SuspendExcNodes
>
> I configured my slurm.conf with
>
> Su
In case it's useful to others: I've been able to get this working by having
the "no action" script stop the slurmd daemon and start it *with the -b
option*.
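For reference, a very rough sketch of what such a script could look like
(assuming the resume program runs on the controller and can ssh to the nodes;
the slurmd path is a guess):
#!/bin/bash
# "no action" ResumeProgram: restart slurmd with -b on each node so it
# reports itself as freshly booted instead of being actually power cycled
for host in $(scontrol show hostname "$1"); do
    ssh "$host" "systemctl stop slurmd && /usr/sbin/slurmd -b"
done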
On Fri, Oct 6, 2023 at 4:28 AM Ole Holm Nielsen
wrote:
> Hi Davide,
>
> On 10/5/23 15:28, Davide DelVento wro
Following the example from https://slurm.schedmd.com/power_save.html
regarding SuspendExcNodes
I configured my slurm.conf with
SuspendExcNodes=node[01-12]:2,node[13-32]:2,node[33-34]:1,nodegpu[01-02]:1
SuspendExcStates=down,drain,fail,maint,not_responding,reserved
#SuspendExcParts=
(the nodes in
By getting "stuck" do you mean the job stays PENDING forever, or does it
eventually run? I've seen the latter (and I agree with you that I wish
Slurm would log things like "I looked at this job and I am not starting it
yet because") but not the former.
On Fri, Dec 8, 2023 at 9:00 AM Pacey, Mike wro
A little late here, but yes, everything Hans said is correct, and if you are
worried about Slurm (or other critical system software) getting killed by the
OOM killer, you can work around it by properly configuring cgroups.
On Wed, Dec 6, 2023 at 2:06 AM Hans van Schoot wrote:
> Hi Joseph,
>
> This might depend
t down while I am looking at it.
>
> Although, I do agree, the functionality of being able to have "keep at
> least X nodes up and idle" would be nice, that is not how I see this
> documented or working.
>
> Brian Andrus
> On 11/23/2023 5:12 AM, Davide DelVento wrote
ave at least X nodes up",
> which includes running jobs. So it stops any wait time for the first X jobs
> being submitted, but any jobs after that will need to wait for the power_up
> sequence.
>
> Brian Andrus
> On 11/22/2023 6:58 AM, Davide DelVento wrote:
>
>
I assume you mean the sentence about dynamic MIG at
https://slurm.schedmd.com/gres.html#MIG_Management
Could it be supported? I think so, but only if one of their paying
customers (that could be you) asks for it.
On Wed, Nov 22, 2023 at 11:24 AM Aaron Kollmann <
aaron.kollm...@student.hpi.de> wrot
I've started playing with powersave and have a question about
SuspendExcNodes. The documentation at
https://slurm.schedmd.com/power_save.html says
For example nid[10-20]:4 will prevent 4 usable nodes (i.e IDLE and not
DOWN, DRAINING or already powered down) in the set nid[10-20] from being
powered
Not sure if that's what you are looking for, Joseph, but I believe
ClusterVisor and Bright do provide some basic Slurm management as a web GUI.
I don't think either is available outside of the support for hardware purchased
from the respective vendors.
See e.g. https://www.advancedclustering.com/products
I don't have an answer for you, but I found your message in my spam folder.
I brought it out and I'm replying to it in the hope that it gets some
visibility in people's mailboxes.
Note that in the US it's SC week and many people are or have been busy with
it and will be travelling in the next days
>
> Having a large number of researchers able to run arbitrary code on the
> same submit host has a marked tendency to result in an overloaded host.
> There are various ways to regulate that ranging from "constant scolding" to
> "aggressive quotas/cgroups/etc", but all involve some degree of
> inco
Not a direct answer to your question, but have you looked at Open OnDemand?
Or maybe JupyterHub?
I think most places today prefer to do either of those which provide
somewhat the functionality you asked - and much more.
On Thu, Nov 9, 2023 at 4:17 PM Chip Seraphine
wrote:
> Hello,
>
> Our users
Not sure if it's the largest, but LUMI is a very large one
https://www.top500.org/system/180048/
https://docs.lumi-supercomputer.eu/runjobs/scheduled-jobs/partitions/
On Sun, Oct 29, 2023 at 4:16 AM John Joseph wrote:
> Dear All,
> Like to know that what is the maximum scalled up instance of SL
>
> I am working on SLURM 23.11 version.
>
???
Latest version is slurm-23.02.6; which one are you referring to?
https://github.com/SchedMD/slurm/tags
>
"-g" in slurm/configure file, and I
> wonder if I should add the "-g" to some other locations?
>
> Regards
>
> On Sat, Oct 21, 2023 at 12:47 AM Davide DelVento
> wrote:
>
>> Have you compiled slurm yourself or have you installed binaries? If the
Have you compiled slurm yourself or have you installed binaries? If the
latter, I speculate this is not possible, in that it would not have been
compiled with the required symbols (above all "-g" but probably others
depending on your platform).
If you compiled slurm yourself, and assuming you have
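If building from source, the usual autoconf-style way to keep the symbols is
something like (a generic sketch, not Slurm-specific; the prefix is made up):
./configure CFLAGS="-g -O0" --prefix=/opt/slurm
make -j && make install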
I'd be interested in this too, and I'm reposting only because the message
was flagged as both "dangerous email" and "spam", so people may not have
seen it (hopefully my reply will not suffer the same downfall...)
On Mon, Oct 16, 2023 at 3:26 AM Taras Shapovalov
wrote:
> Hello,
>
> In the past it
I don't think there is such a guarantee; in fact, my reading of
https://slurm.schedmd.com/power_save.html#images is that most likely the
nodes can and will be mingled together, and your script should untangle that.
But as you probably guessed from my other message, I'm new to powersave in
slur
Hi Ole,
Thanks for getting back to me.
> the great presentation
> > from our own
> I presented that talk at SLUG'23 :-)
>
Yes! That's why I wrote "from our own", but perhaps that's local slang from
where I live (and English is my second language)
> > 1) I'm not sure I fully understand ReconfigF
ct 4, 2023 at 7:47 PM Davide DelVento
wrote:
> And weirdly enough it has now stopped working again, after I did the
> experimentation for power save described in the other thread.
> That is really strange. At the highest verbosity level the logs just say
>
> slurmdbd: deb
:192.168.2.254 CONN:13
I reconfigured and reverted things, with no change. Does anybody have any clue?
On Tue, Oct 3, 2023 at 5:43 PM Davide DelVento
wrote:
> For others potentially seeing this on mailing list search, yes, I needed
> that, which of course required creating an account charge which I
I'm experimenting with slurm powersave and I have several questions. I'm
following the guidance from https://slurm.schedmd.com/power_save.html and
the great presentation from our own
https://slurm.schedmd.com/SLUG23/DTU-SLUG23.pdf
I am running slurm 23.02.3
1) I'm not sure I fully understand Reco
th active
> users.
>
> -Paul Edmon-
> On 10/3/23 9:01 AM, Davide DelVento wrote:
>
> By increasing the slurmdbd verbosity level, I got additional information,
> namely the following:
>
> slurmdbd: error: couldn't get information for this user (null)(x
Thanks!
On Mon, Oct 2, 2023 at 9:20 AM Davide DelVento
wrote:
> Thanks Paul, this helps.
>
> I don't have any PrivateData line in either config file. According to the
> docs, "By default, all information is visible to all users" so this should
> not be an issue. I tried
ut that didn't change the behavior.
On Mon, Oct 2, 2023 at 9:10 AM Paul Edmon wrote:
> At least in our setup, users can see their own scripts by doing sacct -B
> -j JOBID
>
> I would make sure that the scripts are being stored and how you have
> PrivateData set.
>
> -P
ssion" setting. FWIW, we use LDAP.
Is that the expected behavior, in that by default only root can see the job
scripts? I was assuming the users themselves should be able to debug their
own jobs... Any hint on what could be changed to achieve this?
Thanks!
On Fri, Sep 29, 2023 at 5:48
I don't really have an answer for you other than a "hallway comment": it
sounds like a good thing, which I would test with a simulator if I had
one. I've been intrigued by (but have not really looked much into)
https://slurm.schedmd.com/SLUG23/LANL-Batsim-SLUG23.pdf
On Fri, Sep 29, 2023 at 10:05 A
r each user in each location.
>
> Also it should be noted that there is no way to prune out job_scripts or
> job_envs right now. So the only way to get rid of them if they get large is
> to 0 out the column in the table. You can ask SchedMD for the mysql command
> to do this a
In my current slurm installation (recently upgraded to slurm v23.02.3), I
only have
AccountingStoreFlags=job_comment
I now intend to add both
AccountingStoreFlags=job_script
AccountingStoreFlags=job_env
leaving the default 4MB value for max_script_size
Do I need to do anything on the DB mysel
Has anyone else noticed this issue and knows more about it?
https://bugs.schedmd.com/show_bug.cgi?id=16976
Mitigation by preventing users from submitting many jobs works, but only to a
point.
I run a cluster we bought from ACT and recently updated to ClusterVisor v1.0
The new version has (among many things) a really nice view of individual
jobs' resource utilization (GPUs, memory, CPU, temperature, etc). I did not
pay attention to the overall statistics, so I am not sure how CV fares
th
Actually rm -r does not give ANY warning, so in plain Linux "rm -r /" run
as root would destroy your system without notice. Your particular Linux
distro may have implemented safeguards with a shell alias such as `alias
rm='rm -i'` and that's a common thing, but not guaranteed to be there
On Thu, J
Can you ssh into the node and check the actual availability of memory?
Maybe there is a zombie process (or a healthy one with a memory leak bug)
that's hogging all the memory?
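For example (nothing Slurm-specific, just a sketch):
ssh nodename
free -g                        # overall memory availability
ps aux --sort=-rss | head      # biggest memory consumers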
On Thu, May 25, 2023 at 7:31 AM Roger Mason wrote:
> Hello,
>
> Doug Meyer writes:
>
> > Could also review the node log
At a place I worked before, we used XDMOD several years ago. It was a bit
tricky to set up correctly and not exactly intuitive to get started with
data collection as a user (managers, allocation specialists and
other not-super-technical people were most of our users). But when
familiarized with it,
Ciao Matteo,
If you look through the archives, you will see I struggled with this
problem too. A few people suggested some alternatives, but in the end
I did not find anything really satisfying which did not require a ton
of work for me.
Another piece of the story is users requesting a license bu
> > And if you are seeing a workflow management system causing trouble on
> > your system, probably the most sustainable way of getting this resolved
> > is to file issues or pull requests with the respective project, with
> > suggestions like the ones you made. For snakemake, a second good point
>
> Cheers,
> Florian
>
> From: slurm-users on behalf of Davide
> DelVento
> Sent: Thursday, 16 February 2023 01:40
> To: Slurm User Community List
> Subject: [External] Re: [slurm-users] actual time of start (or finish) of a
> job
>
to the web version:
> https://slurm.schedmd.com/sacct.html.
>
> Best,
>
> Joseph
>
> --
> Joseph F. Guzman - ITS (Advanced Research Computing)
>
> Northern Arizona University
>
> joseph.f.guz...@nau.edu
>
I have a user who needs to find the actual start (or finish) time of a
number of jobs.
With the elapsed field of sacct, start or finish become equivalent for
his search.
I see that information in /var/log/slurm/slurmctld.log so Slurm should
have it, however in sacct itself that information does not
Try first with small things like shell scripts you write which would
tell you where the thing is running (e.g. by using hostname). Keep in
mind that what happens will depend most of all on the shell.
For example, if you use "sudo" you know that using wildcards is
tricky, because your user
It would be very useful if there were a way (perhaps a custom script
parsing the sacct output) to provide the information in the same
format as "scontrol show job"
Has anybody attempted to do that?
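Something along these lines might be a starting point (a very rough sketch;
the field list and layout are arbitrary):
#!/bin/bash
# print a job's accounting info in a loosely scontrol-like layout
sacct -X -j "$1" --parsable2 --format=JobID,JobName,Partition,State,Start,End,Elapsed |
  awk -F'|' 'NR>1 {printf "JobId=%s JobName=%s\n   Partition=%s JobState=%s\n   StartTime=%s EndTime=%s RunTime=%s\n", $1, $2, $3, $4, $5, $6, $7}'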
On Wed, Dec 14, 2022 at 1:25 AM Will Furnass wrote:
>
> If you pipe output into 'less -S' then yo
Thanks for helping me find workarounds.
> My only other thought is that you might be able to use node features &
> job constraints to communicate this without the user realising.
I am not sure I understand this approach.
> For instance you could declare the nodes where the software is installed
Hi Chris,
> Unfortunately it looks like the license request information doesn't get
> propagated into any prologs from what I see from a scan of the
> documentation. :-(
Thanks. If I am reading you right, I did notice the same thing and in
fact that's why I wrote that job_submit lua script which
ich user will execute the scripts
>
> https://slurm.schedmd.com/prolog_epilog.html
>
> Maybe the variable isn't set for the user executing the
> prolog/epilog/taskprolog
>
> Jeff
>
> ____
> From: slurm-users on behalf of Davide
>
My problem: grant licensed software availability to my users only if
they request it on slurm; for now with local licenses.
I wrote a job_submit lua script which checks job_desc.licenses and if
it contains the appropriate strings it sets an appropriate
SOMETHING_LICENSE_REQ environment variable.
as I will
describe in another thread.
On Sun, Sep 18, 2022 at 11:57 PM Bjørn-Helge Mevik
wrote:
>
> Davide DelVento writes:
>
> >> I'm curious: What kind of disruption did it cause for your production
> >> jobs?
> >
> > All jobs failed and went in pendin
It's possible that dialing up the verbosity on the
> slurmd logs may give that info but I haven't seen it in normal operating.
>
> -Paul Edmon-
>
> On 10/6/22 5:47 PM, Davide DelVento wrote:
> > Is there a simple way to check that whas slurm is running is what the
Is there a simple way to check that what Slurm is running is what the
config says it should be?
For example, my understanding is that changing cgroup.conf should be
followed by 'systemctl stop slurmd' on all compute nodes, then
'systemctl restart slurmctld' on the head node, then 'systemctl start
slurmd' on the compute nodes again.
Perhaps just a very trivial question, but it doesn't look like you
mentioned it: does your X-forwarding work from the login node? Maybe
the X-server on your client is the problem and trying xclock on the
login node would clarify that
On Wed, Oct 5, 2022 at 12:03 PM Allan Streib wrote:
>
> Hi everyone,
At my previous job there were cron jobs running everywhere measuring
possibly idle cores which were eventually averaged out for the
duration of the job, and reported (the day after) via email to the
user support team.
I believe they stopped doing so when compute became (relatively) cheap
at the exp
Are your licenses used only for the slurm cluster(s) or are they
shared with laptops, workstations and/or other computing equipment not
managed by slurm?
In the former case, the "local" licenses described in the
documentation will do the trick (but slurm does not automatically
enforce their use, so
scales well, but it looks like you have a rather beginner cluster that
> would never be impacted by such choices.
>
> Brian Andrus
>
>
> On 9/16/2022 10:00 AM, Davide DelVento wrote:
> > Thanks Brian.
> >
> > I am still perplexed. What is a database to install, a
ironment. The 2nd step would be dependent on what things you are
> tracking within that.
>
> Brian Andrus
>
>
> On 9/16/2022 5:01 AM, Davide DelVento wrote:
> > So if I understand correctly, this "remote database" is something that
> > is neither part of slu
Thanks a lot.
> > Does it need the execution permission? For root alone sufficient?
>
> slurmd runs as root, so it only need exec perms for root.
Perfect. That must have been it, then, since my script (like the example
one) did not have the execution permission set.
> I'm curious: What kind of disrup
Thanks to both of you.
> Permissions on the file itself (and the directories in the path to it)
Does it need the execution permission? For root alone sufficient?
> Existence of the script on the nodes (prologue is run on the nodes, not the
> head)
Yes, it's in a shared filesystem.
> Not sure
a
> certain number are allowed by each cluster and change that if needed.
>
> If you got creative, you could keep the license count that is in the
> database updated to match the number free from flexlm to stop license
> starvation due to users outside slurm using them up so they
I am a bit confused by remote licenses.
https://lists.schedmd.com/pipermail/slurm-users/2020-September/006049.html
(which is only 2 years old) claims that they are just a counter, so
like local licenses. Then why call them remote?
Only a few days after, this
https://lists.schedmd.com/pipermail/sl