I checked the source code, and now believe this is a bug in the
select/cons_res plugin (or intended behavior).
In src/plugins/select/cons_res/dist_tasks.c (tag slurm-18-08-0-1),
lines 1863~1869 trigger block allocation if SelectTypeParameters is
not set to CR_CORE / CR_SOCKET, without regard to the task distribution.
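For anyone who wants to confirm which flags their own cluster is running with, the configured value can be read back from the controller (a generic check, not taken from the original report):
$ scontrol show config | grep SelectTypeParameters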
After upgrading from Slurm 17.11.5 to 17.11.9-2 yesterday, and then to
17.11.10 today, I'm periodically seeing the message "Task %d reported
exit for a second time" on both 17.11.9-2 and 17.11.10. Sometimes
there's just a single such message, and other times there are a slew of
messages.
Any
Hi Nathan,
we followed your input, but we weren't able to succeed by only changing xauth.
The current status (we've updated to 18.08.1) is that we log in with X2go;
DISPLAY is =:50 (for example); we set it to DISPLAY=$HOSTNAME:50.0,
and after stopping firewalld, disabling SELinux and changing sshd_config a bit
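Roughly, the steps described above look like this (the display number 50 and the hostname are examples only):
$ echo $DISPLAY
:50
$ export DISPLAY=$(hostname):50.0
$ xauth list "$DISPLAY"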
Hi,
According to https://slurm.schedmd.com/cpu_management.html,
> The default allocation method within a node is cyclic allocation (allocate
> available CPUs in a round-robin fashion across the sockets within a node).
I'm not a native English speaker. I think the sentence means that if a
job allocates
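One generic way to see how the allocated CPUs actually end up distributed across sockets is to ask srun to report the binding it applied (not taken from the quoted page):
$ srun -n 4 --cpu-bind=verbose /bin/true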
So if you use the showq utility it has functionality for that:
https://github.com/fasrc/slurm_showq
Happy to have contributors to this.
-Paul Edmon-
On 10/05/2018 09:56 AM, Alexandre Strube wrote:
> Is there a way to show the actual position in the queue, given the
> current priority? It's possible to compute it, but I would like to see it
> as an ordinal…
Is there a way to show the actual position in the queue, given the current
priority? It’s possible to compute it, but I would like to see it as an
ordinal…
--
[]
Alexandre Strube
su...@ubuntu.com
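Absent showq, one rough way to get an ordinal is to sort the pending queue by priority and grep for the job of interest (the job ID 12345 is just a placeholder):
$ squeue -h --state=PENDING --sort=-p -o "%.18i %.10Q" | grep -n 12345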
We use “maintenance” reservations to prevent nodes from receiving production
jobs.
https://slurm.schedmd.com/reservations.html
Create a reservation with “flags=maint” and it will override other reservations
(if they exist).
-greg
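As an illustration (node names and times are made up), such a reservation could be created with:
$ scontrol create reservation reservationname=maint_window flags=maint users=root \
    nodes=node[001-004] starttime=2018-10-08T08:00:00 duration=04:00:00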
> On 5 Oct 2018, at 4:06 PM, Michael Di Domenico wrote:
>
>
A reservation overlapping with times you have the node in drain?
Drain and reserve:
# scontrol update nodename=node[037] state=drain reason="testing"
# scontrol create reservation users=renfro reservationname='drain_test'
nodes=node[037] starttime=2018-10-05T08:17:00 endtime=2018-10-05T09:00:00
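And, assuming the same names as above, undoing it once testing is finished would be something like:
# scontrol delete reservationname='drain_test'
# scontrol update nodename=node[037] state=resume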
You could reconfigure the partition node lists on the fly using scontrol:
$ scontrol update PartitionName=regular_part1 Nodes=<node list minus r00n00>
:
$ scontrol update PartitionName=regular_partN Nodes=<node list minus r00n00>
$ scontrol update PartitionName=maint Nodes=r00n00
Should be easy enough to write a script that finds the partitions a given
node belongs to and updates them accordingly.
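A minimal sketch of such a script, assuming the broken node is r00n00 and a 'maint' partition already exists (untested, illustration only):
NODE=r00n00
# every partition the broken node currently belongs to
for part in $(sinfo -h -N -n "$NODE" -o "%R"); do
    # expand the partition's node list and drop the broken node from it
    nodes=$(sinfo -h -p "$part" -o "%N" | paste -sd,)
    newlist=$(scontrol show hostnames "$nodes" | grep -v "^${NODE}$" | paste -sd,)
    scontrol update PartitionName="$part" Nodes="$newlist"
done
# hand the node over to the maintenance partition
scontrol update PartitionName=maint Nodes="$NODE"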
Is anyone on the list using maintenance partitions for broken nodes?
If so, how are you moving nodes between partitions?
The situation with my machines at the moment, is that we have a steady
stream of new jobs coming into the queues, but broken nodes as well.
I'd like to fix those broken nodes an
Hi,
> On 3 Oct 2018, at 16:51, Andy Georges wrote:
>
> Hi all,
>
>> On 15 Sep 2018, at 14:47, Chris Samuel wrote:
>>
>> On Thursday, 13 September 2018 3:10:19 AM AEST Paul Edmon wrote:
>>
>>> Another way would be to make all your Linux users and then map that in to
>>> Slurm using sacctmgr.
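By way of illustration, "mapping in" an existing Linux user can be as simple as creating the matching account and association with sacctmgr (cluster, account and user names are placeholders):
$ sacctmgr -i add account physics cluster=mycluster
$ sacctmgr -i add user alice account=physics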