Re: [slurm-users] CPU allocation within a node is not cyclic

2018-10-05 Thread CUI Hao
I checked the source code, and now believe it is a bug in the select/cons_res plugin (or intended behavior). In src/plugins/select/cons_res/dist_tasks.c (tag slurm-18-08-0-1), lines 1863-1869 trigger block allocation if SelectTypeParameters is not set to CR_CORE / CR_SOCKET, without regard to task dist

[slurm-users] "Task %d reported exit for a second time" on Slurm 17.11.9-2 and 17.11.10

2018-10-05 Thread Andy Riebs
After upgrading from Slurm 17.11.5 to 17.11.9-2 yesterday, and then to 17.11.10 today, I'm periodically seeing the message "Task %d reported exit for a second time" on both 17.11.9-2 and 17.11.10. Sometimes there's just a single such message, and other times there are a slew of messages. Any

Re: [slurm-users] Slurm 18.08 X11 Errors

2018-10-05 Thread Luca Cenzato
Hi Nathan, we followed your suggestion, but changing xauth alone did not work. Current status (we've updated to 18.08.1): we log in with X2go; DISPLAY is :50 (for example), we set it to DISPLAY=$HOSTNAME:50.0, and after stopping firewalld, disabling SELinux and changing a bit sshd_conf

[slurm-users] CPU allocation within a node is not cyclic

2018-10-05 Thread CUI Hao
Hi, According to https://slurm.schedmd.com/cpu_management.html, > The default allocation method within a node is cyclic allocation (allocate > available CPUs in a round-robin fashion across the sockets within a node). I am not a native English speaker, but I think the sentence means that: if a job alloca
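The difference between the cyclic allocation quoted from the documentation and the block allocation reported in the follow-up can be illustrated with a small sketch. This is not Slurm's implementation; the function name `allocate` and the two-socket layout are assumptions made up for illustration only.

```python
from typing import List


def allocate(sockets: List[List[int]], count: int, method: str = "cyclic") -> List[int]:
    """Pick `count` CPU ids from `sockets` (one list of free CPU ids per socket).

    Hypothetical helper for illustration; it does not model Slurm internals.
    """
    total = sum(len(socket) for socket in sockets)
    if count > total:
        raise ValueError("not enough free CPUs")

    if method == "block":
        # Block: exhaust the first socket before touching the next one.
        flat = [cpu for socket in sockets for cpu in socket]
        return flat[:count]

    if method == "cyclic":
        # Cyclic: take one CPU from each socket in turn (round-robin).
        picked: List[int] = []
        depth = 0
        while len(picked) < count:
            for socket in sockets:
                if depth < len(socket) and len(picked) < count:
                    picked.append(socket[depth])
            depth += 1
        return picked

    raise ValueError(f"unknown method: {method}")


# Assumed layout: two sockets with four CPUs each.
node = [[0, 1, 2, 3], [4, 5, 6, 7]]
cyclic_pick = allocate(node, 4, "cyclic")  # alternates between the two sockets
block_pick = allocate(node, 4, "block")    # stays on the first socket
```

With this layout, a four-CPU request under the cyclic method is spread over both sockets, while the block method keeps all four CPUs on the first socket.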

Re: [slurm-users] Position in queue?

2018-10-05 Thread Paul Edmon
So if you use the showq utility it has functionality for that: https://github.com/fasrc/slurm_showq Happy to have contributors to this. -Paul Edmon- On 10/05/2018 09:56 AM, Alexandre Strube wrote: Is there a way to show the actual position in the queue, given the current priority? It’s possi

[slurm-users] Position in queue?

2018-10-05 Thread Alexandre Strube
Is there a way to show the actual position in the queue, given the current priority? It’s possible to compute it, but I would like to see it as an ordinal… -- [] Alexandre Strube su...@ubuntu.com
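As a rough sketch of the computation the question alludes to, the ordinal position can be derived by sorting pending jobs by priority. The input format (pairs of job id and priority), the function name `queue_positions`, and the tie-break on lower job id are assumptions for illustration; the result reflects priority order only and says nothing about when the scheduler will actually start a job.

```python
from typing import Dict, List, Tuple


def queue_positions(jobs: List[Tuple[int, int]]) -> Dict[int, int]:
    """Map each job id to its 1-based rank by descending priority.

    `jobs` holds hypothetical (job_id, priority) pairs; ties are broken by
    lower job id, which is an assumption rather than documented behavior.
    """
    ordered = sorted(jobs, key=lambda job: (-job[1], job[0]))
    return {job_id: rank for rank, (job_id, _) in enumerate(ordered, start=1)}


# Made-up pending jobs: (job_id, priority).
pending = [(101, 500), (102, 900), (103, 500), (104, 100)]
positions = queue_positions(pending)
```

Here job 102 has the highest priority and ranks first; jobs 101 and 103 share a priority, so the lower job id is placed ahead.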

Re: [slurm-users] maintenance partitions?

2018-10-05 Thread Greg Wickham
We use “maintenance” reservations to prevent nodes from receiving production jobs. https://slurm.schedmd.com/reservations.html Create a reservation with “flags=maint” and it will override other reservations (if they exist). -greg > On 5 Oct 2018, at 4:06 PM, Michael Di Domenico wrote: > >

Re: [slurm-users] maintenance partitions?

2018-10-05 Thread Renfro, Michael
A reservation overlapping with times you have the node in drain? Drain and reserve: # scontrol update nodename=node[037] state=drain reason="testing" # scontrol create reservation users=renfro reservationname='drain_test' nodes=node[037] starttime=2018-10-05T08:17:00 endtime=2018-10-05T09:00:00

Re: [slurm-users] maintenance partitions?

2018-10-05 Thread Jeffrey Frey
You could reconfigure the partition node lists on the fly using scontrol: $ scontrol update PartitionName=regular_part1 Nodes= : $ scontrol update PartitionName=regular_partN Nodes= $ scontrol update PartitionName=maint Nodes=r00n00 Should be easy enough to write a script that finds the parti

[slurm-users] maintenance partitions?

2018-10-05 Thread Michael Di Domenico
Is anyone on the list using maintenance partitions for broken nodes? If so, how are you moving nodes between partitions? The situation with my machines at the moment is that we have a steady stream of new jobs coming into the queues, but broken nodes as well. I'd like to fix those broken nodes an

Re: [slurm-users] Create users

2018-10-05 Thread Andy Georges
Hi, > On 3 Oct 2018, at 16:51, Andy Georges wrote: > > Hi all, > >> On 15 Sep 2018, at 14:47, Chris Samuel wrote: >> >> On Thursday, 13 September 2018 3:10:19 AM AEST Paul Edmon wrote: >> >>> Another way would be to make all your Linux users and then map that in to >>> Slurm using sacctmgr.