Re: [slurm-users] $TMPDIR does not honor "TmpFS"

2018-11-21 Thread Christopher Samuel
On 22/11/18 12:38 am, Douglas Duckworth wrote: We are setting TmpFS=/scratchLocal in /etc/slurm/slurm.conf on nodes and controller. However $TMPDIR value seems to be /tmp not /scratchLocal. As a result users are writing to /tmp which we do not want. Our solution to that was to use a plugin th

Re: [slurm-users] About x11 support

2018-11-21 Thread Christopher Samuel
On 22/11/18 5:04 am, Mahmood Naderan wrote: The idea is to have a job manager that find the best node for a newly submitted job. If the user has to manually ssh to a node, why one should use slurm or any other thing? You are in a really really unusual situation - in 15 years I've not come ac

Re: [slurm-users] How to check the percent cpu of a job?

2018-11-21 Thread Christopher Samuel
On 22/11/18 5:41 am, Ryan Novosielski wrote: You can see, both of the above are examples of jobs that have allocated CPU numbers that are very different from the ultimate CPU load (the first one using way more than allocated, though they’re in a cgroup so theoretically isolated from the other us

Re: [slurm-users] How to check the percent cpu of a job?

2018-11-21 Thread Ole Holm Nielsen
On 21-11-2018 19:41, Ryan Novosielski wrote: Olm’s “pestat” script does allow you to get similar information, but I’m interested to see if indeed there’s a better answer. I’ve used his script for more or less the same reason, to see if the jobs are using the resources they’re allocated. They s

Re: [slurm-users] How to check the percent cpu of a job?

2018-11-21 Thread Ole Holm Nielsen
Hi Yalei, On 21-11-2018 18:51, 宋亚磊 wrote: How to check the percent cpu of a job in slurm? I tried sacct, sstat, squeue, but I can't find how to check it. Can someone help me? I would recommend my "pestat" tool, which was also announced on the list today. The CPUload is one of the many sta

Re: [slurm-users] About x11 support

2018-11-21 Thread Tina Friedrich
I agree with you on that one - I'd forgotten about that detail. Having to actually do an 'ssh -X' before you can do 'srun --x11' is quite silly, and a bit aggravating. You can do 'ssh -X localhost' and then try the srun; that should work as well. Tina On 21/11/2018 18:04, Mahmood Naderan

Re: [slurm-users] How to check the percent cpu of a job?

2018-11-21 Thread Ryan Novosielski
Olm’s “pestat” script does allow you to get similar information, but I’m interested to see if indeed there’s a better answer. I’ve used his script for more or less the same reason, to see if the jobs are using the resources they’re allocated. They show at a node level though, and then you have t

Re: [slurm-users] How to check the percent cpu of a job?

2018-11-21 Thread 宋亚磊
Hi Jing, thank you! The following command shows us the CPU load of the node, $ scontrol show node | grep CPULoad but I want the percent CPU of the job, like top or ps. For example, if a job is allocated 10 CPUs but only uses 2, the percent CPU should be 200%, not 1000%. That is what I want to know. A
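
A minimal sketch of the per-job figure described above, assuming you are logged in on the compute node and already know the PIDs of the job's processes (how to obtain them is site-specific and not shown here). The helper name sum_pcpu is hypothetical; it only adds up the %CPU values that ps prints, so two fully busy processes would give roughly 200. Note that the %CPU reported by ps is averaged over each process's lifetime, so it is not an instantaneous reading like top.

```shell
# sum_pcpu: add up per-process %CPU values (one number per line on stdin).
sum_pcpu() {
    awk '{ total += $1 } END { printf "%.1f\n", total }'
}

# Example with placeholder PIDs 1234 and 1235 (replace with the job's PIDs):
#   ps -o pcpu= -p 1234,1235 | sum_pcpu
```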

Re: [slurm-users] About x11 support

2018-11-21 Thread Mahmood Naderan
>The 'fix' for Mahmood would be to ssh to another host and then submit >the X11 job. The idea is to have a job manager that find the best node for a newly submitted job. If the user has to manually ssh to a node, why one should use slurm or any other thing? Regards, Mahmood

Re: [slurm-users] How to check the percent cpu of a job?

2018-11-21 Thread Jing Gong
Hi, > How to check the percent cpu of a job in slurm? We use the "scontrol" command, like this: $ scontrol show node | grep CPULoad ... CPUAlloc=48 CPUErr=0 CPUTot=48 CPULoad=25.32 ... Regards, Jing From: slurm-users on behalf of 宋亚磊 Sent: Wednesday, No

[slurm-users] How to check the percent cpu of a job?

2018-11-21 Thread 宋亚磊
Hello everyone, How to check the percent cpu of a job in slurm? I tried sacct, sstat, squeue, but I can't find how to check it. Can someone help me? Best regards, Yalei

Re: [slurm-users] $TMPDIR does not honor "TmpFS"

2018-11-21 Thread Jeffrey T Frey
If you check the applicable code in src/slurmd/slurmstepd/task.c, TMPDIR is set to "/tmp" if it's not already set in the job environment and then TMPDIR is created if permissible. It's your responsibility to set TMPDIR -- e.g. we have a plugin we wrote (autotmp) to set TMPDIR to per-job and per
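
The fallback described above can be sketched in shell. This is an illustration of the stated behaviour only (keep TMPDIR if the job environment already has it, otherwise use /tmp), not the code in task.c; the function name effective_tmpdir is hypothetical.

```shell
# effective_tmpdir: print the TMPDIR a task would see under the described
# rule: an existing TMPDIR wins, otherwise the default is /tmp.
effective_tmpdir() {
    # ${VAR-default} substitutes the default only when VAR is unset.
    printf '%s\n' "${TMPDIR-/tmp}"
}
```

Under this rule TmpFS plays no part, which is consistent with jobs seeing /tmp unless something else sets TMPDIR first.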

Re: [slurm-users] $TMPDIR does not honor "TmpFS"

2018-11-21 Thread Michael Gutteridge
I don't think that's a bug. As far as I've ever known, TmpFS is only used to tell slurmd where to look for available space (reported as TmpDisk for the node). The manpage only indicates that, not any additional functionality. We set TMPDIR in a task prolog: #!/bin/bash echo "export TMPDIR=/loc/

Re: [slurm-users] $TMPDIR does not honor "TmpFS"

2018-11-21 Thread Roberts, John E.
In my experience, TmpFS in slurm.conf has not been honored since at least v16.05.10. When I initially configured Slurm, I noticed this myself. As with the user below, we are also just setting this elsewhere. Thanks! John From: slurm-users on behalf of Shenglong Wang Reply-To: Slurm User Comm

Re: [slurm-users] Excessive use of backfill on a cluster

2018-11-21 Thread Baker D . J .
Hi Chris, Our SchedulerParameters are... SchedulerParameters = bf_window=3600,bf_resolution=180,bf_max_job_user=4 I gather that the "bf_window" should be as high as the highest maximum time limit on the partitions (set at 2.5 days = 3600 minutes). Best regards, David
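
The conversion quoted above can be checked with shell arithmetic. This is only a sanity check of the "2.5 days = 3600 minutes" figure, since bf_window is given in minutes; the variable names are illustrative.

```shell
# 2.5 days expressed in minutes, using integer arithmetic (5 half-days).
half_days=5
bf_window_minutes=$(( half_days * 12 * 60 ))
echo "bf_window=${bf_window_minutes}"
```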

Re: [slurm-users] Excessive use of backfill on a cluster

2018-11-21 Thread Baker D . J .
Hi Lois, Thank you for sharing your multi priority configuration with us. I understand what you say about the QOS factor -- I've reduced it and increased the FS factor to see where that takes us. Our QOS factor is only there to ensure that test jobs gain a higher priority more quickly than other

Re: [slurm-users] $TMPDIR does not honor "TmpFS"

2018-11-21 Thread Shenglong Wang
We have TMPDIR set up inside the prolog file. We hope users do not have the absolute path /tmp inside their scripts. #!/bin/bash SLURM_BIN="/opt/slurm/bin" SLURM_job_tmp=/state/partition1/job-${SLURM_JOB_ID} mkdir -m 700 -p $SLURM_job_tmp chown $SLURM_JOB_USER $SLURM_job_tmp echo "export SLURM_JOBTMP=$SL
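
A minimal sketch of the same idea as a reusable function, assuming it is called from a task prolog, where lines of the form "export NAME=value" written to standard output are added to the task's environment. The function name make_job_tmp and its arguments are hypothetical, and this is not the script from the message above, whose remainder is cut off.

```shell
# make_job_tmp BASE JOB_ID: create a private per-job directory under BASE
# and print the export line a task prolog would emit for it.
make_job_tmp() {
    base=$1
    job_id=$2
    dir="$base/job-$job_id"
    # Mode 700 keeps the directory private to its owner.
    mkdir -m 700 -p "$dir" || return 1
    printf 'export TMPDIR=%s\n' "$dir"
}

# Hypothetical use inside a task prolog:
#   make_job_tmp /scratchLocal "$SLURM_JOB_ID"
```

If the directory is created by a prolog that runs as root instead, a chown to the job user is still needed, as in the message above.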

Re: [slurm-users] $TMPDIR does not honor "TmpFS"

2018-11-21 Thread Roger Moye
We are having the exact same problem with $TMPDIR. I wonder if a bug has crept in? I spoke to the SchedMD guys at SC18 last week and they were not aware of a bug, but since more than one person is having this difficulty, something must be wrong somewhere. -Roger From: slurm-users [mailto:sl

Re: [slurm-users] Updated Slurm tool "pestat" (Processor Element status)

2018-11-21 Thread Ryan Novosielski
Thanks Olm! I am quite fond of your utilities — thank you for providing them. Sent from my iPhone > On Nov 21, 2018, at 08:51, Ole Holm Nielsen > wrote: > > Dear Slurm users, > > The Slurm tool "pestat" (Processor Element status) has been enhanced due to a > user request. Now pestat will d

Re: [slurm-users] Slurm Accounting Question

2018-11-21 Thread Jessica Nettelblad
We save all job scripts by adding a line of scontrol write batch_script to our slurmctld.prolog. For example: test -e "$jobdir/jobscript" || timeout 15s scontrol write batch_script "${SLURM_JOBID}" "$jobdir/jobscript" Best regards, Jessica Nettelblad, UPPMAX On Wed, Nov 21, 2018 at 2:26 PM Dougla

[slurm-users] Updated Slurm tool "pestat" (Processor Element status)

2018-11-21 Thread Ole Holm Nielsen
Dear Slurm users, The Slurm tool "pestat" (Processor Element status) has been enhanced due to a user request. Now pestat will display an additional available GRES column for the nodes if the -G flag is used. This is useful if your nodes have GPUs installed. The pestat tool prints a Slurm c

[slurm-users] $TMPDIR does not honor "TmpFS"

2018-11-21 Thread Douglas Duckworth
Hi, We are setting TmpFS=/scratchLocal in /etc/slurm/slurm.conf on nodes and controller. However, the $TMPDIR value seems to be /tmp, not /scratchLocal. As a result users are writing to /tmp, which we do not want. We are not setting $TMPDIR anywhere else such as /etc/profile.d nor do users have it d

Re: [slurm-users] Slurm Accounting Question

2018-11-21 Thread Douglas Duckworth
Hi Lois, Thanks for letting us know! So out of the box there's no way to know, for example, what script a user ran in a particular job? I wish that feature existed, as we sometimes have users who contact us regarding jobs that exited weeks ago. We will look into the solution you suggest. Thanks,