Re: [slurm-users] Python and R installation in a SLURM cluster

2018-05-10 Thread Eric F. Alemany
Hi Lachlan, Thank you for sharing your environment. Everyone has their own set of rules and I appreciate everyone’s input. It seems as if the NFS share is a great place to start. Best, Eric _ Eri

Re: [slurm-users] Python and R installation in a SLURM cluster

2018-05-10 Thread Eric F. Alemany
Thank you Thomas for your suggestion. I have taken note of everyone’s comments and hope to come up with a good solution. _ Eric F. Alemany System Administrator for Research Division of Radiation & C

[slurm-users] impact of changing SelectTypeParameters?

2018-05-10 Thread Liam Forbes
Might there be any "adverse" impacts to changing SelectTypeParameters=CR_CPU to SelectTypeParameters=CR_Socket_Memory? Our compute nodes are, for the most part, exclusive so users are allocated entire nodes, no sharing, for jobs. We explicitly specify CPUs, Sockets, CoresPerSocket, & ThreadsPerCore

Re: [slurm-users] Python and R installation in a SLURM cluster

2018-05-10 Thread Lachlan Musicman
On 11 May 2018 at 01:35, Eric F. Alemany wrote: > Hi All, > > I know this might sound like a very basic question: where in the cluster > should I install Python and R? > Headnode? > Execute nodes? > > And is there a particular directory (path) where I need to install Python and R? > > Background: > SLU

[slurm-users] How to check if there's a reservation

2018-05-10 Thread Prentice Bisbal
Dear Slurm Users, We've started using maintenance reservations. As you would expect, this caused some confusion for users who were wondering why their jobs were queuing up and not running. Some of my users provide a public service of sorts that automatically submits jobs to our cluster. They w

Re: [slurm-users] How to access environment variables in submit script?

2018-05-10 Thread Dmitri Chebotarov
Thank you all for the suggestions... I will take a closer look at the 'exec sbatch...' script. Running 'sbatch --output.. --error=..' works as well so far. On 5/10/18, 11:14, "slurm-users on behalf of Michael Jennings" wrote: On Thursday, 10 May 2018, at 10:09:22 (-0400), Paul Edmon wrote:

Re: [slurm-users] Multiple accounts/partitions for a user

2018-05-10 Thread Mahmood Naderan
Yes thank you very much. Regards, Mahmood On Thu, May 10, 2018 at 7:42 PM, Simon Flood wrote: > On 10/05/18 15:59, Mahmood Naderan wrote: > > Yes it's possible for a user to be attached to more than one account or to > an account with more than one partition. > > As per the error message you hav

[slurm-users] Historical License Usage by Jobs

2018-05-10 Thread Barry Moore
Hello All, Is it possible to track all jobs which requested a specific license? I am using Slurm 16.05.6. I looked through `sacct ... --format=all`, but maybe I am missing something. Thanks, Barry -- Barry E Moore II, PhD E-mail: bmoor...@pitt.edu Assistant Research Professor Center for Resea

[slurm-users] --uid , --gid option is root only now :'(

2018-05-10 Thread Christopher Benjamin Coffey
Hi, We noticed that the --uid and --gid functionality recently changed: previously, a user in the slurm administrators group could launch jobs successfully with --uid and --gid, allowing them to submit jobs as another user. Now, in order to use --uid, --gid, you have to be the root user

Re: [slurm-users] Understanding gres binding

2018-05-10 Thread Kilian Cavalotti
Hi Paul, I'd first suggest upgrading to 17.11.6; I think the first couple of 17.11.x releases had some issues in terms of GRES binding. Then, I believe you also need to request all of your cores to be allocated on the same socket, if that's what you want. Something like --ntasks-per-socket=16. Her

Re: [slurm-users] [EXTERNAL]: Array Job Node Allocation

2018-05-10 Thread Hwa, George
This is exactly what we want to do: spreading array jobs across nodes. The primary motivation for us is to achieve load-balancing. -Original Message- From: slurm-users [mailto:slurm-users-boun...@lists.schedmd.com] On Behalf Of Emyr James Sent: Tuesday, March 20, 2018 11:54 PM To: slurm-us

Re: [slurm-users] Python and R installation in a SLURM cluster

2018-05-10 Thread Thomas M. Payerle
Assuming you plan for users to use R in jobs, it will need to be accessible to the execute/compute nodes. I would usually suggest on a shared drive. Although it should be OK if locally installed on each compute node (probably want at same exact path and with same R packages installed). Presumabl

Re: [slurm-users] Built in X11 forwarding in 17.11 won't work on local displays

2018-05-10 Thread Nathan Harper
In our case we are using Nomachine rather than x2go, but the same should apply: I connect to a login node (eg loginnode01) via Nomachine and fire up a KDE workspace, open a terminal and ssh -X loginnode01 (so SSH back into the same host), then srun --x11 etc On Thu, 10 May 2018 at 17:19, Patrick

Re: [slurm-users] Built in X11 forwarding in 17.11 won't work on local displays

2018-05-10 Thread Patrick Goetz
On 05/09/2018 04:14 PM, Nathan Harper wrote: Yep, exactly the same issue. Our dirty workaround is to ssh -X back into the same host and it will work. Hi - Since I'm having this problem too, can you elaborate? You're ssh -X ing into a machine and then ssh -X ing back to the original host?

Re: [slurm-users] Python and R installation in a SLURM cluster

2018-05-10 Thread Eric F. Alemany
Thank you Simon for your quick reply. I liked the "(N)either" touch - makes sense. _ Eric F. Alemany System Administrator for Research Division of Radiation & Cancer Biology Department of Radia

Re: [slurm-users] Python and R installation in a SLURM cluster

2018-05-10 Thread Raymond Wan
Hi Eric, On 10/05/18 23:35, Eric F. Alemany wrote: > I know this might sound like a very basic question: where in > the cluster should I install Python and R? > Headnode? > Execute nodes? I don't think there is a fixed rule for a question like this and it depends on the compromise between what

Re: [slurm-users] Python and R installation in a SLURM cluster

2018-05-10 Thread Simon Flood
On 10/05/18 16:35, Eric F. Alemany wrote: I know this might sound like a very basic question: where in the cluster should I install Python and R? Headnode? Execute nodes? And is there a particular directory (path) where I need to install Python and R? Background: SLURM on Ubuntu 18.04 1 headnode 4

Re: [slurm-users] Slurm Installation on different Unix environment

2018-05-10 Thread Paul Edmon
We build from source.  The build dependencies would be munge.  We just use the provided slurm.spec file (since we are building for CentOS).  We then distribute the rpms. You can build slurm from autotools as well which permits a more generic build and which you can then repackage into other pa

[slurm-users] Python and R installation in a SLURM cluster

2018-05-10 Thread Eric F. Alemany
Hi All, I know this might sound like a very basic question: where in the cluster should I install Python and R? Headnode? Execute nodes? And is there a particular directory (path) where I need to install Python and R? Background: SLURM on Ubuntu 18.04 1 headnode 4 execute nodes NFS shared drive among

Re: [slurm-users] Multiple accounts/partitions for a user

2018-05-10 Thread Simon Flood
On 10/05/18 15:59, Mahmood Naderan wrote: Is it possible to assign a user to two partitions/accounts? I did that. But the sbatch isn't able to submit the job. [mahmood@rocks7 ~]$ cat slurm.sh #!/bin/bash #SBATCH --output=test.out #SBATCH --job-name=test #SBATCH --ntasks=6 #SBATCH --partition=PL

Re: [slurm-users] How to access environment variables in submit script?

2018-05-10 Thread Michael Jennings
On Thursday, 10 May 2018, at 10:09:22 (-0400), Paul Edmon wrote: > Not that I am aware of.  Since the header isn't really part of the > script bash doesn't evaluate them as far as I know. > > On 05/10/2018 09:19 AM, Dmitri Chebotarov wrote: > > > >Is it possible to access environment variables in

Re: [slurm-users] Slurm source installation

2018-05-10 Thread Ole Holm Nielsen
On 10-05-2018 16:56, Valeriana wrote: Hi Ole! Thanks for your help. I already checked this installation, but it didn't help me much. I am not using rpm, I am installing directly from the source code (configure, make and make install process). My question is: do I need these plugins on the computat

Re: [slurm-users] How to access environment variables in submit script?

2018-05-10 Thread Thomas M. Payerle
I don't believe that is possible. The #SBATCH lines are comments to the shell, so it does not do any variable expansion there. To my knowledge, Slurm does not do any variable expansion in the parameters either. If you really needed that sort of functionality, you would probably need to have someth
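The limitation described in this thread can be worked around from the submitting side. Below is a minimal sketch, assuming the variable is expanded by the shell that calls sbatch and the result is passed as command-line options, which take precedence over `#SBATCH` lines in the script. The `--output` and `--error` options are the ones mentioned in the thread; the helper name `build_sbatch_cmd`, the script name `job.sh`, and the path `/scratch/demo` are hypothetical and used only for illustration. The sketch prints the command rather than submitting it.

```shell
#!/bin/bash
# Sketch only: "#SBATCH" lines are comments to the shell, so $SCRATCH is never
# expanded there. Expanding it here, before sbatch runs, avoids the problem.

# build_sbatch_cmd is a hypothetical helper; $1 is the batch script to submit.
build_sbatch_cmd() {
    local script="$1"
    # Abort with a message if SCRATCH is unset or empty.
    local logdir="${SCRATCH:?SCRATCH must be set}/slurm"
    # %j stays literal so that Slurm itself substitutes the job ID.
    printf 'sbatch --output=%s/%%j.out --error=%s/%%j.err %s\n' \
        "$logdir" "$logdir" "$script"
}

# Dry run with a made-up scratch path: print the command, do not submit.
SCRATCH=/scratch/demo build_sbatch_cmd job.sh
```

To actually submit, the printed command would be run from the same shell. Paths containing spaces would need extra quoting, which this sketch does not handle.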

[slurm-users] Multiple accounts/partitions for a user

2018-05-10 Thread Mahmood Naderan
Hi Is it possible to assign a user to two partitions/accounts? I did that. But the sbatch isn't able to submit the job. [mahmood@rocks7 ~]$ cat slurm.sh #!/bin/bash #SBATCH --output=test.out #SBATCH --job-name=test #SBATCH --ntasks=6 #SBATCH --partition=PLAN1 #SBATCH --mem=8G mpirun /share/apps/me

Re: [slurm-users] Memory oversubscription and sheduling

2018-05-10 Thread Michael Jennings
On Thursday, 10 May 2018, at 20:02:37 (+1000), Chris Samuel wrote: > For instance there's the LBNL Node Health Check (NHC) system that plugs into > both Slurm and Torque. > > https://slurm.schedmd.com/SUG14/node_health_check.pdf > > https://github.com/mej/nhc > > At ${JOB-1} we would run our i

Re: [slurm-users] Slurm source installation

2018-05-10 Thread Valeriana
Hi Ole! Thanks for your help. I already checked this installation, but it didn't help me much. I am not using rpm, I am installing directly from the source code (configure, make and make install process). My question is: do I need these plugins on the computational nodes? Thanks in advance, Valeri

Re: [slurm-users] Slurm Installation on different Unix environment

2018-05-10 Thread agostino bruno
Thanks Paul, Could you be more specific about building slurm and its dependencies? Do you mean building slurm from source? If you have a standard procedure to do that, can you share it with me? In the end, the odroid has Ubuntu installed on it. Thank you very much in advance, Agostino > On 10 May

Re: [slurm-users] How to access environment variables in submit script?

2018-05-10 Thread Paul Edmon
Not that I am aware of. Since the header isn't really part of the script, bash doesn't evaluate them as far as I know. -Paul Edmon- On 05/10/2018 09:19 AM, Dmitri Chebotarov wrote: Hello Is it possible to access environment variables in a submit script? For example, $SCRATCH is set to a path and I l

Re: [slurm-users] Slurm Installation on different Unix environment

2018-05-10 Thread Paul Edmon
Assuming you can build slurm and its dependencies this should work.  We've run slurm here with different OS's on various nodes for a while and it works fine.  That said I haven't tried odroids so I can't speak specifically to that. -Paul Edmon- On 05/10/2018 08:26 AM, agostino bruno wrote:

[slurm-users] How to access environment variables in submit script?

2018-05-10 Thread Dmitri Chebotarov
Hello Is it possible to access environment variables in a submit script? For example, $SCRATCH is set to a path and I would like to use the $SCRATCH variable in #SBATCH: #SBATCH --output=$SCRATCH/slurm/%j.out #SBATCH --error=$SCRATCH/slurm/%j.err Since it's a Bash script, # lines are ignored and I suspect these variables

[slurm-users] Slurm Installation on different Unix environment

2018-05-10 Thread agostino bruno
Dear All, I am writing to you since I would like to have some information about the possibility of having a single slurm server (CentOS) controlling clients with CentOS and Ubuntu installed. I already have a small cluster with 3 nodes controlled by one server. On all the systems the OS is Cen

Re: [slurm-users] Slurm source installation

2018-05-10 Thread Ole Holm Nielsen
On 10-05-2018 13:39, Valeriana wrote: Good Morning, I'm new to SLURM. I just installed the slurm-17.11.5.tar.bz2 source on a Master server (CentOS 7 17.08) with the following plugins: DMTCP, Padb, Hostlist, Interactive Script, mpich, openmpi, Node Health Check, PEStat, HDF5, pam_slurm, PMIx and sqlog.

[slurm-users] Slurm source installation

2018-05-10 Thread Valeriana
Good Morning, I'm new to SLURM. I just installed the slurm-17.11.5.tar.bz2 source on a Master server (CentOS 7 17.08) with the following plugins: DMTCP, Padb, Hostlist, Interactive Script, mpich, openmpi, Node Health Check, PEStat, HDF5, pam_slurm, PMIx and sqlog. Munge is installed on a server and nod

[slurm-users] Preempting not working for GPU nodes

2018-05-10 Thread Zheng Gong
Hi everyone, We have a heterogeneous cluster. Some of the nodes have two NVIDIA GPU cards. slurm.conf looks like: NodeName=compute-0-[0-6] CPUs=32 NodeName=compute-1-[0-5] CPUs=16 Gres=gpu:2 Weight=100 PartitionName=hipri DefaultTime=00-1 MaxTime=00-1 MaxNodes=1 PriorityTier=9 Node

Re: [slurm-users] srun --reboot in sbatch

2018-05-10 Thread Chris Samuel
On Monday, 7 May 2018 11:42:03 PM AEST Tueur Volvo wrote: > why ? can i have srun --reboot in sbatch file ? It doesn't make sense to reboot the node part way through running your job, you're just going to kill the running job. Instead add this near the top of your batch script: #SBATCH --reboo

Re: [slurm-users] slurm reboot node with spank plugin

2018-05-10 Thread Chris Samuel
On Wednesday, 9 May 2018 10:17:17 PM AEST Tueur Volvo wrote: > I currently use a node feature plugin like knl > but I don't like to use node features because I must write "feature" in > the slurm.conf file Actually you don't. The knl_generic plugin does that work for us; it populates the features avail

Re: [slurm-users] Memory oversubscription and sheduling

2018-05-10 Thread Chris Samuel
On Monday, 7 May 2018 11:58:38 PM AEST Cory Holcomb wrote: > Thank you for the reply, I was beginning to wonder if my message was seen. It's a busy list at times. :-) > While I understand how batch systems work, if you have a system daemon that > develops a memory leak and consumes the memory o

Re: [slurm-users] How to get information about job steps

2018-05-10 Thread Chris Bridson (NBI)
If accounting is set up, the state/requests of the steps should be saved, e.g. # sacct --job 12345678 -o JobID,User,WCKey,JobName,ReqMem,Timelimit,MaxRSS,State,ExitCode JobID User WCKey JobName ReqMem Timelimit MaxRSS State ExitCode -

Re: [slurm-users] Nodes are down after 2-3 minutes.

2018-05-10 Thread Chris Samuel
On Thursday, 10 May 2018 1:02:36 AM AEST Eric F. Alemany wrote: > All seem good for now Great news! -- Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC

Re: [slurm-users] Splitting mpi rank output

2018-05-10 Thread Chris Samuel
On Thursday, 10 May 2018 2:25:49 AM AEST Christopher Benjamin Coffey wrote: > I have a user trying to use %t to split the mpi rank outputs into different > files and it's not working. I verified this too. Any idea why this might > be? This is the first that I've heard of a user trying to do this.

Re: [slurm-users] "Low socket*core*thre" - solution?

2018-05-10 Thread Chris Samuel
On Thursday, 10 May 2018 12:27:29 AM AEST Mahmood Naderan wrote: > To be honest, I see many commands in the manual that look similar to > a non-professional user. For example, restarting slurmd, slurmctld and > now scontrol reconfigure, and they look confusing. Do you agree with > that? The comman