Hi Lachlan,
Thank you for sharing your environment. Everyone has their own set of rules, and
I appreciate everyone's input.
It seems the NFS share is a great place to start.
Best,
Eric
_
Eric F. Alemany
Thank you, Thomas, for your suggestion. I am taking note of everyone's comments
and hope to come up with a good solution.
_
Eric F. Alemany
System Administrator for Research
Division of Radiation & Cancer Biology
Might there be any "adverse" impacts to changing
SelectTypeParameters=CR_CPU to SelectTypeParameters=CR_Socket_Memory? Our
compute nodes are, for the most part, exclusive, so users are allocated
entire nodes (no sharing) for jobs. We explicitly specify CPUs, Sockets,
CoresPerSocket, and ThreadsPerCore.
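For reference, the two settings would sit in slurm.conf roughly like this (the node definition below is illustrative, not an actual config):

SelectType=select/cons_res
SelectTypeParameters=CR_Socket_Memory
NodeName=node[01-04] Sockets=2 CoresPerSocket=16 ThreadsPerCore=1 RealMemory=128000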
On 11 May 2018 at 01:35, Eric F. Alemany wrote:
> Hi All,
>
> I know this might sound like a very basic question: where in the cluster
> should I install Python and R?
> Headnode?
> Execute nodes ?
>
> And is there a particular directory (path) where I need to install Python and R?
>
> Background:
> SLURM on Ubuntu 18.04
Dear Slurm Users,
We've started using maintenance reservations. As you would expect, this
caused some confusion for users who were wondering why their jobs were
queuing up and not running. Some of my users provide a public service of
sorts that automatically submits jobs to our cluster. They w
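(For anyone unfamiliar, such a maintenance reservation is typically created along these lines; the name, times, and scope here are illustrative:)

scontrol create reservation reservationname=maint_window starttime=2018-05-20T08:00:00 duration=04:00:00 flags=maint,ignore_jobs users=root nodes=ALL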
Thank you all for the suggestions...
I will take a closer look at the 'exec sbatch...' script.
Running 'sbatch --output=... --error=...' works as well so far.
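(The 'exec sbatch' wrapper idea is roughly the following - a self-resubmitting script, sketched on the assumption that $SCRATCH is set in the submitting shell:)

#!/bin/bash
# If not yet running under Slurm, resubmit this script via sbatch with
# shell-expanded paths (#SBATCH header lines are never variable-expanded).
if [ -z "$SLURM_JOB_ID" ]; then
    exec sbatch --output="$SCRATCH/slurm/%j.out" --error="$SCRATCH/slurm/%j.err" "$0" "$@"
fi
# ... actual job commands below ...
hostname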
On 5/10/18, 11:14, "slurm-users on behalf of Michael Jennings"
wrote:
On Thursday, 10 May 2018, at 10:09:22 (-0400),
Paul Edmon wrote:
Yes, thank you very much.
Regards,
Mahmood
On Thu, May 10, 2018 at 7:42 PM, Simon Flood wrote:
> On 10/05/18 15:59, Mahmood Naderan wrote:
>
> Yes it's possible for a user to be attached to more than one account or to
> an account with more than one partition.
>
> As per the error message you hav
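(To check which associations a user actually has, something like this should show it - user name taken from the example above:)

sacctmgr show associations user=mahmood format=user,account,partition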
Hello All,
Is it possible to track all jobs which requested a specific license? I am
using Slurm 16.05.6. I looked through `sacct ... --format=all`, but maybe I
am missing something.
Thanks,
Barry
--
Barry E Moore II, PhD
E-mail: bmoor...@pitt.edu
Assistant Research Professor
Center for Resea
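(One avenue that may work: if the license is configured as a trackable resource, it shows up in the TRES fields. The license name and the AccountingStorageTRES setting below are assumptions, not from the original thread:)

# slurm.conf: track the license in accounting
AccountingStorageTRES=license/matlab
# then list jobs that requested it:
sacct --allusers --starttime=2018-05-01 --format=JobID,User,ReqTRES%40 | grep license/matlab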
Hi,
We noticed that the --uid and --gid functionality changed recently: previously,
a user in the Slurm administrators group could launch jobs
successfully with --uid and --gid, allowing them to submit jobs as
another user. Now, in order to use --uid and --gid, you have to be the root user
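(i.e. a call like the following, which previously worked for an administrator, now requires root - the user and script names are illustrative:)

sbatch --uid=alice --gid=alice job.sh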
Hi Paul,
I'd first suggest upgrading to 17.11.6; I think the first couple of
17.11.x releases had some issues in terms of GRES binding.
Then, I believe you also need to request that all of your cores be
allocated on the same socket, if that's what you want. Something like
--ntasks-per-socket=16.
Her
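(e.g., in the batch script - counts illustrative:)

#SBATCH --gres=gpu:1
#SBATCH --ntasks=16
#SBATCH --ntasks-per-socket=16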
This is exactly what we want to do, spreading array jobs across nodes. The primary
motivation for us is to achieve load-balancing.
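(One knob that may help with this is the partition-level least-loaded-node flag; the partition definition below is illustrative:)

PartitionName=batch Nodes=node[01-16] LLN=YES State=UP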
-Original Message-
From: slurm-users [mailto:slurm-users-boun...@lists.schedmd.com] On Behalf Of
Emyr James
Sent: Tuesday, March 20, 2018 11:54 PM
To: slurm-us
Assuming you plan for users to use R in jobs, it will need to be accessible
to the execute/compute nodes.
I would usually suggest a shared drive, although it should be OK if
locally installed on each compute
node (you probably want it at the same exact path and with the same R packages
installed). Presumabl
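(e.g., building R into an NFS-shared prefix that every node mounts; the path and version are illustrative:)

./configure --prefix=/nfs/apps/R-3.5.0
make && make install
# on each node (or in users' login profiles), put it on PATH:
export PATH=/nfs/apps/R-3.5.0/bin:$PATH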
In our case we are using NoMachine rather than x2go, but the same should
apply:
I connect to a login node (e.g. loginnode01) via NoMachine and fire up a KDE
workspace, open a terminal, and ssh -X loginnode01 (so SSH back into the
same host), then srun --x11 etc.
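(The sequence, roughly:)

ssh -X loginnode01       # SSH back into the host you are already on
srun --x11 --pty xterm   # X11 forwarding now works from within the allocation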
On Thu, 10 May 2018 at 17:19, Patrick
On 05/09/2018 04:14 PM, Nathan Harper wrote:
Yep, exactly the same issue. Our dirty workaround is to ssh -X back into the
same host and it will work.
Hi -
Since I'm having this problem too, can you elaborate? You're ssh -X'ing
into a machine and then ssh -X'ing back to the original host?
Thank you Simon for your quick reply.
I liked the "(N)either" touch - makes sense.
_
Eric F. Alemany
System Administrator for Research
Division of Radiation & Cancer Biology
Department of Radiation Oncology
Hi Eric,
On 10/05/18 23:35, Eric F. Alemany wrote:
> I know this might sound like a very basic question: where in
> the cluster should I install Python and R?
> Headnode?
> Execute nodes ?
I don't think there is a fixed rule for a question like this;
it depends on the compromise between what
On 10/05/18 16:35, Eric F. Alemany wrote:
I know this might sound like a very basic question: where in the cluster should
I install Python and R?
Headnode?
Execute nodes ?
And is there a particular directory (path) where I need to install Python and R?
Background:
SLURM on Ubuntu 18.04
1 headnode
4
We build from source. The main build dependency is munge. We just
use the provided slurm.spec file (since we are building for CentOS). We
then distribute the RPMs.
You can build Slurm with plain autotools as well, which permits a more generic
build and which you can then repackage into other pa
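(The RPM route is typically just the following; the version is illustrative:)

rpmbuild -ta slurm-17.11.6.tar.bz2
# the resulting RPMs land under ~/rpmbuild/RPMS/<arch>/ for distribution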
Hi All,
I know this might sound like a very basic question: where in the cluster should
I install Python and R?
Headnode?
Execute nodes ?
And is there a particular directory (path) where I need to install Python and R?
Background:
SLURM on Ubuntu 18.04
1 headnode
4 execute nodes
NFS shared drive among
On 10/05/18 15:59, Mahmood Naderan wrote:
Is it possible to assign a user to two partitions/accounts? I did
that, but sbatch isn't able to submit the job.
[mahmood@rocks7 ~]$ cat slurm.sh
#!/bin/bash
#SBATCH --output=test.out
#SBATCH --job-name=test
#SBATCH --ntasks=6
#SBATCH --partition=PLAN1
On Thursday, 10 May 2018, at 10:09:22 (-0400),
Paul Edmon wrote:
> Not that I am aware of. Since the header isn't really part of the
> script, bash doesn't evaluate them, as far as I know.
>
> On 05/10/2018 09:19 AM, Dmitri Chebotarov wrote:
> >
> > Is it possible to access environment variables in
On 10-05-2018 16:56, Valeriana wrote:
Hi Ole! Thanks for your help. I already checked this installation, but it
didn't help me much. I am not using RPM; I am installing directly from
source (configure, make, and make install). My question is:
do I need these plugins on the computational nodes?
I don't believe that is possible.
The #SBATCH lines are comments to the shell, so it does not do any variable
expansion there.
To my knowledge, Slurm does not do any variable expansion in the parameters
either.
If you really needed that sort of functionality, you would probably need to
have someth
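(The usual workaround is to let the shell expand the variable at submission time instead, since command-line options override the #SBATCH headers:)

sbatch --output="$SCRATCH/slurm/%j.out" --error="$SCRATCH/slurm/%j.err" job.sh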
Hi
Is it possible to assign a user to two partitions/accounts? I did
that, but sbatch isn't able to submit the job.
[mahmood@rocks7 ~]$ cat slurm.sh
#!/bin/bash
#SBATCH --output=test.out
#SBATCH --job-name=test
#SBATCH --ntasks=6
#SBATCH --partition=PLAN1
#SBATCH --mem=8G
mpirun /share/apps/me
On Thursday, 10 May 2018, at 20:02:37 (+1000),
Chris Samuel wrote:
> For instance there's the LBNL Node Health Check (NHC) system that plugs into
> both Slurm and Torque.
>
> https://slurm.schedmd.com/SUG14/node_health_check.pdf
>
> https://github.com/mej/nhc
>
> At ${JOB-1} we would run our i
Hi Ole! Thanks for your help. I already checked this installation, but it
didn't help me much. I am not using RPM; I am installing directly from
source (configure, make, and make install). My question is:
do I need these plugins on the computational nodes? Thanks in advance,
Valeriana
Thanks Paul,
Could you be more specific about building Slurm and its dependencies? Do you
mean building Slurm from source? If you have a standard procedure for that,
can you share it with me?
In the end, the ODROID has Ubuntu installed on it.
Thank you very much in advance,
Agostino
> On 10 May
Not that I am aware of. Since the header lines aren't really part of the
script, bash doesn't evaluate them, as far as I know.
-Paul Edmon-
On 05/10/2018 09:19 AM, Dmitri Chebotarov wrote:
Hello
Is it possible to access environment variables in a submit script?
E.g. $SCRATCH is set to a path and I'd like to use the $SCRATCH variable in #SBATCH:
Assuming you can build Slurm and its dependencies, this should work.
We've run Slurm here with different OSes on various nodes for a while
and it works fine. That said, I haven't tried ODROIDs, so I can't speak
specifically to that.
-Paul Edmon-
On 05/10/2018 08:26 AM, agostino bruno wrote:
Hello
Is it possible to access environment variables in a submit script?
E.g. $SCRATCH is set to a path, and I'd like to use the $SCRATCH variable in #SBATCH:
#SBATCH --output=$SCRATCH/slurm/%j.out
#SBATCH --error=$SCRATCH/slurm/%j.err
Since it's a Bash script, the # lines are ignored, and I suspect these variables
Dear All,
I am writing to you because I would like some information about the
possibility of having a single Slurm server (CentOS) controlling clients with
CentOS and Ubuntu installed.
I already have a small cluster with 3 nodes controlled by one server. On all
the systems the OS is CentOS
On 10-05-2018 13:39, Valeriana wrote:
Good Morning,
I'm new to SLURM. I just installed slurm-17.11.5.tar.bz2 source on a
Master server (CentOS 7 17.08) with the following plugins:
DMTCP,Padb,Hostlist,Interactive Script,mpich, openmpi, Node Health
Check,PEStat,HDF5,pam_slurm,PMIx and sqlog.
Good Morning,
I'm new to SLURM. I just installed slurm-17.11.5.tar.bz2 source on a
Master server (CentOS 7 17.08) with the following plugins:
DMTCP,Padb,Hostlist,Interactive Script,mpich, openmpi, Node Health
Check,PEStat,HDF5,pam_slurm,PMIx and sqlog. Munge is installed on the
server and nodes
Hi everyone,
We have a heterogeneous cluster. Some of the nodes have two NVIDIA GPU
cards.
slurm.conf looks like:
NodeName=compute-0-[0-6] CPUs=32
NodeName=compute-1-[0-5] CPUs=16 Gres=gpu:2 Weight=100
PartitionName=hipri DefaultTime=00-1 MaxTime=00-1 MaxNodes=1 PriorityTier=9 Node
On Monday, 7 May 2018 11:42:03 PM AEST Tueur Volvo wrote:
> Why? Can I have srun --reboot in an sbatch file?
It doesn't make sense to reboot the node partway through running your job;
you're just going to kill the running job.
Instead, add this near the top of your batch script:
#SBATCH --reboot
On Wednesday, 9 May 2018 10:17:17 PM AEST Tueur Volvo wrote:
> I currently use a node feature plugin like knl,
> but I don't like using node features because I must write "feature" in the
> slurm.conf file
Actually, you don't. The knl_generic plugin does that work for us; it
populates the features avail
On Monday, 7 May 2018 11:58:38 PM AEST Cory Holcomb wrote:
> Thank you for the reply; I was beginning to wonder if my message was seen.
It's a busy list at times. :-)
> While I understand how batch systems work, if you have a system daemon that
> develops a memory leak and consumes the memory o
If accounting is set up, the state/requests of the steps should be saved, e.g.:
# sacct --job 12345678 -o JobID,User,WCKey,JobName,ReqMem,Timelimit,MaxRSS,State,ExitCode
       JobID      User      WCKey    JobName     ReqMem  Timelimit     MaxRSS      State ExitCode
------------ --------- ---------- ---------- ---------- ---------- ---------- ---------- --------
On Thursday, 10 May 2018 1:02:36 AM AEST Eric F. Alemany wrote:
> All seems good for now
Great news!
--
Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC
On Thursday, 10 May 2018 2:25:49 AM AEST Christopher Benjamin Coffey wrote:
> I have a user trying to use %t to split the MPI rank outputs into different
> files and it's not working. I verified this too. Any idea why this might
> be? This is the first I've heard of a user trying to do this.
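(For reference, %t is the per-task filename pattern, and it applies when srun launches the tasks; a sketch, with the binary name illustrative:)

srun --ntasks=4 --output=mpi_job.%j.%t.out ./a.out   # one output file per rank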
On Thursday, 10 May 2018 12:27:29 AM AEST Mahmood Naderan wrote:
> To be honest, I see many commands in the manual that look similar to
> a non-professional user. For example, restarting slurmd, slurmctld and
> now scontrol reconfigure - they look confusing. Do you agree with
> that?
The comman