...ok... sure. I had no idea where the "parent" label came from. This
makes perfect sense. It will default to "1", I think.
On Sat, Aug 10, 2024 at 12:24 PM Ryan Cox wrote:
> fairshare=parent sets the user association to effectively compete at the
> account level, so this is behaving as intended.
...and there's not actually one account in your setup, is there? There
should at least be a "root" and a "mic" account, I think.
I don't recall whether you'd sent the output of "sshare | head -15"...
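For reference, a sketch of flipping an association between the two behaviors (account and user names here are hypothetical):

# give the association its own share, or let it compete at the account level
sacctmgr modify user where name=alice account=mic set fairshare=1
sacctmgr modify user where name=alice account=mic set fairshare=parent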
On Sat, Aug 10, 2024 at 2:30 PM Fulcomer, Samuel wrote:
We use the following relevant settings...
PriorityType=priority/multifactor
PriorityDecayHalfLife=7-0
PriorityCalcPeriod=00:02:00
PriorityMaxAge=3-0
PriorityWeightAge=0
PriorityWeightFairshare=200
PriorityWeightJobSize=1
PriorityWeightPartition=200
PriorityWeightQOS=100
PriorityWeightTRES=
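To see how such weights combine per pending job, sprio is handy; a minimal check (not from the original thread):

# per-factor priority contributions for pending jobs
sprio -l | head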
..."sshare" (no arguments) would be
useful for me.
On Fri, Aug 9, 2024 at 9:52 PM Drucker, Daniel
wrote:
> On Aug 9, 2024, at 9:21 PM, Fulcomer, Samuel
> wrote:
> > And note that the high PriorityWeightAge may be complicating things. We
> set it to 0. With it set so high, it...
...ing to run.
On Fri, Aug 9, 2024 at 9:15 PM Fulcomer, Samuel
wrote:
> ...and what are the top 10-15 lines in your share output?...
>
> On Fri, Aug 9, 2024 at 9:07 PM Drucker, Daniel <
> ddruc...@mclean.harvard.edu> wrote:
>
>> PriorityType=priority/multifactor
>> P...
...PriorityWeightQOS=0
>
> In 21.08.8.
>
>
>
> On Aug 9, 2024, at 8:36 PM, Fulcomer, Samuel
> wrote:
>
> External Email - Use Caution
>
> Yes, well, in that case, it should work as you desire, modulo your
> slurm.conf settings. What are the relevant lines in yours?
>
> ...let's say user B has completed a million jobs in the last few days
> as well, and user A has never submitted any before.
>
> On Aug 9, 2024, at 6:03 PM, Fulcomer, Samuel
> wrote:
>
> External Email - Use Caution
>
> I don't think fairshare use is updated until jobs finish...
I don't think fairshare use is updated until jobs finish...
On Fri, Aug 9, 2024 at 5:59 PM Drucker, Daniel via slurm-users <
slurm-users@lists.schedmd.com> wrote:
> Hi Paul from over at mclean.harvard.edu!
>
> I have never added *any* users using sacctmgr - I've always just had
> everyone, I guess...
We'd bumped ours up for a while 20+ years ago when we had a flaky
network connection between two buildings holding our compute nodes. If you
need more than 600s you have networking problems.
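(A sketch, assuming the timeout under discussion is SlurmdTimeout; the excerpt doesn't name the parameter:)

# slurm.conf -- 600 s is plenty on a healthy network
SlurmdTimeout=600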
On Mon, Feb 12, 2024 at 5:41 PM Timony, Mick via slurm-users <
slurm-users@lists.schedmd.com> wrote:
> We...
...there's no reason to baroquify it.
On Wed, Jan 4, 2023 at 1:54 PM Fulcomer, Samuel
wrote:
> Just make the cluster names the same, with different Nodename and
> Partition lines. The rest of slurm.conf can be the same. Having two cluster
> names is only necessary if you're running production in a multi-cluster
> configuration...
Just make the cluster names the same, with different Nodename and Partition
lines. The rest of slurm.conf can be the same. Having two cluster names is
only necessary if you're running production in a multi-cluster
configuration.
Our model has been to have a production cluster and a test cluster wh...
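A minimal sketch of the shared-slurm.conf approach described above (all names and sizes hypothetical):

# one slurm.conf, one ClusterName, two node/partition sets
ClusterName=cluster
NodeName=prod[001-100] CPUs=32 RealMemory=192000 State=UNKNOWN
NodeName=test[001-004] CPUs=32 RealMemory=192000 State=UNKNOWN
PartitionName=batch Nodes=prod[001-100] Default=YES State=UP
PartitionName=test Nodes=test[001-004] State=UP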
The NVIDIA A10 would probably work. Check the Dell specs for card lengths
that it can accommodate. It's also passively cooled, so you'd need to
ensure that there's good airflow through the card. The proof would be
installing a card, and watching the temp when you run apps on it. It's
150W, so not t...
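A simple way to watch the card under load (standard nvidia-smi query flags):

# temperature and power draw every 5 seconds while apps run
nvidia-smi --query-gpu=temperature.gpu,power.draw --format=csv -l 5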
Hi Byron,
We ran into this with 20.02, and mitigated it with some kernel tuning. From
our sysctl.conf:
net.core.somaxconn = 2048
net.ipv4.tcp_max_syn_backlog = 8192
# prevent neighbour (aka ARP) table overflow...
net.ipv4.neigh.default.gc_thresh1 = 3...
net.ipv4.neigh.default.gc_thresh2 = 320...
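To apply and spot-check the settings (standard sysctl usage):

sysctl -p                      # load /etc/sysctl.conf
sysctl net.core.somaxconn      # verify a value took effect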
From our /etc/pam.d/sshd on our compute nodes:
account    required      pam_nologin.so
account    sufficient    pam_access.so
account    include       password-auth
-account   required      pam_slurm_adopt.so
and /etc/pam.d/password-auth:
#-session optional pam_systemd.so
Note that di...
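Separately, a quick sanity check that pam_slurm_adopt is enforcing access (node name hypothetical): ssh to a compute node where you have no running job should be refused.

ssh node001 true
# expected (roughly): Access denied by pam_slurm_adopt: you have no active jobs on this node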
...it is a bit arcane, but it's not like we're funding lavish
lifestyles with our support payments. I would prefer to see a slightly more
differentiated support system, but this suffices...
On Thu, Mar 24, 2022 at 6:06 PM Sean Crosby wrote:
> Hi Jeff,
>
> The support system is here - https://bug
...and you shouldn't be able to do this with a QoS (I think as you want it
to), as "grptresrunmins" applies to the aggregate of everything using the
QoS.
On Thu, Dec 16, 2021 at 6:12 PM Fulcomer, Samuel
wrote:
> I've not parsed your message very far, but...
>
> for i in...
I've not parsed your message very far, but...
for i in `cat limit_users` ; do
    sacctmgr modify user where name=$i partition=foo account=bar \
        set grptresrunmins=cpu=N    # N = cap on running CPU-minutes
done
On Thu, Dec 16, 2021 at 6:01 PM Ross Dickson
wrote:
> I would like to impose a time limit stricter than the partition limit on
> a certa...
There's no clear answer to this. It depends a bit on how you've segregated
your resources.
In our environment, GPU and bigmem nodes are in their own partitions.
There's nothing to prevent a user from specifying a list of potential
partitions in the job submission, so there would be no need for the...
...ets, e.g.:
# 8-gpu A6000 nodes - dual-root
NodeName=gpu[1504-1506] Name=gpu Type=a6000 File=/dev/nvidia[0-3] CPUs=0-23
NodeName=gpu[1504-1506] Name=gpu Type=a6000 File=/dev/nvidia[4-7] CPUs=24-47
On Fri, Aug 20, 2021 at 6:01 PM Fulcomer, Samuel
wrote:
> Well... you've got lots of...
CapWatts=n/a
>
>CurrentWatts=0 AveWatts=0
>
>ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s
>
>
>
> *Node2-3*
>
> NodeName=node02 Arch=x86_64 CoresPerSocket=16
>
>CPUAlloc=0 CPUTot=64 CPULoad=0.48
>
>AvailableFeatures=RTX6000
>
>
What SLURM version are you running?
What are the #SLURM directives in the batch script? (or the sbatch
arguments)
When the single GPU jobs are pending, what's the output of 'scontrol show
job JOBID'?
What are the node definitions in slurm.conf, and the lines in gres.conf?
Are the nodes all the same...
XDMoD can do that for you, but bear in mind that wait/pending time by
itself may not be particularly useful.
Consider the extreme scenario in which a user is only allowed to use one
node at a time, but submits a thousand one-day jobs. Without any other
competition for resources, the average wait/pending time...
On Mon, Jul 26, 2021 at 1:32 PM Jason Simms wrote:
> Dear Samuel,
>
> Restarting slurmctld did the trick. Thanks! I should have thought to do
> that, but typically scontrol reconfigure picks up most changes.
>
> Warmest regards,
> Jason
>
> On Mon, Jul 26, 2021 at 12:55
...and... you need to restart slurmctld when you change a NodeName line.
"scontrol reconfigure" doesn't do the truck.
On Mon, Jul 26, 2021 at 12:49 PM Fulcomer, Samuel
wrote:
> If you have a dual-root PCIe system you may need to specify the CPU/core
> affinity in gres.conf.
If you have a dual-root PCIe system you may need to specify the CPU/core
affinity in gres.conf.
On Mon, Jul 26, 2021 at 12:07 PM Jason Simms wrote:
> Hello all,
>
> I have a GPU node with 3 identical GPUs (we started with two and recently
> added the third). Running nvidia-smi correctly shows th
Jason,
I've just been working through a similar scenario to handle access to our
3090 nodes that have been purchased by researchers.
I suggest putting the node into an additional partition, and then add a QOS
for the lab group that has grptres=gres/gpu=1,cpu=M,mem=N (where cpu and
mem are whatever...)
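A sketch of that setup (QOS, user names, and limits are hypothetical):

sacctmgr add qos labgpu
sacctmgr modify qos labgpu set grptres=gres/gpu=1,cpu=8,mem=64G
sacctmgr modify user where name=alice set qos+=labgpu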
You can specify a partition priority in the partition line in slurm.conf,
e.g. Priority=65000 (I forget what the max is...)
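For example (hypothetical partition; Priority maps to a 16-bit field, so the ceiling is a bit above 65000):

PartitionName=urgent Nodes=node[01-16] Priority=65000 State=UP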
On Thu, Jun 17, 2021 at 10:31 PM wrote:
> Thanks for the help. We tried to reduce the sched_interval and the pending
> time decreased as expected.
>
> But the influence of
...sorry... "sinfo | grep drain && sinfo | grep drain | mail -s 'drain
nodes' <address>"
On Sat, Jun 12, 2021 at 4:46 PM Fulcomer, Samuel
wrote:
> ...something like "sinfo | grep drain && mail -s 'drain nodes' address> "
>
> ...will
...something like "sinfo | grep drain && mail -s 'drain nodes' "
...will work...
Substitute "draining" or "drained" for "drain" to taste...
On Sat, Jun 12, 2021 at 4:32 PM Rodrigo Santibáñez <
rsantibanez.uch...@gmail.com> wrote:
> Hi SLURM users,
>
> Does anyone have a cronjob or similar to monitor...
inline below...
On Sat, Apr 3, 2021 at 4:50 PM Will Dennis wrote:
> Sorry, obvs wasn’t ready to send that last message yet…
>
>
>
> Our issue is the shared storage is via NFS, and the “fast storage in
> limited supply” is only local on each node. Hence the need to copy it over
> from NFS (and th...
...nd, or something like /tmp… That’s why my desired
> workflow is to “copy data locally / use data from copy / remove local copy”
> in separate steps.
>
>
>
>
>
> From: slurm-users on behalf of Fulcomer, Samuel
> Date: Saturday, April 3, 2021 at 4:00 PM
> To: ...
Unfortunately this is not a good workflow.
You would submit a staging job with a dependency for the compute job;
however, in the meantime, the scheduler might launch higher-priority jobs
that would want the scratch space, and cause it to be scrubbed.
In a rational process, the scratch space would...
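For reference, the dependency chain itself is straightforward (script names hypothetical); the caveat above is about what can happen between the two jobs:

jid=$(sbatch --parsable stage.sh)       # copy NFS data to local scratch
sbatch --dependency=afterok:$jid compute.sh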
Durai,
There is no inheritance in "AllowAccounts". You need to specify each
account explicitly.
There _is_ inheritance in fairshare calculation.
On Fri, Jan 15, 2021 at 2:17 PM Brian Andrus wrote:
> As I understand it, the parents are really meant for reporting, so you
> can run reports that a
...lly through SQL hacking; however, we just went with a virgin database
when we last upgraded in order to get it working (and sucked the accounting
data into XDMoD).
On Thu, Jan 14, 2021 at 6:36 PM Fulcomer, Samuel
wrote:
> AllowedDevicesFile should not be necessary. The relevant devices are...
AllowedDevicesFile should not be necessary. The relevant devices are
identified in gres.conf. "ConstrainDevices=yes" should be all that's needed.
nvidia-smi will only see the allocated GPUs. Note that a single allocated
GPU will always be shown by nvidia-smi to be GPU 0, regardless of its
actual h...
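A minimal cgroup.conf sketch for the above:

# cgroup.conf -- the one line that matters for GPU fencing
ConstrainDevices=yes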
Important notes...
If requesting more than one core and not using "-N 1", equal numbers of
GPUs will be allocated on each node where the cores are allocated. (i.e. if
requesting 1 GPU for a 2-core job, if one core is allocated on each of two
nodes, one GPU will be allocated on each node).
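For example (job script name hypothetical):

sbatch -N 1 -n 2 --gres=gpu:1 job.sh   # 2 cores + 1 GPU, all on one node
sbatch -n 2 --gres=gpu:1 job.sh        # cores may split across 2 nodes -> 1 GPU on each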
If you...
...are also described in the
> RELEASE_NOTES file.
>
> So I wouldn't go directly to 20.x, instead I would go from 17.x to 19.x
> and then to 20.x
>
> -Paul Edmon-
> On 11/2/2020 8:55 AM, Fulcomer, Samuel wrote:
>
> We're doing something similar. We're continuing to run production on 17.x...
We're doing something similar. We're continuing to run production on 17.x
and have set up a new server/cluster running 20.x for testing and MPI app
rebuilds.
Our plan had been to add recently purchased nodes to the new cluster, and
at some point turn off submission on the old cluster and switch e...
Compile slurm without ucx support. We wound up spending quality time with
the Mellanox... wait, no, NVIDIA Networking UCX folks to get this sorted
out.
I recommend using SLURM 20 rather than 19.
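A sketch of the build (standard autoconf disable flag; prefix hypothetical):

./configure --prefix=/opt/slurm --without-ucx
make && make install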
regards,
s
On Thu, Oct 22, 2020 at 10:23 AM Michael Di Domenico
wrote:
> was there ever a result...
cgroups should work correctly _if_ you're not running with an old corrupted
slurm database.
There was a bug in a much earlier version of slurm that corrupted the
database in a way that the cgroups/accounting code could no longer fence
GPUs. This was fixed in a later version, but the database corruption...
"-N 1" restricts a job to a single node.
We've continued to have issues with this. Historically we've had a single
partition with multiple generations of nodes segregated for
multinode scheduling via topology.conf. "Use -N 1" (unless you really know
what you're doing) only goes so far.
There are...
If you use cgroups, tmpfs /tmp and /dev/shm usage is counted against the
requested memory for the job.
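So a job that stages, say, 8 GB into tmpfs /tmp needs that headroom in its memory request (numbers hypothetical):

sbatch --mem=24G job.sh   # ~16G for the application + ~8G of tmpfs usage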
On Tue, Mar 31, 2020 at 4:51 PM Ellestad, Erik
wrote:
> How are folks managing allocation of local TmpDisk for jobs?
>
> We see how you define the location of TmpFs in slurm.conf.
>
> And then
Thanks! And I'll watch the video...
Privileged containers! Never!
On Thu, Sep 19, 2019 at 9:06 PM Michael Jennings wrote:
> On Thursday, 19 September 2019, at 19:27:38 (-0400),
> Fulcomer, Samuel wrote:
>
> > I obviously haven't been keeping up with any security concerns...
Hey Michael,
I obviously haven't been keeping up with any security concerns over the use
of Singularity. In a 2-3 sentence nutshell, what are they?
I've been annoyed by NVIDIA's docker distribution for DGX-1 & friends.
We've been setting up an ersatz-secure Singularity environment for use of
mid...
...and for the SchedMD folks, it would be a lot simpler to
drop/disambiguate the "year it was released" first element in the version
number, and just use it as an incrementing major version number.
On Tue, Jul 9, 2019 at 6:42 PM Fulcomer, Samuel
wrote:
> Hi Pariksheet,
>
...I've suggested
some documentation clarification, but it's still somewhat easily missed.
Regards,
Sam
On Tue, Jul 9, 2019 at 6:23 PM Pariksheet Nanda
wrote:
> Hi Samuel,
>
> On Mon, Jul 8, 2019 at 8:19 PM Fulcomer, Samuel
> wrote:
> >
> > The underlying issue...
Hi Pariksheet,
Note that an "upgrade", in the sense that retained information is converted
to new formats, is only relevant for the slurmctld/slurmdbd (and backup)
node.
If you're planning downtime in which you quiesce job execution (i.e.,
schedule a maintenance reservation), and have image conf...
Hi Palle,
You should probably get the latest stable SLURM version from
www.schedmd.com and use the build/install instructions found there. Note
that you should check for WARNING messages in the config.log produced by
SLURM's configure, as they're the best place to find missing
packages that...
...go to a 3
> month moving window to allow people to bank their fairshare, but we haven't
> done that yet as people have been having a hard enough time understanding
> our current system. It's not due to its complexity but more that most
> people just flat out aren't cognizant...
...rely purely on fairshare weighting for
> resource usage. It has worked pretty well for our purposes.
>
> -Paul Edmon-
> On 6/19/19 3:30 PM, Fulcomer, Samuel wrote:
>
>
> (...and yes, the name is inspired by a certain OEM's software licensing
> schemes...)
>
>
...ting of 130 CPUs
> because the CPUs are normalized to the old performance. Since it would
> probably look bad politically to reduce someone's number, but giving a new
> customer a larger number should be fine.
>
> Regards,
> Alex
>
> On Wed, Jun 19, 2019 at 12:32 PM Fulcomer, Samuel wrote:
(...and yes, the name is inspired by a certain OEM's software licensing
schemes...)
At Brown we run a ~400 node cluster containing nodes of multiple
architectures (Sandy/Ivy, Haswell/Broadwell, and Sky/Cascade) purchased in
some cases by University funds and in others by investigator funding
(~50:...
On Mon, May 20, 2019 at 2:59 PM wrote:
>
>
>
> I did test setting GrpTRESRunMins=cpu=N for each user + account
> association, and that does appear to work. Does anyone know of any other
> solutions to this issue?
No. Your solution is what we currently do. A "...PU" would be a nice, tidy
addition...
...ing?
>
> Prentice
>
> On 4/16/19 1:12 PM, Fulcomer, Samuel wrote:
>
> We had an AC921 and an AC922 for a while as loaners.
>
> We had no problems with SLURM.
>
> Getting POWERAI running correctly (bugs since fixed in newer release) and
> apps properly built and linked to ESSL was the long march...
We had an AC921 and an AC922 for a while as loaners.
We had no problems with SLURM.
Getting POWERAI running correctly (bugs since fixed in newer release) and
apps properly built and linked to ESSL was the long march.
regards,
s
On Tue, Apr 16, 2019 at 12:59 PM Prentice Bisbal wrote:
> Sergi,
>
>
...submit all jobs to all partitions plugin and
> having users constrain to specific types of nodes using the
> --constraint=whatever flag.
>
>
> Nicholas McCollum
> Alabama Supercomputer Authority
> --
> *From:* "Fulcomer, Samuel"
> *S
We use topology.conf to segregate architectures (Sandy->Skylake), and also
to isolate individual nodes with 1Gb/s Ethernet rather than IB (older GPU
nodes with deprecated IB cards). In the latter case, topology.conf had a
switch entry for each node.
It used to be the case that SLURM was unhappy wi...
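A sketch of that layout (switch and node names hypothetical):

# topology.conf
SwitchName=ib-sandy Nodes=sandy[001-064]
SwitchName=ib-sky Nodes=sky[001-032]
SwitchName=eth-gpu01 Nodes=gpu01   # one switch entry per Ethernet-only node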
...delete the association with the
> following command after the user's jobs complete.
>
> # sacctmgr delete user where name=clschf partition=k80 account=acct-clschf
>
> Best,
>
> Jianwen
>
> On Dec 29, 2018, at 11:50, Fulcomer, Samuel
> wrote:
>
> ...right.
...right. An association isn't an "entity". You want to delete a "user"
where name=clschf partition=k80 account=acct-clschf .
This won't entirely delete the user entity, only the record/association
matching the name/partition/account spec.
The foundation of SLURM nomenclature has some unfortunate...
Yes, in a way. In thinking about this for Brown (we haven't implemented it,
yet), we've the idea of having a Linux cron job periodically query the
group membership of the AD group granted access to the HPC resource, and
adding any new users to the SLURM accounting database.
We're at the point of u...
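A sketch of such a cron job (group and account names hypothetical; sacctmgr simply reports nothing new for users that already exist):

# add any AD group member missing from the Slurm accounting DB
for u in $(getent group hpc-users | cut -d: -f4 | tr ',' ' '); do
    sacctmgr -i add user name="$u" account=default
done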
We've got 15.08.8/9.
-s
On Wed, Oct 24, 2018 at 5:51 PM, Bob Healey wrote:
> I'm in the process of upgrading a system that has been running 2.5.4 for
> the last 5 years with no issues. I'd like to bring that up to something
> current, but I need a bunch of older versions that do not appear to...
Is there a firewall turned on? What does "iptables -L -v" report on the
three hosts?
On Mon, May 21, 2018 at 11:05 AM, Turner, Heath wrote:
> If anyone has advice, I would really appreciate...
>
> I am running (just installed) slurm-17.11.6, with a master + 2 hosts. It
> works locally on the master...
This came up around 12/17, I think, and as I recall the fixes were added to
the src repo then; however, they weren't added to any of the 17.x releases.
On Wed, May 2, 2018 at 6:04 AM, R. Paul Wiegand wrote:
> I dug into the logs on both the slurmctld side and the slurmd side.
> For the record, I h...
We use GrpTresRunMins for this, with the idea that it's OK for users to
occupy lots of resources with short-running jobs, but not so much with
long-running jobs.
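For example (numbers hypothetical): a cap of GrpTRESRunMins=cpu=1000000 lets one user hold about 694 cores' worth of one-day jobs (1,000,000 / 1,440 minutes) but only about 69 cores' worth of ten-day jobs, which is exactly the short-over-long bias described.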
On Wed, Feb 7, 2018 at 8:41 AM, Bill Barth wrote:
> Of course, Matteo. Happy to help. Our job completion script is:
>
> #!/bin/bash
>