Hi Miriam,
The definition of CPU is "fluid": it depends on hardware and configuration. If
threads are defined, then a CPU may relate to one thread, whereas on hardware
configurations without threads it will refer to a physical core.
https://slurm.schedmd.com/mc_support.html#defs
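As a rough illustration (node name and counts are made up, not taken from your system), the mapping is driven by the node definition and SelectTypeParameters in slurm.conf:

  # hypothetical node: 2 sockets x 20 cores x 2 threads = 80 hardware threads
  NodeName=node01 Sockets=2 CoresPerSocket=20 ThreadsPerCore=2
  SelectType=select/cons_res
  SelectTypeParameters=CR_Core   # consumable unit is a physical core
  # with CR_CPU instead, the consumable unit would be a single thread

So the same hardware can present a "CPU" as either a core or a thread, depending on that setting.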
Didn’t you set min
Hi,
No it doesn’t need to be below 1000.
Best
Andreas
On 03.12.2024 at 22:08, Steven Jones via slurm-users wrote:
Hi,
Does the slurm user need to be < 1000 UID? Using IPA with a UID of
[root@vuwunicoslurmd1 slurm]# id slurm
uid=126209577(slurm) gid=126209576(slurm) groups=126209576(slurm)
Hi Daniel,
We run a simple Galera MySQL cluster and have HAProxy running on all clients
to steer the requests (round-robin) to one of the DB nodes that answers the
health check properly.
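In case it's useful, a stripped-down sketch of that kind of HAProxy frontend (hostnames, port and the check user are placeholders, not our real setup):

  # /etc/haproxy/haproxy.cfg on each client node
  listen galera
      bind 127.0.0.1:3306
      mode tcp
      balance roundrobin
      option mysql-check user haproxy_check   # this check user must exist in MySQL
      server db1 db1.example.org:3306 check
      server db2 db2.example.org:3306 check
      server db3 db3.example.org:3306 check

slurmdbd then simply points its StorageHost at the local proxy (127.0.0.1).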
Best,
Andreas
On 23.01.2024 at 15:35, Daniel L'Hommedieu wrote:
Xand,
Thanks - that’s great to hear.
Hi Mark,
Thanks for your insight. We also work with Elasticsearch and I appreciate the
easy analysis (once one understands the Kibana logic). Do you use the job
completion plugin as is? Or did you modify it to account for SSL or additional
metrics?
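For context, we'd start from the stock configuration, something like (the URL is a placeholder):

  JobCompType=jobcomp/elasticsearch
  JobCompLoc=http://elastic.example.org:9200   # depending on the version, the index path (e.g. /slurm/jobcomp) may need to be appended

i.e. plain HTTP and only the default job record fields, which is why I'm curious about your modifications.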
Best
Andreas
On 26.11.2019 at 18:27, Mark Ha
Hi again,
I'm pretty sure that's not valid, since your scontrol show job output shows a
MinMemoryPerNode much bigger than 1G.
Best
Andreas
On 14.11.2019 at 14:37, Nguyen Dai Quy <q...@vnoss.org> wrote:
On Thu, Nov 14, 2019 at 1:59 PM Sukman <suk...@pusat.itb.ac.id> wrote:
Hi Brian,
th
Hi,
Is lowercase #sbatch really valid?
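As far as I know, only the uppercase form is parsed; a lowercase #sbatch line is treated as an ordinary comment and silently ignored, e.g.:

  #!/bin/bash
  #SBATCH --mem=1G          # recognized as a directive
  #sbatch --time=1:00:00    # just a comment, silently ignored
  srun hostname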
> On 14.11.2019 at 14:09, Sukman wrote:
>
> Hi Brian,
>
> thank you for the suggestion.
>
> It appears that my node is in drain state.
> I rebooted the node and everything became fine.
>
> However, the QOS still cannot be applied properly.
> Do you hav
Hi Rafal,
How do you restart the nodes? If you don't use scontrol reboot, Slurm doesn't
expect the nodes to reboot, and therefore you see that reason in those cases.
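For example (node names are placeholders), something like

  scontrol reboot ASAP node[01-04]   # drain and reboot once the nodes are idle
  sinfo -R                           # afterwards, check the recorded reasons

tells Slurm the reboot is intentional; newer releases also accept reason= and nextstate=RESUME here.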
Best
Andreas
On 27.09.2019 at 07:53, Rafał Kędziorski <rafal.kedzior...@gmail.com> wrote:
Hi,
I'm working with slurm-wlm 18.08.
Hi Tina,
We have an additional partition with a partition QOS that increases the limits
and allows running short jobs over the limits if nodes are idle. On submission
to the standard partitions we automatically add the additional partition via a
job_submit plugin.
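Very roughly, and with all names and limits invented for the sketch, the pieces look like this:

  # slurm.conf: a second partition over the same nodes, carrying a partition QOS
  PartitionName=standard Nodes=node[001-100] MaxTime=5-00:00:00 State=UP
  PartitionName=bonus    Nodes=node[001-100] MaxTime=08:00:00 QOS=bonus State=UP

  # sacctmgr: the QOS holding the relaxed limits for that partition
  sacctmgr add qos bonus
  sacctmgr modify qos bonus set MaxWall=08:00:00 GrpTRES=cpu=10000

The job_submit plugin then just appends the extra partition to the job's partition list at submit time.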
Best,
Andreas
> On 04.09
Sorry, didn't send to the list
Begin forwarded message:
From: Henkel <hen...@uni-mainz.de>
Date: 8 August 2019, 09:21:55 CEST
To: "Sarlo, Jeffrey S" <jsa...@central.uh.edu>
Subject: Re: [slurm-users] Getting information about AssocGrpCPUMi
Hi,
Have you checked documentation of MinMemory in Slurm.conf for node definition?
Best,
Andreas
> On 02.08.2019 at 23:53, Sistemas NLHPC wrote:
>
> Hi all,
>
> Currently we have two types of nodes, one with 192 GB and another with 768 GB
> of RAM; it is required that on the nodes with 768 GB it is
Hi Christoph,
I think the only way is to modify the database directly. I don't know if Slurm
likes it, and personally I would try it on a copy of the DB with a separate
slurmdbd to see if the reported values are still correct.
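If you go down that route, a cautious way to test it (database and file names are placeholders, and you'll need the usual credentials) would be roughly:

  mysqldump slurm_acct_db > slurm_acct_db.sql     # dump the live accounting DB
  mysql -e 'CREATE DATABASE slurm_acct_db_copy'
  mysql slurm_acct_db_copy < slurm_acct_db.sql    # load it into a scratch copy
  slurmdbd -D -vvv    # with a separate slurmdbd.conf pointing StorageLoc at the copy and using its own port

and then compare what sacctmgr/sreport show against the production instance.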
Best regards,
Andreas Henkel
> On 14.06.2019 at 16:16,
I think there isn't enough memory.
AllocTRES shows mem=55G,
and your job wants another 40G, although the node only has 63G in total
(55G + 40G = 95G > 63G).
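You can see the numbers directly on the node record (the node name here is a placeholder):

  scontrol show node compute-0-0 | grep -E 'RealMemory|AllocMem|AllocTRES'

RealMemory is what the node offers, AllocMem/AllocTRES what is already handed out to running jobs.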
Best,
Andreas
On 17.04.2019 at 16:45, Mahmood Naderan <mahmood...@gmail.com> wrote:
Hi,
Although it was fine for previous job runs, the following script now
spec is:
>
> CPUs=80 Boards=1 SocketsPerBoard=2 CoresPerSocket=20 ThreadsPerCore=2
>
> Hope this helps!
>
> All the best,
> Chris
--
Dr. Andreas Henkel
Operativer Leiter HPC
Zentrum für Datenverarbeitung
Johannes Gutenberg Universität
Anselm-Franz-von-Bentze
o by using preempt/qos. Though we haven't used that.
Best,
Andreas
On 2/18/19 9:07 AM, Marcus Wagner wrote:
> Hi Andreas,
>
>
> doesn't it suffice to use priority tier partitions? You don't need to
> use preemption at all, do you?
>
>
> Best
> Marcus
>
Hi David,
I think there is another option if you don’t want to use preemption. If the max
runlimit is small (several hours for example) working without preemption may be
acceptable.
Assign a qos with a priority boost to the owners of the node. Then whenever
they submit jobs to the partition the
Not the answer you hoped for there I guess...
On 15.02.19 07:15, Marcus Wagner wrote:
> I have filed a bug:
>
> https://bugs.schedmd.com/show_bug.cgi?id=6522
>
>
> Let's see what SchedMD has to tell us ;)
>
>
> Best
> Marcus
>
> On 2/15/19 6:25 AM, Marcus Wagner wrote:
>> NumNodes=1 NumCPUs=48 NumT
ncm0071.hpc.itc.rwth-aachen.de <7> OMP_STACKSIZE: <#> unlimited+p2
> +pemap 13,61
> ncm0071.hpc.itc.rwth-aachen.de <8> OMP_STACKSIZE: <#> unlimited+p2
> +pemap 14,62
> ncm0071.hpc.itc.rwth-aachen.de <9> OMP_STACKSIZE: <#> unlimited+p2
> +pemap 18,
> On 14.02.2019 at 09:32, Marcus Wagner wrote:
>
> Hi Andreas,
>
>
>
>> On 2/14/19 8:56 AM, Henkel, Andreas wrote:
>> Hi Marcus,
>>
>> More ideas:
>> A CPU doesn't always count as a core but may take the meaning of one thread,
>> hence makes
Hi Marcus,
More ideas:
A CPU doesn't always count as a core but may take the meaning of one thread, hence
makes different
Maybe the behavior of CR_ONE_TASK is still not solid nor properly documented,
and ntasks and ntasks-per-node are honored differently internally. If so, solely
using ntasks can mea
Hi Marcus,
What just came to my mind: if you don't set --ntasks, isn't the default just 1?
All examples I know that use --ntasks-per-node also set --ntasks, with ntasks >=
ntasks-per-node.
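For example, a typical header that sets both consistently (the numbers are arbitrary):

  #!/bin/bash
  #SBATCH --nodes=2
  #SBATCH --ntasks=8
  #SBATCH --ntasks-per-node=4
  srun hostname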
Best,
Andreas
> On 14.02.2019 at 06:33, Marcus Wagner wrote:
>
> Hi all,
>
> I have narrowed this down a litt
Hi Leon,
If the partition is defined to run jobs exclusively, you always get a full node.
You'll have to either split up your analysis into independent subtasks to be
run in parallel by dividing the data, or make use of a Perl parallelization
package like Parallel::ForkManager to run steps of
running for job although the job is running.
Any hint appreciated.
Best regards,
Andreas
--
Dr. Andreas Henkel
Operativer Leiter HPC
Zentrum für Datenverarbeitung
Johannes Gutenberg Universität
Anselm-Franz-von-Bentzelweg 12
55099 Mainz
Telefon: +49 6131 39 26434
OpenPGP Fingerprint: FEC6 287B
Thank you Chris.
This is what I assumed, since setting those variables for complicated
allocations may be just useless. Yet I wasn't sure if it was possible at all.
Best,
Andreas
> On 19.01.2019 at 08:39, Chris Samuel wrote:
>
>> On 18/1/19 3:18 am, Henkel wrote:
>>
figured to
behave differently.
Best regards,
Andreas
--
Dr. Andreas Henkel
Operativer Leiter HPC
Zentrum für Datenverarbeitung
Johannes Gutenberg Universität
Anselm-Franz-von-Bentzelweg 12
55099 Mainz
Telefon: +49 6131 39 26434
OpenPGP Fingerprint: FEC6 287B EFF3
7998 A141 03BA E2A9 089F
n't show up until the next release, but at least
> there is a fix available.
>
> Mike Robbert
>
>> On 1/15/19 11:43 PM, Henkel, Andreas wrote:
>> Bad news for the cgroup users, seems like the bug is "resolved" by the site
>> switching to task/Linux instea
Bad news for the cgroup users, seems like the bug is "resolved" by the site
switching to task/Linux instead :-(
> On 09.01.2019 at 22:06, Christopher Benjamin Coffey wrote:
>
> Thanks... looks like the bug should get some attention now that a paying site
> is complaining:
>
> https://bugs
Hi,
As far as I understand, salloc is used to make allocations but initiates a shell
(whatever SallocDefaultCommand specifies) on the node where you called salloc. If
you're looking for an interactive session you'll probably have to use srun
--pty xterm. This will allocate the resources AND initiate the command on an
allocated node.
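For a plain interactive shell instead of xterm, the usual pattern is something along these lines (the resource numbers are just an example):

  srun --ntasks=1 --cpus-per-task=4 --mem=4G --time=01:00:00 --pty bash -i

or, if you want salloc itself to land on a compute node, set SallocDefaultCommand in slurm.conf to a similar srun --pty line.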
Hi,
A SPANK plugin would probably work.
A job_submit plugin which replaces the nodelist with an empty string could work
as well.
What about just changing the .profile and setting the env variable for the
nodelist to an empty string?
"Note that environment variables will override any options set in a batch
scri
r 2018 7:16:58 PM AEDT Andreas Henkel wrote:
>
>> PS: sorry, I forgot to mention the Slurm version: it's 17.11.7
> It's always worth checking the NEWS file in git for changes after the release
> you're on in case it's since been fixed.
>
> https://github.com/
PS: sorry, I forgot to mention the Slurm version: it's 17.11.7
On 10/24/18 9:43 AM, Andreas Henkel wrote:
>
> HI all,
>
> did anyone build Slurm using a recent version of HWLOC like 2.0.1 or
> 2.0.2?
>
> When I try to I end up with
>
> task_cgroup_cpuset.c:486:40:
cts(obj->cpuset,
hwloc_topology_get_allowed_cpuset(topology))
Replace cpusets with nodesets for NUMA nodes. To find out which ones,
replace intersects() with and() to get the actual intersection.
at https://www.open-mpi.org/projects/hwloc/doc/v2.0.1/a00327.php
Yet in the source code of Slurm there are already some preprocessor
switches for HWLOC 2.
Any hints welcome.
Best,
Andreas Henkel
PS: we're using Slurm 17.11.5
On 21.03.2018 at 16:18, Henkel, Andreas <hen...@uni-mainz.de> wrote:
Hi,
Recently, while trying a new configuration, I came across a problem. In
principle, we have one big partition containing all nodes with PriorityTier=2.
Each account got Gr
eas
Dr. Andreas Henkel
COO HPC
Data Center
JGU Mainz
Anselm-Franz-von-Bentzelweg 12
55099 Mainz