[slurm-users] Issue with selecting CPUs for optimization

2024-06-10 Thread Purvesh Parmar via slurm-users
Hi, We have a 16-node cluster with DGX-A100 (80 GB). We have 128 cores of each node separated into a dedicated partition for CPU-only jobs, and 8 GPUs and 128 cores in other partitions for CPU+GPU jobs. We want to ensure that only the selected 128 cores are part of the CPU partition. (NUMA / Symm
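
A minimal slurm.conf sketch of one common approach (node names, core counts and memory below are assumptions, not from the post): overlap both partitions on the same nodes and cap the CPU-only partition with MaxCPUsPerNode. Note that Slurm caps how many cores a partition may allocate per node, not which core IDs; NUMA placement is handled by task affinity/cgroups at allocation time.

    # slurm.conf (sketch; dgx[01-16] and the counts are placeholders)
    NodeName=dgx[01-16] Sockets=2 CoresPerSocket=64 ThreadsPerCore=2 Gres=gpu:a100:8 RealMemory=1000000
    PartitionName=cpu    Nodes=dgx[01-16] MaxCPUsPerNode=128   # CPU-only jobs
    PartitionName=cpugpu Nodes=dgx[01-16] MaxCPUsPerNode=128   # CPU+GPU jobs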

[slurm-users] slurm bank utility

2023-12-10 Thread Purvesh Parmar
Hi, We are using Slurm 21.08. We would like to know how to use the "sbank" utility for crediting GPU hours, just like CPU minutes, and also to get the status of GPU hours credited, used, etc. The sbank utility from GitHub does not have functionality for adding / querying GPU hours. Any other mean
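
Since sbank only wraps CPU minutes, a hedged alternative (the account name "myacct" is a placeholder) is to manage GPU minutes directly through the same sacctmgr association limits that sbank scripts wrap for CPU:

    # Credit 60000 GPU hours = 3600000 GPU minutes to an account
    sacctmgr modify account myacct set GrpTRESMins=gres/gpu=3600000
    # Show the configured limit
    sacctmgr show assoc account=myacct format=Account,User,GrpTRESMins
    # Show GPU hours actually consumed in a window
    sreport cluster AccountUtilizationByUser account=myacct -T gres/gpu -t hours start=2023-11-01 end=2023-12-01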

Re: [slurm-users] Distribute a single node's resources across multiple partitions

2023-07-06 Thread Purvesh Parmar
Hi, Do I need to run separate slurmctld and slurmd daemons for this? I am struggling with this. Any pointers? -- Purvesh On Mon, 26 Jun 2023 at 12:15, Purvesh Parmar wrote: > Hi, > > I have Slurm 20.11 in a cluster of 4 nodes, with each node having 16 cpus. > I want to create two parti

[slurm-users] Distribute a single node's resources across multiple partitions

2023-06-25 Thread Purvesh Parmar
Hi, I have Slurm 20.11 in a cluster of 4 nodes, with each node having 16 CPUs. I want to create two partitions (ppart and cpart) such that 8 cores from each of the 4 nodes are part of ppart and the remaining 8 cores are part of cpart; that is, I want to distribute each node'
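
A minimal sketch of the usual answer (node names assumed): one slurmctld and one slurmd per node suffice; both partitions list all nodes, and MaxCPUsPerNode restricts each partition to 8 of the 16 cores. Slurm limits the count per node, not which specific core IDs each partition gets.

    # slurm.conf (sketch)
    NodeName=node[1-4] CPUs=16 RealMemory=64000
    PartitionName=ppart Nodes=node[1-4] MaxCPUsPerNode=8
    PartitionName=cpart Nodes=node[1-4] MaxCPUsPerNode=8 Default=YES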

[slurm-users] partial partition utilization

2023-05-21 Thread Purvesh Parmar
Hi, We have Slurm 21.04 and have 8 nodes in the job_submit.lua partition (with 2 GPUs per node). I want to calculate the utilization of only 4 specific nodes out of the 8 in the partition over the last 15 days. How do I do it? Regards, Purvesh P
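
sreport cannot filter on a node list, but sacct can; a hedged sketch (node names are placeholders) that pulls the raw numbers for the last 15 days:

    # Jobs that touched the four nodes in the window, one line per job
    sacct -a -X -N node[5-8] -S $(date -d '15 days ago' +%F) -E now \
          --parsable2 -o JobID,AllocTRES%60,ElapsedRaw,NodeList
    # Utilization ~= sum(AllocCPUs * ElapsedRaw) / (4 nodes * cores/node * window seconds)
    # Caveat: jobs spanning nodes inside and outside the set are counted whole.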

Re: [slurm-users] Migration of slurm communication network / Steps / how to

2023-04-23 Thread Purvesh Parmar
Thank you, I will try this and get back. Is there any other step being missed here for the migration? Thank you, Purvesh On Mon, 24 Apr 2023 at 12:08, Ole Holm Nielsen wrote: > On 4/24/23 08:09, Purvesh Parmar wrote: > > thank you, however, because this is a change in the data center, the names &

Re: [slurm-users] Migration of slurm communication network / Steps / how to

2023-04-23 Thread Purvesh Parmar
at 11:25, Ole Holm Nielsen wrote: > On 4/24/23 06:58, Purvesh Parmar wrote: > > thank you, but it's a change of hostnames as well, apart from the IP addresses > > of the slurm server, the database server name and the slurmd compute > > nodes. > > I suggest that

Re: [slurm-users] Migration of slurm communication network / Steps / how to

2023-04-23 Thread Purvesh Parmar
ut I think that > updates itself. > > The names of the servers are in slurm.conf, but again, if the names don’t > change, that won’t matter. If you have IPs there, you will need to change > them. > > Sent from my iPhone > > > On Apr 23, 2023, at 14:01, Purvesh Parmar > wro

[slurm-users] Migration of slurm communication network / Steps / how to

2023-04-23 Thread Purvesh Parmar
and slurmd on compute nodes. Please help and guide us on the above. Regards, Purvesh Parmar INHAIT
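
For reference, a hedged outline of the knobs that typically change when the controller and nodes get new names and addresses (all names below are placeholders), followed by a restart of every daemon:

    # slurm.conf (sketch)
    SlurmctldHost=newmaster
    AccountingStorageHost=newmaster
    NodeName=node[1-4] NodeHostname=newhost[1-4] NodeAddr=10.0.1.[1-4]
    # propagate slurm.conf to all nodes, then:
    systemctl restart slurmdbd slurmctld   # on the server
    systemctl restart slurmd               # on every compute node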

Re: [slurm-users] changing the operational network in slurm setup

2023-03-13 Thread Purvesh Parmar
; > Sent from my T-Mobile 4G LTE Device > > > > Original message > From: Purvesh Parmar > Date: 3/13/23 7:05 PM (GMT-08:00) > To: Slurm User Community List > Subject: Re: [slurm-users] changing the operational network in slurm setup > > CAUTION: This em

Re: [slurm-users] changing the operational network in slurm setup

2023-03-13 Thread Purvesh Parmar
to use the 10 GbE interface? > > > -----Original Message----- > From: Purvesh Parmar > Reply-To: Slurm User Community List > To: Slurm User Community List > Subject: [slurm-users] changing the operational network in slurm setup > Date: 03/13/2023 06:19:13 PM > > CA

[slurm-users] changing the operational network in slurm setup

2023-03-13 Thread Purvesh Parmar
Hi, We have Slurm 22.08 running on an Ethernet (1 GbE) network (slurmdbd, slurmctld and slurmd on compute nodes) on Ubuntu 20.04. We want to migrate the Slurm services to the 10 GbE network, which is present on all the nodes and on the master server as well. How do we proceed with this? Thanks, P. Parm
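
A hedged sketch of the usual approach (the 10.10.0.x addresses are placeholders): keep the hostnames, and point SlurmctldHost and each NodeAddr at the 10 GbE interface so all daemon traffic moves to that network:

    # slurm.conf (sketch)
    SlurmctldHost=master(10.10.0.1)
    NodeName=node[01-16] NodeAddr=10.10.0.[101-116]   # keep the existing CPU/memory fields
    # then restart slurmctld on the master and slurmd on every node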

[slurm-users] Can a job run across partitions in Slurm?

2022-09-08 Thread Purvesh Parmar
We need to run a single job that requires more nodes than are present in the HMEM partition. We have another partition, XEON. Can a user run a single job across both partitions? We are using Slurm 21. Thanks & Regards, Purvesh
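
For what it's worth, a single Slurm job cannot span partitions; sbatch does accept a comma-separated partition list, but the job then runs entirely inside whichever listed partition can start it first:

    # Runs in HMEM *or* XEON, never both at once
    sbatch --partition=HMEM,XEON -N 12 job.sh

To combine nodes from both groups in one job, they need a common partition, e.g. an overlapping partition containing both node sets.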

Re: [slurm-users] Epilog script does not execute

2022-07-18 Thread Purvesh Parmar
-test I have restarted slurmctld on the master and slurmd on the nodes. I have then tested jobs, but nothing executes after the job is over. Please help. Regards, Purvesh On Sat, 16 Jul 2022 at 12:37, Purvesh Parmar wrote: > Hi, > > I have written a shell script with name epilog-test. I h

[slurm-users] Epilog script does not execute

2022-07-16 Thread Purvesh Parmar
Hi, I have written a shell script named epilog-test. In the slurm.conf file I have set: Epilog=/var/slurm/etc/epilog-test The same slurm.conf file has been copied to all the nodes. My epilog-test is: #!/bin/bash echo "epilog test" > /tmp/testfile and I ran chmod +x epilog-test. I have restarte
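
A hedged version of the test script with the usual pitfalls addressed: the epilog runs on each allocated compute node (not the master) as SlurmdUser, so /tmp/testfile appears on the node itself; the file must be executable on every node, and a non-zero exit code drains the node:

    #!/bin/bash
    # Append, don't overwrite, and tag with the job id Slurm exports
    echo "epilog ran for job ${SLURM_JOB_ID} at $(date)" >> /tmp/epilog-test.log
    exit 0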

Re: [slurm-users] limit the queued jobs

2022-07-10 Thread Purvesh Parmar
, Ole Holm Nielsen wrote: > Hi Purvesh, > > On 7/11/22 03:37, Purvesh Parmar wrote: > > I want to limit the queued jobs per user to 5 which means, system should > > not allow more than 5 jobs per user to remain in queue (not running and > > waiting for resources) and only

[slurm-users] limit the queued jobs

2022-07-10 Thread Purvesh Parmar
Hi, I want to limit queued jobs per user to 5, meaning the system should not allow more than 5 jobs per user to remain in the queue (not running, waiting for resources), with only 4 jobs running at any given time. To summarize, I want to implement a policy of 4 jobs per user in the running state a
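
A hedged sketch using association limits (the user name is a placeholder): MaxJobs caps concurrently running jobs, and MaxSubmitJobs caps running plus pending together, so 4 running + 5 queued means:

    sacctmgr modify user someuser set MaxJobs=4 MaxSubmitJobs=9
    # or enforce it for everyone via a QOS:
    # sacctmgr add qos limited set MaxJobsPerUser=4 MaxSubmitJobsPerUser=9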

Re: [slurm-users] gpu utilization of a reserved node

2022-05-04 Thread Purvesh Parmar
allocated GPU hours, and also does not show them for a week's duration. sreport reservation utilization name=rizwan_res start=2022-03-28T10:00:00 end=2022-04-03T10:00:00 Please help. Regards, Purvesh On Sat, 30 Apr 2022 at 15:57, Purvesh Parmar wrote: > Hello, > > We have a node given to a g

[slurm-users] gpu utilization of a reserved node

2022-04-30 Thread Purvesh Parmar
Hello, We have given a group a node with 2 GPUs in dedicated mode by setting a 6-month reservation. We want to find the weekly GPU-hours utilization of that particular reserved node. The node is not in a separate partition. The command below does not help in showing the allocate
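
sreport reports CPU time unless told otherwise; a hedged variant of the same command that asks for the GPU TRES in hours (reservation name taken from the thread):

    sreport reservation utilization name=rizwan_res -T gres/gpu -t hours \
            start=2022-03-28T10:00:00 end=2022-04-03T10:00:00

Note that TRES other than CPU are only recorded if AccountingStorageTRES in slurm.conf includes gres/gpu.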

[slurm-users] Distribute the node resources across multiple partitions, and a question regarding the job submission script

2022-04-12 Thread Purvesh Parmar
Hello, I am using Slurm 21.08. I am stuck on the following. Q1: I have 8 nodes with 2 GPUs each, 128 cores and 512 GB RAM. I want to distribute each node's resources across 2 partitions so that the "par1" partition will have 2 GPUs with 64 cores and 256 GB RAM of the node and the other partition
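
A hedged sketch for Q1 (node names are placeholders): CPUs and memory can be capped per partition, but slurm.conf has no per-partition GRES cap, so keeping the GPUs out of "par2" is commonly done with a partition QOS that zeroes the GPU TRES:

    # slurm.conf (sketch)
    NodeName=gnode[1-8] CPUs=128 RealMemory=512000 Gres=gpu:2
    PartitionName=par1 Nodes=gnode[1-8] MaxCPUsPerNode=64 MaxMemPerNode=256000
    PartitionName=par2 Nodes=gnode[1-8] MaxCPUsPerNode=64 MaxMemPerNode=256000 QOS=nogpu
    # one-time QOS setup:
    sacctmgr add qos nogpu set MaxTRESPerUser=gres/gpu=0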

[slurm-users] Distribute the node resources across multiple partitions, and a question regarding the job submission script

2022-04-10 Thread Purvesh Parmar
Hello, I have been using Slurm 21.08. Q1: I have 8 nodes with 2 GPUs each, 128 cores and 512 GB RAM. I want to distribute the node resources across 2 partitions so that the "par1" partition will have 2 GPUs with 64 cores and 256 GB RAM of the node and the other partition "par 2" will have the remain