Dear Loris: Many thanks for your response.

I changed State=IDLE to State=UNKNOWN in the NodeName configuration and then reloaded *slurmctld*, after which the two GPU nodes (gpu3 and gpu4) came up in drain mode. I have since manually returned them to the IDLE state.
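For reference, I did that with something along these lines (as I understand it, State=RESUME is what returns a drained node to service):

   scontrol update NodeName=gpu[3-4] State=RESUME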

But how do I change CoresPerSocket and ThreadsPerCore in the NodeName lines?
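I am guessing the gpu[3-4] line would need to look something like the sketch below, but the 2-socket x 16-core layout is only my assumption about that hardware, so please correct me if the parameters are wrong:

   NodeName=gpu[3-4] Sockets=2 CoresPerSocket=16 ThreadsPerCore=1 Gres=gpu:1 State=UNKNOWN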


Thanks & Regards,
Sudeep Narayan Banerjee

On 18/05/20 7:29 pm, Loris Bennett wrote:
Hi Sudeep,

I am not sure if this is the cause of the problem but in your slurm.conf
you have

   # COMPUTE NODES

   NodeName=node[1-10] Sockets=2 CoresPerSocket=8 ThreadsPerCore=1 Procs=16 RealMemory=60000 State=IDLE
   NodeName=gpu[1-2] CPUs=16 Gres=gpu:2 State=IDLE

   NodeName=node[11-22] Sockets=2 CoresPerSocket=16 ThreadsPerCore=1 Procs=32 State=IDLE
   NodeName=node[23-24] Sockets=2 CoresPerSocket=20 ThreadsPerCore=1 Procs=40 State=IDLE
   NodeName=gpu[3-4] CPUs=32 Gres=gpu:1 State=IDLE

But if you read

   man slurm.conf

you will find the following under the description of the parameter
"State" for nodes:

   "IDLE" should not be specified in the node configuration, but set the
   node state to "UNKNOWN" instead.
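So, as a minimal sketch with only the state changed (all other values copied from your config above), the lines would become, e.g.

   NodeName=node[1-10] Sockets=2 CoresPerSocket=8 ThreadsPerCore=1 Procs=16 RealMemory=60000 State=UNKNOWN
   NodeName=gpu[1-2] CPUs=16 Gres=gpu:2 State=UNKNOWN

and similarly for the remaining NodeName lines.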

Cheers,

Loris


Sudeep Narayan Banerjee <snbaner...@iitgn.ac.in> writes:

Dear Loris: I am very sorry for addressing you as Support; it has become a bad habit of mine, which I will change. Sincere apologies!

Yes, I tried this when adding the hybrid hardware architecture, but on running slurmctld it reported a mismatch in the core count; the existing 32-core nodes went into the Down/Draining (Drng) state, while the new 40-core nodes were set to IDLE.

Any help/guide to some link will be highly appreciated!

Thanks & Regards,
Sudeep Narayan Banerjee
System Analyst | Scientist B
Information System Technology Facility
Academic Block 5 | Room 110
Indian Institute of Technology Gandhinagar
Palaj, Gujarat 382355 INDIA
On 18/05/20 6:30 pm, Loris Bennett wrote:

  Dear Sudeep,

Sudeep Narayan Banerjee <snbaner...@iitgn.ac.in> writes:

  Dear Support,


This mailing list is not really the Slurm support list.  It is just the
Slurm User Community List, so basically a bunch of people just like you.

  Nodes node[11-22] have 2 sockets with 16 cores each, and node[23-24] have 2 sockets with 20 cores each. In the slurm.conf file (attached), can we merge all of nodes 11-24 (which have different core counts) into a single queue or partition name?


Yes, you can have a partition consisting of heterogeneous nodes.  Have
you tried this?  Was there a problem?
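As a rough sketch, assuming the node names from your slurm.conf (the partition name and the limits here are just placeholders), something along these lines should do it:

   PartitionName=batch Nodes=node[11-24] Default=YES MaxTime=INFINITE State=UP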

Cheers,

Loris
