Hi Steven,

Yes, you have the syntax a bit wrong. If you consult the documentation (or the man page) for slurm.conf, you will find this in the "NODE CONFIGURATION" section (in the paragraph about "NodeName"):

 Note that if the short form of the hostname is not used, it may prevent
 use of hostlist expressions (the numeric portion in brackets must be at
 the end of the string)

So the respective part in your slurm.conf should be

 NodeName=node[1-7] ...

and you have to configure your name resolution (default search domain?) so that these short names resolve to IP addresses.
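
A quick way to check this on each machine is sketched below. It is only a sketch: the function name check_nodes is made up for illustration, and it assumes getent is available (it is on most glibc-based Linux systems). It asks the system resolver for each name and reports the ones that do not resolve.

```shell
# check_nodes NAME...: report whether each host name resolves via the
# system resolver (/etc/hosts, DNS, ... as configured in nsswitch.conf).
# "check_nodes" is a hypothetical helper name, not part of Slurm.
check_nodes() {
    for name in "$@"; do
        if getent hosts "$name" >/dev/null 2>&1; then
            echo "$name: ok"
        else
            echo "$name: NOT resolvable"
        fi
    done
}

# Example call for the short names in NodeName=node[1-7]:
# check_nodes node1 node2 node3 node4 node5 node6 node7
```

Run it on the controller as well as on the compute nodes, since both sides need to resolve the names.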

If that's not feasible, you might have to use something like this:

 NodeName=DEFAULT CPUs=20 RealMemory=48
 NodeName=node1.ods.vuw.ac.nz
 NodeName=node2.ods.vuw.ac.nz
 ...
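
If you go that route, you don't have to type the lines by hand. A minimal sketch, assuming your nodes are numbered consecutively and all share one domain (gen_nodenames is a made-up helper name, not a Slurm tool):

```shell
# gen_nodenames DOMAIN FIRST LAST: print one NodeName line per node,
# using the fully qualified name node<N>.<DOMAIN>.
gen_nodenames() {
    domain=$1
    i=$2
    last=$3
    while [ "$i" -le "$last" ]; do
        printf 'NodeName=node%s.%s\n' "$i" "$domain"
        i=$((i + 1))
    done
}

# Example: gen_nodenames ods.vuw.ac.nz 1 7
```

The output can be pasted below the NodeName=DEFAULT line in slurm.conf.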


BTW: do your nodes only have 48 MB of memory? The unit in which "RealMemory" has to be specified is megabytes.
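
To get a sensible value, "slurmd -C" on a compute node prints the hardware configuration it detects in slurm.conf format, including RealMemory. If you want to cross-check by hand, here is a small sketch that converts MemTotal from /proc/meminfo (which is reported in kB) to megabytes. The helper name mem_total_mb is hypothetical, and the optional file argument exists only to make it easy to try with sample data.

```shell
# mem_total_mb [FILE]: print MemTotal in megabytes from a meminfo-style
# file (default: /proc/meminfo, where the value is given in kB).
# "mem_total_mb" is a hypothetical helper name, not part of Slurm.
mem_total_mb() {
    awk '/^MemTotal:/ { print int($2 / 1024) }' "${1:-/proc/meminfo}"
}

# Example: mem_total_mb
# More direct, on a node with Slurm installed: slurmd -C
```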

Regards,
Hermann





On 12/4/24 01:47, Steven Jones via slurm-users wrote:
I guess I have the syntax wrong,

[root@node1 slurm]# /usr/sbin/slurmd -D
slurmd: fatal: Unable to create NodeAddr list from node[1-7].ods.vuw.ac.nz
[root@node1 slurm]# tail /etc/slurm/slurm.conf
#ResumeRate=
#SuspendExcNodes=
#SuspendExcParts=
#SuspendRate=
#SuspendTime=
#
#
# COMPUTE NODES
NodeName=node[1-7].ods.vuw.ac.nz CPUs=20 RealMemory=48 State=UNKNOWN
PartitionName=debug Nodes=ALL Default=YES MaxTime=INFINITE State=UP
[root@node1 slurm]#


regards

Steven


------------------------------------------------------------------------
*From:* Steven Jones via slurm-users <slurm-users@lists.schedmd.com>
*Sent:* Wednesday, 4 December 2024 1:28 pm
*To:* slurm-us...@schedmd.com <slurm-us...@schedmd.com>
*Subject:* [slurm-users] Re: Slurm not running on a warewulf node
Well that is a start, TY.

[root@node1 slurm]# /usr/sbin/slurmd -D
slurmd: fatal: Unable to determine this slurmd's NodeName

Where is this set?

regards

Steven


------------------------------------------------------------------------
*From:* Jeffrey R. Lang <jrl...@uwyo.edu>
*Sent:* Wednesday, 4 December 2024 1:17 pm
*To:* Steven Jones <steven.jo...@vuw.ac.nz>; slurm-us...@schedmd.com <slurm-us...@schedmd.com>
*Subject:* RE: Slurm not running on a warewulf node
        

Steve

  Try running the failing process from the command line with the -D option.

Per man page: Run slurmd in the foreground. Error and debug messages will be copied to stderr.

*Jeffrey R. Lang*

Advanced Research Computing Center

University of Wyoming, Information Technology Center

1000 E. University Ave

Laramie,  WY 82071

Email: jrl...@uwyo.edu

Work: 307.766.3381

*From:* Steven Jones via slurm-users <slurm-users@lists.schedmd.com>
*Sent:* Tuesday, December 3, 2024 5:39 PM
*To:* slurm-us...@schedmd.com
*Subject:* [slurm-users] slurm not running on a warewulf node

Hi,

I have set the log location in slurm.conf as:

SlurmdLogFile=/var/log/slurm/slurmd.log

But it is 0 length.

Slurm will not run. What else do I need to do to log why it's failing, please?

regards

Steven




--
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-le...@lists.schedmd.com
