On Wed, 1 Mar 2023 at 07:51, Doug Meyer wrote:
> Hi,
>
> One thing I forgot to mention: when you change the node descriptors and
> partitions you also have to restart slurmctld. scontrol reconfigure works
> for the nodes, but the main daemon has to be told to reread the config.
> Until you restart the daemon it will be referencing the config from
> before the change.
Hi,
One thing I forgot to mention: when you change the node descriptors and
partitions you also have to restart slurmctld. scontrol reconfigure works
for the nodes, but the main daemon has to be told to reread the config.
Until you restart the daemon it will be referencing the config from before
the change.
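For concreteness, a sketch of that sequence, assuming slurmctld is managed
as a systemd service named slurmctld (adjust for your init system):

$ scontrol reconfigure               # nodes (slurmd) pick up the edited slurm.conf
$ sudo systemctl restart slurmctld   # controller rereads node/partition definitions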
Hey,
Thanks for sticking with this.
On Sun, 26 Feb 2023 at 23:43, Doug Meyer wrote:
> Hi,
>
> Suggest removing "Boards=1". The docs say to include it, but in previous
> discussions with SchedMD we were advised to remove it.
>
>
I just did. Then ran scontrol reconfigure.
> When you are running jobs, execute "scontrol show node <nodename>" and
> look at the CfgTRES and AllocTRES lines.
Hi,
Suggest removing "Boards=1". The docs say to include it, but in previous
discussions with SchedMD we were advised to remove it.
When you are running jobs, execute "scontrol show node <nodename>" and look
at the CfgTRES and AllocTRES lines. The former is what the maitre d'
believes is available, the latter what has actually been allocated.
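For illustration, a trimmed "scontrol show node" output with those two
lines; the node name and numbers here are invented:

$ scontrol show node node001
NodeName=node001 CPUAlloc=32 CPUTot=64 ...
   CfgTRES=cpu=64,mem=512000M,billing=64
   AllocTRES=cpu=32,mem=64000M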
Hi Doug,
Again, many thanks for your detailed response.
Based on my understanding of your previous note, I did the following:
I set the NodeName with CPUs=64 Boards=1 SocketsPerBoard=2
CoresPerSocket=16 ThreadsPerCore=2 and the partitions with
OverSubscribe=FORCE:2, then I put further restrictions
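For reference, the slurm.conf lines for that setup would look roughly like
this (the node and partition names are placeholders):

NodeName=node001 CPUs=64 Boards=1 SocketsPerBoard=2 CoresPerSocket=16 ThreadsPerCore=2
PartitionName=main Nodes=node001 OverSubscribe=FORCE:2 State=UP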
Hi,
You got me; I didn't know that "OverSubscribe=FORCE:2" is an option. I'll
need to explore that.
I missed the question about srun. srun is the preferred method, I believe.
I am not involved in drafting the submit scripts, but can ask my peer. You
do need to stipulate the number of cores you want.
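As an illustration of stipulating the core count explicitly (the program
name and counts are made up):

$ srun --ntasks=1 --cpus-per-task=4 ./my_program

or, in a batch script:

#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4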
Hi,
Thanks for your considered response. Couple of questions linger...
On Sat, 25 Feb 2023 at 21:46, Doug Meyer wrote:
> Hi,
>
> Declaring cores=64 will absolutely work, but if you start running MPI
> you'll want a more detailed config description. The easy way to read it is
> "128 = 2 sockets * 32 cores per socket * 2 threads per core".
Hi,
Declaring cores=64 will absolutely work, but if you start running MPI
you'll want a more detailed config description. The easy way to read it is
"128 = 2 sockets * 32 cores per socket * 2 threads per core".
NodeName=hpc[306-308] CPUs=128 Sockets=2 CoresPerSocket=32 ThreadsPerCore=2
RealMemory=512
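With the topology spelled out like that, an MPI launch can be pinned to
physical cores; a hypothetical example (program name and counts are
illustrative):

$ srun --nodes=1 --ntasks=64 --hint=nomultithread ./mpi_app   # one rank per physical core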
Howdy, and thanks for the warm welcome,
On Fri, 24 Feb 2023 at 07:31, Doug Meyer wrote:
> Hi,
>
> Did you configure your node definition with the output of slurmd -C?
> Ignore boards. I don't know if it is still true, but several years ago
> declaring boards made things difficult.
>
>
$ slurmd -C
Hi,
Did you configure your node definition with the output of slurmd -C?
Ignore boards. I don't know if it is still true, but several years ago
declaring boards made things difficult.
Also, if you have hyperthreaded AMD or Intel processors, your partition
declaration should be oversubscribe:2.
Start w
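For reference, slurmd -C prints a node line that can be pasted into
slurm.conf; a hypothetical hyperthreaded dual-socket box might report
something like:

$ slurmd -C
NodeName=node001 CPUs=64 Boards=1 SocketsPerBoard=2 CoresPerSocket=16 ThreadsPerCore=2 RealMemory=257500
UpTime=1-02:03:04

Per the advice above, the Boards/SocketsPerBoard part would then be dropped
(using plain Sockets=2 instead) when copying the line into slurm.conf.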
Hi folks,
I have a single-node "cluster" running Ubuntu 20.04 LTS with the
distribution packages for Slurm (slurm-wlm 19.05.5).
Slurm only ran one job on the node at a time with the default
configuration, leaving all other jobs pending.
This happened even if that one job only requested a few cores.
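A quick way to see why the extra jobs sit pending, using standard commands
(the job ID below is hypothetical):

$ squeue -o "%.8i %.9P %.8T %R"        # state and reason for every job
$ scontrol show job 1234 | grep -E 'JobState|NumCPUs'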