Hi
I'm not an expert, but is it possible that the currently running job is
consuming the whole node because it has been allocated all of the node's
memory (so the other 2 jobs have to wait until it finishes)?
Maybe try restricting the memory requested by each job?
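For example, something like the sketch below. The 4G limit and the script name are placeholders I made up, not values from your setup, and the helper only prints the sbatch command instead of submitting anything, so it is safe to try on a machine without Slurm. It relies on sbatch's --mem option, which sets the memory requested per node.

```shell
#!/bin/bash
# Build an sbatch command line that caps the memory a job requests.
# The helper name, the 4G value, and the script name are illustrative only.
build_sbatch_cmd() {
    local mem="$1" script="$2"
    # --mem sets the memory requested per node for the job
    printf 'sbatch --mem=%s %s\n' "$mem" "$script"
}

# Print the command rather than running it (dry run).
build_sbatch_cmd 4G some_script.sh
```

If the printed command looks right, you can run it directly, or add the equivalent "#SBATCH --mem=4G" line to the job script.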
Regards
On Thu, Jan 18, 20
Hi
I'm having a hard time figuring out the distribution of jobs between 2
clusters in a Slurm multi-cluster environment. The documentation says that
each job is submitted to the cluster that provides the earliest start time,
and once the task is submitted to a cluster, it can't be re-distributed to
an
Hi
slurmdbd produces the following error in the log file:
error: CONN:X Failed to unpack DBD_NODE_STATE message
I tried restarting it many times, but the error keeps coming back. I also
rebooted the machine, but it's still there.
Regards
--
Mohammed
to point gdb at the location of the source code, and then follow
> any of the gazillion tutorials around about gdb. If you are not familiar
> with gdb already, I strongly recommend that you start with some simpler
> program before attempting something as big as slurm.
>
> Have a great
Hi
I cloned the Slurm repository from GitHub (version 23.11) and tried to
configure it as follows:
configure --config-cache --prefix=/usr/slurm_vm_23.11
--sysconfdir=/etc/slurm_vm_23 --with-http-parser=/usr/ --with-yaml=/usr/
--with-jwt=/usr/ --with-mysql_config=/usr/bin --enable-debug
--with-pmix
Hi
Is it possible to debug "sbatch" itself when submitting a script? For
example, I want to debug the following command:
sbatch -Mall some_script.sh
I don't want to debug the "some_script.sh". I want to debug the "sbatch"
itself when submitting the "some_script.sh". I tried to use "gdb" but I'm
n
Hi
I have 2 slurm clusters: cluster A with 3 compute nodes, each node has 32
CPUs; Cluster B with 4 compute nodes, each node has 8 CPUs. I'm using slurm
multicluster on clusters A and B. I tried to run the NAS Parallel Benchmarks
(sp.A.x) on them. Initially, I tried to benchmark the execution time and
Hi
Maybe the Slurm REST API would be useful (https://slurm.schedmd.com/rest.html)?
But I think you will need to generate a token to be able to communicate with
the cluster.
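A rough sketch of what I mean, assuming JWT authentication is enabled on the cluster. The host, port, and API version in the URL are my guesses and will differ per site; the helper only prints the curl command, so nothing is actually sent. On a real cluster the token would normally come from "scontrol token", which prints a SLURM_JWT=... line.

```shell
#!/bin/bash
# Hypothetical endpoint: adjust host, port, and API version to your site.
REST_URL="http://localhost:6820/slurm/v0.0.39/ping"

# Print the curl command that would query slurmrestd with a JWT.
# The helper name is illustrative; the two headers are the ones
# slurmrestd expects for token authentication.
build_rest_cmd() {
    local user="$1" token="$2"
    printf 'curl -s -H "X-SLURM-USER-NAME: %s" -H "X-SLURM-USER-TOKEN: %s" %s\n' \
        "$user" "$token" "$REST_URL"
}

# Dry run: falls back to placeholders when no user or token is set.
build_rest_cmd "${USER:-someuser}" "${SLURM_JWT:-TOKEN_GOES_HERE}"
```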
Regards
On Sun, Aug 27, 2023, 8:20 AM Steven Swanson wrote:
> Can I submit jobs with a computer/docker container that is not part
Hi
I'm doing a simple benchmark to record the time for issuing an sbatch
command. The contents of the script are:
#!/bin/bash
# Extract local cluster name
IFS='= ' read -r _ local_clusterid <<< "$(scontrol show config | grep -i clustername)"
echo "Local cluster: $local_clusterid"
# Check input clusters
Hi
I work on 3 clusters: A, B, C. Clusters A and C each have 3 compute nodes
and a head node. In each of A and C, one of the 3 compute nodes has an old
GPU. All nodes, on all clusters, run Ubuntu 22.04 except for the 2 nodes
with a GPU (both of them run Ubuntu 18.04 to suit the old
GPU
Hi
I'm also trying to use the Slurm REST API. I wonder if the error about slurmdbd
has anything to do with it. Does slurmctld connect correctly to slurmdbd?
Regards
On Wed, Jun 28, 2023, 9:03 PM Brian Andrus wrote:
> Vlad,
>
> Actually, it looks like it is working. You are using v0.39 for the pars
ability you describe, but separate
> clusters is not one of them.
>
> Brian Andrus
> On 6/26/2023 6:11 AM, mohammed shambakey wrote:
>
> Hi
>
> Just out of interest, I wonder what the exact difference between slurm
> multi-cluster and federation (apart from unique jo
Hi
Just out of interest, I wonder what the exact difference between slurm
multi-cluster and federation (apart from unique job id, and federation
limitations) is. Usually, I use the "-Mall" option with multi-cluster.
Initially, I thought the federation would send tasks to more than one cluster
at onc
Hi
Is it possible to send Slurm REST API queries to a
multi-cluster/federation setup? I guess each request uses one (and only one)
JWT, so it is not possible, right?
Regards
--
Mohammed
t for the sview).
After that, sview is installed in the correct location.
Regards
On Sun, Apr 23, 2023 at 10:50 AM Ole Holm Nielsen <
ole.h.niel...@fysik.dtu.dk> wrote:
> On 23-04-2023 02:43, mohammed shambakey wrote:
> > I installed slurm 23.11.0-0rc1, and sview is not installed, despi
Hi
I installed slurm 23.11.0-0rc1, but sview was not installed, even though it
exists at /src/sview/sview. I can run it from that path but not from
/bin (because it does not exist there).
I tried just copying it to /bin, but it complained
about being just a wrapper.
I wonder if I'm missing something?
ich may be an NFS share or something like that.
> I would also strongly recommend not mixing Ubuntu releases in the same
> cluster.
>
> Reed
>
> On Apr 14, 2023, at 11:12 AM, mohammed shambakey
> wrote:
>
> Hi
>
> I'm new to slurm, and sorry if this is a repeated e
Hi
I'm new to slurm, and sorry if this is a repeated email. I have a cluster
at my work consisting of one head node, and 3 compute nodes. Ubuntu 22.04
is installed on the head node, and 2 compute nodes, whereas the third has
Ubuntu 18.04 (it is needed because it hosts an old M10 GPU).
I installed