this might be nothing, but i usually call --gres with an equals

srun --gres=gpu:k10:8

i'm not sure if the equals is optional or not
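
For what it's worth, here is a rough dry-run sketch of the request lines I would try. It only prints the srun commands rather than submitting anything. The helper name gres_cmd is made up for illustration, "hostname" is just a placeholder job, and "c2050" is the type name from the original post; I am assuming it matches the Type in gres.conf and the Gres entry in slurm.conf exactly. The last part runs only where sinfo is installed.

```shell
#!/bin/sh
# Dry-run sketch: prints srun command lines instead of submitting jobs.
# gres_cmd is a hypothetical helper, not part of Slurm.

gres_cmd() {
    # $1 = node count, $2 = GPU type ("" for any type), $3 = GPUs per node
    if [ -n "$2" ]; then
        spec="gpu:$2:$3"
    else
        spec="gpu:$3"
    fi
    # Use the "=" form so the option and its argument stay together.
    echo "srun -N $1 --gres=$spec hostname"
}

gres_cmd 1 c2050 1   # typed request on a single node
gres_cmd 2 "" 1      # untyped request; the --gres count applies to each node

# On a real cluster, compare the request with what the controller reports.
if command -v sinfo >/dev/null 2>&1; then
    sinfo -N -o "%N %G"   # %G lists the generic resources configured per node
fi
```

If the type shown by sinfo differs from the one in the request, that mismatch would be the first thing I'd look at.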



On Wed, Nov 16, 2016 at 4:34 AM, Dmitrij S. Kryzhevich <kryz...@ispms.ru> wrote:
>
> Hi,
>
> I have some issues with gres usage. I'm running Slurm version 16.05.4 on
> a small test setup with 4 nodes plus a master. The best way to describe it
> is to paste the configs:
> slurm.conf: http://paste.org.ru/?m8v7ca
> gres.conf: http://paste.org.ru/?ouspnz
> Both files are present on each node.
>
> The problem is the following:
>
> [dkryzhevich@gpu ~]$ srun -N 1 --gres gpu:c2050 <whatever>
> srun: error: Unable to allocate resources: Requested node configuration is not available
> [dkryzhevich@gpu ~]$
>
> Relevant logs: http://paste.org.ru/?mj4dfs
> Whatever I do with the --gres flag, the job just does not start. What am I
> missing here?
>
> I tried removing the Type column from gres.conf, and all nodes went into the
> "drain" state. I then also removed all details from the Gres column in
> slurm.conf (i.e. "NodeName=node2 Gres=gpu:1 CoresPerSocket=2
> ThreadsPerCore=2 State=UNKNOWN"), and the task was submitted. However, I want
> the ability to specify the type of card in case I really need it.
>
> And two small unrelated questions.
> 1. Is it possible to submit a job from any node, or only from the master?
> Maybe by starting a secondary slurmctld daemon on each node, I don't know.
> 2. Is it possible to start a job on two separate nodes with NVIDIA cards,
> with something like
> $ srun --gres gpu:2
> ? The point is to use 2-3-4 cards installed on different nodes, with some MPI
> connection between the threads.
>
> BR,
> Dmitrij
