perhaps by setting
CUDA_VISIBLE_DEVICES), or is there any additional config for that? Do I
really need that extra "GPU" partition that the vendor put in for any of
this, or is there a way to bind GRES resources to a particular partition in
such a way that simply launching jobs in that partition
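A minimal sketch of how that could look in slurm.conf/gres.conf (node name, CPU/memory figures and the device path below are assumptions, not taken from your files):

# slurm.conf -- assumed single node "node01" with one GPU
GresTypes=gpu
NodeName=node01 CPUs=64 RealMemory=128000 Gres=gpu:1 State=UNKNOWN
# one partition is enough; a separate "GPU" partition is optional
PartitionName=main Nodes=node01 Default=YES MaxTime=INFINITE State=UP

# gres.conf on node01 -- assumed NVIDIA device path
Name=gpu File=/dev/nvidia0

Jobs then request the card with --gres=gpu:1 (or --gpus=1) from any partition that includes the node, and Slurm exports CUDA_VISIBLE_DEVICES for the job; as far as I know, the extra "GPU" partition is only needed if you want separate limits or access control, not to make the GRES usable.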
Max Delbrück Center for Molecular Medicine in
> the Helmholtz Association / Charité – Universitätsmedizin Berlin
>
> Visiting Address: Invalidenstr. 80, 3rd Floor, Room 03 028, 10117 Berlin
> Postal Address: Chariteplatz 1, 10117 Berlin
>
> E-Mail: manuel.holtgr...@bihealth.de
> Phone: +49
likely some others I miss.
>
>
> MfG
> --
> Markus Kötter, +49 681 870832434
> 30159 Hannover, Lange Laube 6
> Helmholtz Center for Information Security
>
--
Analabha Roy
Assistant Professor
Department of Physics
<http://www.buruniv.ac.in/academics/department/physics>
Hi,
I've just finished setting up a single-node "cluster" with Slurm on Ubuntu
20.04. Infrastructural limitations prevent me from running it 24/7, and
it's only powered on during business hours.
Currently, I have a cron job running that hibernates that sole node before
closing time.
The hiberna
the nodes when needed.
>
> Cheers,
> Florian
> ------
> *From:* slurm-users on behalf of
> Analabha Roy
> *Sent:* Monday, 6 February 2023 18:21
> *To:* slurm-users@lists.schedmd.com
> *Subject:* [External] [slurm-users] Hibernating a whole cluster
>
> Hi,
>
> cronjob that hibernates/shuts it down will do so when there are no jobs
> running. At least in theory.
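For what it's worth, a rough sketch of the guard such a cron job could use (the script name, schedule and the systemctl call are my assumptions):

#!/bin/bash
# hibernate-if-idle.sh: hibernate only when Slurm has no running or pending jobs
if [ -z "$(squeue --noheader)" ]; then
    systemctl hibernate
fi

# example crontab entry (assumed 18:00 closing time, weekdays only):
# 0 18 * * 1-5 /usr/local/sbin/hibernate-if-idle.sh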
>
> Hope that helps.
>
> Sean
>
> ---
> Sean McGrath
> Senior Systems Administrator, IT Services
>
> --
> *From:* slurm-users
if needed.
>
> On 07/02/2023 13:14, Analabha Roy wrote:
> > Hi Sean,
> >
> > Thanks for your awesome suggestion! I'm going through the reservation
> > docs now. At first glance, it seems like a daily reservation would turn
> > down jobs that are too long.
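In case a concrete example helps while reading the docs, a recurring reservation along the lines being discussed might look like this (name, times and duration invented for illustration):

scontrol create reservation reservationname=nightly_off \
    starttime=17:30:00 duration=15:00:00 \
    flags=MAINT,DAILY nodes=ALL users=root

If I understand the behaviour correctly, jobs whose time limit would overlap the reservation are left pending until it ends rather than rejected outright.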
---
> Sean McGrath
> Senior Systems Administrator, IT Services
>
> --
> *From:* slurm-users on behalf of
> Analabha Roy
> *Sent:* Tuesday 7 February 2023 12:14
> *To:* Slurm User Community List
> *Subject:* Re: [slurm-users] [External] Hibernating a whole cluster
ent on the veracity and reliability of the AI's response.
AR
--
Analabha Roy
Assistant Professor
Department of Physics
<http://www.buruniv.ac.in/academics/department/physics>
The University of Burdwan <http://www.buruniv.ac.in/>
Golapbag Campus, Barddhaman 713104
West Bengal, India
'll have to fix afterwards), it does not directly impact users: their
> jobs will run and complete/fail regardless of slurmctld state. At most
> the users won't receive a completion mail and they will be billed less
> than expected.
>
> Diego
>
> On 10/02/2023 20:06, Analabha Roy wrote:
Hi,
Thanks for the advice. I already tried out mana, but at present it only
works with MPICH, not Open MPI, which is what I've set up via Ubuntu.
AR
On Sun, 19 Feb 2023, 02:10 Christopher Samuel, wrote:
> On 2/10/23 11:06 am, Analabha Roy wrote:
>
> > I'm having
set is for the GPU:
https://github.com/hariseldon99/buparamshavak/blob/main/shavak_root/etc/slurm-llnl/gres.conf
Thanks for your attention,
Regards,
AR
--
Analabha Roy
Assistant Professor
Department of Physics
<http://www.buruniv.ac.in/academics/department/physics>
The University of Burdwan <http://www.buruniv.ac.in/>
When I started with Slurm, I built the sbatch one small step at a time.
> Nodes, cores, memory, partition, mail, etc.
>
> It sounds like your config is very close but your problem may be in the
> submit script.
>
> Best of luck and welcome to slurm. It is very powerful with a huge
> community.
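As a concrete (entirely made-up) starting point, a minimal script exercising exactly those pieces one directive at a time:

#!/bin/bash
#SBATCH --job-name=test
#SBATCH --partition=main        # assumed partition name
#SBATCH --nodes=1
#SBATCH --ntasks=4
#SBATCH --mem=8G
#SBATCH --time=01:00:00
#SBATCH --mail-type=END,FAIL
#SBATCH --mail-user=user@example.org   # placeholder address

srun hostname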
> Once you are comfortable, I would urge you to use the
> NodeName/Partition descriptor format above and encourage your users to
> declare oversubscription in their jobs. It is a little more work up front
> but far easier than correcting scripts later.
>
>
> Doug
>
>
>
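For anyone following along, the NodeName/PartitionName style referred to above looks roughly like this in slurm.conf (hardware figures invented for illustration), with users then adding --oversubscribe to jobs that can share resources:

# describe the hardware explicitly rather than relying on defaults
NodeName=node01 Sockets=2 CoresPerSocket=16 ThreadsPerCore=2 RealMemory=256000 State=UNKNOWN
PartitionName=main Nodes=node01 Default=YES MaxTime=7-00:00:00 OverSubscribe=YES State=UP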
default. Without an
> estimated runtime to work with, the backfill scheduler is crippled. In an
> environment mixing single-thread and MPI jobs of various sizes, it is
> critical that jobs are honest in their requirements, providing Slurm the
> information needed to correctly assign resources.
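A sketch of both halves of that, with example values only: a partition default/ceiling in slurm.conf, and an explicit estimate in the job script:

# slurm.conf: jobs that omit --time inherit DefaultTime, capped at MaxTime
PartitionName=main Nodes=node01 DefaultTime=01:00:00 MaxTime=3-00:00:00 State=UP

# job script: an honest estimate lets backfill slot the job in
#SBATCH --time=02:30:00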
100 CPUs running, according to this, but 60 according to
scontrol on the node??
The submission scripts are on pastebin:
https://pastebin.com/s21yXFH2
https://pastebin.com/C0uW0Aut
AR
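Two quick checks that might help localise the 100-vs-60 discrepancy (node name assumed):

# what the controller has configured and allocated on the node
scontrol show node node01 | grep -E 'CPUAlloc|CPUTot|CfgTRES|AllocTRES'

# total CPUs requested by currently running jobs
squeue -t RUNNING --noheader -o '%C' | awk '{s+=$1} END {print s}'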
> Doug
>
>
> On Sun, Feb 26, 2023 at 2:43 AM Analabha Roy
> wrote:
>
>> Hi Doug,
.
Thanks for all the help. It was awesome.
AR
> Doug
>
> On Sun, Feb 26, 2023 at 10:25 PM Analabha Roy
> wrote:
>
>> Hey,
>>
>>
>> Thanks for sticking with this.
>>
>> On Sun, 26 Feb 2023 at 23:43, Doug Meyer wrote:
>>
>>> Hi