[slurm-users] RES: RES: How to delay the start of slurmd until Infiniband/OPA network is fully up?

2023-11-01 Thread Paulo Jose Braga Estrela
PUBLIC -Original message- From: slurm-users On behalf of Ole Holm Nielsen Sent: Wednesday, November 1, 2023 05:19 To: slurm-users@lists.schedmd.com Subject: Re: [slurm-users] RES: How to delay the start of slurmd until Infiniband/OPA network is fully up? Hi Paulo,

[slurm-users] RES: RES: multiple srun commands in the same SLURM script

2023-11-01 Thread Paulo Jose Braga Estrela
...@schedmd.com Subject: Re: [slurm-users] RES: multiple srun commands in the same SLURM script Paulo Jose Braga Estrela writes: > Hi, > > I think that you have a syntax error in your bash script. The "&" > means that you want to send a process to background not t

[slurm-users] RES: multiple srun commands in the same SLURM script

2023-10-31 Thread Paulo Jose Braga Estrela
Hi, I think that you have a syntax error in your bash script. The "&" means that you want to send a process to the background, not that you want to run many commands in parallel. To run commands serially, you should use cmd && cmd2; cmd2 will then only be executed if command 1 retu
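The difference between "&" and "&&" described in this message can be sketched in plain bash. This is a minimal illustration, not the poster's script: `step` is a hypothetical stand-in for a real command such as an srun step, and the `wait` builtin is added here to show how a script can block until its background commands have finished.

```shell
#!/usr/bin/env bash
# Hypothetical helper standing in for a real workload (e.g. an srun step).
step() { echo "$1 done"; }

# Serial: the second step starts only if the first exits with status 0.
serial_out=$(step first && step second)

# Short-circuit: `false` exits non-zero, so the assignment after && never runs.
ran=no
false && ran=yes

# Background: each `&` launches a step without waiting for it to finish;
# `wait` then blocks until all background steps have completed.
# The output order of background steps is not guaranteed, hence the sort.
parallel_out=$( { step a & step b & wait; } | sort )

printf '%s\n' "$serial_out" "ran=$ran" "$parallel_out"
```

In a batch script, commands launched with "&" and followed by `wait` can run concurrently, while "&&" chains them one after another and stops at the first failure.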

[slurm-users] RES: RES: Change something in user's script using job_submit.lua plugin

2023-10-31 Thread Paulo Jose Braga Estrela
ado, 28 October 2023 03:48 To: Slurm User Community List Cc: Paulo Jose Braga Estrela Subject: Re: RES: [slurm-users] Change something in user's script using job_submit.lua plugin Hi Paulo, Maybe what you see is due to a bug then? You might try to update Slurm to see if it has been fix

[slurm-users] RES: How to delay the start of slurmd until Infiniband/OPA network is fully up?

2023-10-31 Thread Paulo Jose Braga Estrela
I think that you should use NetworkManager-wait-online.service in RHEL 8. Take a look at its man page. It only lets the system reach network-online after all network interfaces are online. So, if your OPA interfaces are managed by NetworkManager, you can use it. PUBLIC -Original message
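One way to act on this suggestion is a systemd drop-in that orders slurmd after network-online.target. The sketch below is an assumption-laden illustration rather than a tested recipe: the drop-in file name `wait-online.conf` is hypothetical, and the script writes to a scratch directory by default so it can be run without root. On a real node the directory would be /etc/systemd/system/slurmd.service.d.

```shell
#!/usr/bin/env bash
# Assumption: DROPIN_DIR is a hypothetical override variable. On a real node
# it would point to /etc/systemd/system/slurmd.service.d (root required);
# by default a scratch directory is used so nothing on the system changes.
dropin_dir="${DROPIN_DIR:-$(mktemp -d)/slurmd.service.d}"
mkdir -p "$dropin_dir"

# Order slurmd after network-online.target and pull that target in.
cat > "$dropin_dir/wait-online.conf" <<'EOF'
[Unit]
Wants=network-online.target
After=network-online.target
EOF

cat "$dropin_dir/wait-online.conf"

# On a real node (as root), one would then typically run:
#   systemctl enable NetworkManager-wait-online.service
#   systemctl daemon-reload
```

Whether this is enough depends on the fabric interfaces actually being managed by NetworkManager, as the message notes. The slurmd unit shipped with a given Slurm version may already contain this ordering, so it is worth checking with `systemctl cat slurmd` first.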

[slurm-users] RES: Change something in user's script using job_submit.lua plugin

2023-10-27 Thread Paulo Jose Braga Estrela
sing job_submit.lua plugin Hi Paulo, Which Slurm version do you have, and did you set this in slurm.conf: JobSubmitPlugins=lua ? Perhaps you may find some useful information in this Wiki page: https://wiki.fysik.dtu.dk/Niflheim_system/Slurm_configuration/#job-submit-plugins /Ole On 26-10-20

[slurm-users] Change something in user's script using job_submit.lua plugin

2023-10-26 Thread Paulo Jose Braga Estrela
Hi, Is it possible to change something in a user's sbatch script by using a job_submit plugin? To be more specific, using the Lua job_submit plugin. I'm trying to do the following in job_submit.lua when a user changes the job's partition to the "cloud" partition, but the script got executed without modifica

Re: [slurm-users] 2 nodes being randomly set to "not responding"

2021-07-21 Thread jose
Hi, you most likely want to set it the exact opposite way, as the Slurm cloud scheduling guide says: "TreeWidth Since the slurmd daemons are not aware of the network addresses of other nodes in the cloud, the slurmd daemons on each node should be sent messages directly and not forward those me

Re: [slurm-users] SLURM in Virtual Machine

2019-09-12 Thread Jose A.
that make it worthwhile. So long as you allocate enough resources for the node (be it the controller or other) you will be fine. Brian Andrus On 9/12/2019 7:23 AM, Jose A wrote: > Dear all, > > In the expansion of our Cluster we are considering to install SLURM within a virtual machine

[slurm-users] SLURM in Virtual Machine

2019-09-12 Thread Jose A
Dear all, In the expansion of our cluster we are considering installing SLURM within a virtual machine in order to simplify updates and reconfigurations. Do any of you have experience running SLURM in VMs? I would really appreciate it if you could share your ideas and experiences. Thanks a lo

Re: [slurm-users] Changing node weights in partitions

2019-03-26 Thread Jose A
different nodes prioritize different types of jobs. Is that, especially step 4, possible? Thanks for the help. José > On 24. Mar 2019, at 21:52, Ole Holm Nielsen > wrote: > > Hi José, > >> On 23-03-2019 19:59, Jose A wrote: >> You got my point. I want a way in w

Re: [slurm-users] Changing node weights in partitions

2019-03-23 Thread Jose A
Hello Chris, You got my point. I want a way in which a partition influences the priority with which a node takes new jobs. Any tip will be really appreciated. Thanks a lot. Cheers, José > On 23. Mar 2019, at 03:38, Chris Samuel wrote: > >> On 22/3/19 12:51 pm, Ole Holm Nielsen wrote: >> >> The

Re: [slurm-users] Changing node weights in partitions

2019-03-22 Thread Jose A
Dear Ole, Thanks for your fast reply. I really appreciate that. I had a look at your website and googled “weight masks” but still have some questions. From your example I see that the mask definition is commented out. How do I define what the mask means? If it helps, I’ll put an easy examp