Re: [slurm-users] Several slurmdbds against one mysql server?

2023-05-01 Thread Angel de Vicente
Hello,

This is the first time that I'm installing Slurm, so things are not very
clear to me yet (even more so for multi-cluster operation).

Brian Andrus writes:

> You can do it however you like. You asked if there was a good or existing way
> to do it easily, that was provided. Up to you if you want to write your own
> scripts that do the work and manage that, or just have to learn the ins and
> outs of running sreport.

I'm not sure what scripts you have in mind above, since as far as I can
see I already have a working solution for what I need (i.e. keep all job
records from different clusters in a single database).

But let's say I go for the federated cluster option. I think my question
still holds. Let's say, for clarity, that I have two clusters (CA and
CB) and another machine (DB) where I will store the mysql database. As
far as I can see, in terms of the daemons running in each machine, I can
implement the whole thing in two ways:

option 1)
  CA: slurmd, slurmctld (AccountingStorageHost: DB)
  CB: slurmd, slurmctld (AccountingStorageHost: DB)
  DB: slurmdbd, mysqld 

option 2)
  CA: slurmd, slurmctld, slurmdbd (StorageHost: DB)
  CB: slurmd, slurmctld, slurmdbd (StorageHost: DB)
  DB: mysqld
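
For concreteness, the two layouts above could be sketched as the
following configuration fragments. This is only a minimal, untested
sketch: the host names CA, CB and DB are the placeholders used above,
the cluster names "ca"/"cb" and the MySQL account "slurm" are
hypothetical, and the AccountingStorageHost value in option 2 is an
assumption (each slurmctld talking to its own local slurmdbd). Other
required settings (StoragePass, SlurmUser, authentication, etc.) are
left out.

```
# ===== Option 1: a single slurmdbd on DB, shared by both clusters =====

# slurm.conf on CA (on CB: ClusterName=cb, the rest identical)
ClusterName=ca
AccountingStorageType=accounting_storage/slurmdbd
AccountingStorageHost=DB        # host where slurmdbd runs

# slurmdbd.conf on DB
DbdHost=DB
StorageType=accounting_storage/mysql
StorageHost=localhost           # mysqld runs on the same machine
StorageLoc=slurm_acct_db        # database name
StorageUser=slurm               # hypothetical MySQL account

# ===== Option 2: one slurmdbd per cluster, all using mysqld on DB =====

# slurm.conf on CA (on CB: ClusterName=cb, AccountingStorageHost=CB)
ClusterName=ca
AccountingStorageType=accounting_storage/slurmdbd
AccountingStorageHost=CA        # assumption: the cluster's own slurmdbd

# slurmdbd.conf on CA (on CB: DbdHost=CB)
DbdHost=CA
StorageType=accounting_storage/mysql
StorageHost=DB                  # remote mysqld
StorageLoc=slurm_acct_db        # same database for both clusters
StorageUser=slurm               # hypothetical MySQL account
```

In both sketches the two clusters end up in the same database; the only
difference is whether one slurmdbd or two sit between the slurmctld
daemons and mysqld.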


By reading the documentation on multi-cluster and federated clusters I
think option 1) is the preferred way, but I am trying to understand why,
and what the pros/cons of each option are.

Thanks,
-- 
Ángel de Vicente
 Research Software Engineer (Supercomputing and BigData)
 Tel.: +34 922-605-747
 Web.: http://research.iac.es/proyecto/polmag/

 GPG: 0x8BDC390B69033F52




Re: [slurm-users] Several slurmdbds against one mysql server?

2023-05-01 Thread Ole Holm Nielsen

On 5/1/23 09:22, Angel de Vicente wrote:

> This is the first time that I'm installing Slurm, so things are not very
> clear to me yet (even more so for multi-cluster operation).
>
> Brian Andrus writes:
>
>> You can do it however you like. You asked if there was a good or existing way
>> to do it easily, that was provided. Up to you if you want to write your own
>> scripts that do the work and manage that, or just have to learn the ins and
>> outs of running sreport.
>
> I'm not sure what scripts you have in mind above, since as far as I can
> see I already have a working solution for what I need (i.e. keep all job
> records from different clusters in a single database).


If I read Brian's comments correctly, he's saying that Slurm already has a 
well-tested and documented solution for multi-cluster sites: Federated 
clusters.  You don't HAVE to use the solution that Slurm/SchedMD provides, 
but it will be the easy and well tested solution for you.


If you don't want to use federated clusters, you are free to do so.  But 
then you have to write *your own scripts* to implement your own ideas. 
Probably no-one can help you with your ideas, and you will have to develop 
everything by yourself from scratch (not an easy task if this is your 
first experience with Slurm).


I hope Brian's comments will help you select the best way forward.  The 
slurm-users list is generally helpful, also to new Slurm users.


/Ole







Re: [slurm-users] Several slurmdbds against one mysql server?

2023-05-01 Thread Angel de Vicente
Hello, 

Ole Holm Nielsen writes:

> If I read Brian's comments correctly, he's saying that Slurm already has a
> well-tested and documented solution for multi-cluster sites: Federated 
> clusters.

Thanks Ole. Don't get me wrong, I have nothing against using Federated
clusters, and I guess I will probably end up going for it, but my
question remains the same (as far as I understand, nothing changes in
that respect between a multi-cluster and a federated setting?): whether I
should run just one slurmdbd daemon or several.

Cheers,
-- 
Ángel de Vicente




Re: [slurm-users] Several slurmdbds against one mysql server?

2023-05-01 Thread Ole Holm Nielsen

Hi Angel,

On 5/1/23 11:28, Angel de Vicente wrote:

> Ole Holm Nielsen writes:
>
>> If I read Brian's comments correctly, he's saying that Slurm already has a
>> well-tested and documented solution for multi-cluster sites: Federated
>> clusters.
>
> Thanks Ole. Don't get me wrong, I have nothing against using Federated
> clusters, and I guess I will probably end up going for it, but my
> question remains the same (as far as I understand, nothing changes in
> that respect between a multi-cluster and a federated setting?): whether I
> should run just one slurmdbd daemon or several.

As Brian wrote:

> On a technical note: slurm keeps the detailed accounting data for each
> cluster in separate TABLES within a single database.


In the Federation page https://slurm.schedmd.com/federation.html it is 
implicitly assumed that the sacctmgr command talks only to a single 
slurmdbd instance.  It is not, however, explicitly stated as an answer to 
your question.


You can see in another presentation that there is only a *single* slurmdbd 
in a federated multi-cluster scenario: 
https://slurm.schedmd.com/SLUG18/slurm_overview.pdf

Look at slide 28 "Typical Enterprise Architecture".

/Ole



Re: [slurm-users] Several slurmdbds against one mysql server?

2023-05-01 Thread Angel de Vicente
Hello Ole,

Ole Holm Nielsen writes:

> As Brian wrote:
>
>> On a technical note: slurm keeps the detailed accounting data for each
>> cluster in separate TABLES within a single database.
>
> In the Federation page https://slurm.schedmd.com/federation.html
> it is implicitly assumed that the sacctmgr command talks only to a single
> slurmdbd instance.  It is not, however, explicitly stated as an answer to your
> question.

And hence my question: as I was saying in a previous mail, reading the
documentation I understand that this is the standard way to do it, but
right now I have it working the other way: in each cluster I have one
slurmdbd daemon that connects to a single mysqld daemon on a third
machine (option 2 from my question).

I have a single database with detailed accounting data for each cluster
in separate tables, and from each cluster I can query the whole database,
so as far as I can see everything is working fine, but it is implemented
differently from the standard approach.

I did it this way not because I wanted something special or outside of
the standard, but simply because it was not very clear to me from the
documentation which way to go, and this came naturally when implementing
it (maybe simply because I don't have Slurm installed on the database
machine). And I have no problem with changing the installation to a
single slurmdbd daemon if I need to.

But this being my first time, I just hope to learn whether this is really
a bad idea that is going to bite me in the near future when these machines
go into production, in which case I should change to the standard way, or,
more generally, whether someone has a clear idea of the pros/cons of both
ways.

Sorry if I'm being a pest with this.

Thanks,
-- 
Ángel de Vicente




Re: [slurm-users] Several slurmdbds against one mysql server?

2023-05-01 Thread Ole Holm Nielsen

On 5/1/23 12:08, Angel de Vicente wrote:

> Hello Ole,
>
> Ole Holm Nielsen writes:
>
>> As Brian wrote:
>>
>>> On a technical note: slurm keeps the detailed accounting data for each
>>> cluster in separate TABLES within a single database.
>>
>> In the Federation page https://slurm.schedmd.com/federation.html
>> it is implicitly assumed that the sacctmgr command talks only to a single
>> slurmdbd instance.  It is not, however, explicitly stated as an answer to your
>> question.
>
> And hence my question: as I was saying in a previous mail, reading the
> documentation I understand that this is the standard way to do it, but
> right now I have it working the other way: in each cluster I have one
> slurmdbd daemon that connects to a single mysqld daemon on a third
> machine (option 2 from my question).
>
> I have a single database with detailed accounting data for each cluster
> in separate tables, and from each cluster I can query the whole database,
> so as far as I can see everything is working fine, but it is implemented
> differently from the standard approach.
>
> I did it this way not because I wanted something special or outside of
> the standard, but simply because it was not very clear to me from the
> documentation which way to go, and this came naturally when implementing
> it (maybe simply because I don't have Slurm installed on the database
> machine). And I have no problem with changing the installation to a
> single slurmdbd daemon if I need to.
>
> But this being my first time, I just hope to learn whether this is really
> a bad idea that is going to bite me in the near future when these machines
> go into production, in which case I should change to the standard way, or,
> more generally, whether someone has a clear idea of the pros/cons of both
> ways.


If implementing Slurm for the first time, the slurm-users mailing list is 
probably the most helpful way to ask questions.  The official Slurm 
documentation is of course the place to start learning.  Some people have 
found my Slurm Wiki page helpful:

https://wiki.fysik.dtu.dk/Niflheim_system/SLURM/
However, I do not describe federated clusters because we don't use this 
aspect.


I also recommend SchedMD's paid support contracts, since they are the 
experts and give a fantastic service: https://www.schedmd.com/support.php


/Ole



Re: [slurm-users] Several slurmdbds against one mysql server?

2023-05-01 Thread Angel de Vicente
Hello,

Ole Holm Nielsen writes:

> Some people have found my Slurm Wiki page helpful:
> https://wiki.fysik.dtu.dk/Niflheim_system/SLURM/

me being one of them :-) [in my bookmarks your wiki sits next to the
official Slurm page]. This installation was done on Ubuntu machines,
so I could only use the information in your wiki to a certain extent,
but it is very clearly organized and very useful.

Thanks a lot,
-- 
Ángel de Vicente

