Re: [slurm-users] Several slurmdbds against one mysql server?
Hello,

This is the first time that I'm installing Slurm, so things are not very clear to me yet (even more so for multi-cluster operation).

Brian Andrus writes:

> You can do it however you like. You asked if there was a good or existing way
> to do it easily, that was provided. Up to you if you want to write your own
> scripts that do the work and manage that, or just have to learn the ins and
> outs of running sreport.

I'm not sure what scripts you have in mind above, since as far as I can see I already have a working solution for what I need (i.e. keep all job records from different clusters in a single database).

But let's say I go for the federated cluster option. I think my question still holds. Let's say, for clarity, that I have two clusters (CA and CB) and another machine (DB) where I will store the mysql database. As far as I can see, in terms of the daemons running on each machine, I can implement the whole thing in two ways:

option 1)
  CA: slurmd, slurmctld (AccountingStorageHost: DB)
  CB: slurmd, slurmctld (AccountingStorageHost: DB)
  DB: slurmdbd, mysqld

option 2)
  CA: slurmd, slurmctld, slurmdbd (StorageHost: DB)
  CB: slurmd, slurmctld, slurmdbd (StorageHost: DB)
  DB: mysqld

By reading the documentation on multi-cluster and federated clusters I think option 1) is the preferred way, but I was just trying to understand why, and what the pros/cons of each option are.

Thanks,
--
Ángel de Vicente
Research Software Engineer (Supercomputing and BigData)
Tel.: +34 922-605-747
Web.: http://research.iac.es/proyecto/polmag/
GPG: 0x8BDC390B69033F52
Re: [slurm-users] Several slurmdbds against one mysql server?
On 5/1/23 09:22, Angel de Vicente wrote:
> This is the first time that I'm installing Slurm, so things are not very
> clear to me yet (even more so for multi-cluster operation).
>
> Brian Andrus writes:
>> You can do it however you like. You asked if there was a good or existing
>> way to do it easily, that was provided. Up to you if you want to write your
>> own scripts that do the work and manage that, or just have to learn the ins
>> and outs of running sreport.
>
> I'm not sure what scripts you have in mind above, since as far as I can see
> I already have a working solution for what I need (i.e. keep all job records
> from different clusters in a single database).

If I read Brian's comments correctly, he's saying that Slurm already has a well-tested and documented solution for multi-cluster sites: Federated clusters. You don't HAVE to use the solution that Slurm/SchedMD provides, but it will be the easy and well-tested solution for you.

If you don't want to use federated clusters, you are free to do so. But then you have to write *your own scripts* to implement your own ideas. Probably no one can help you with your ideas, and you will have to develop everything by yourself from scratch (not an easy task if this is your first experience with Slurm).

I hope Brian's comments will help you select the best way forward. The slurm-users list is generally helpful, also to new Slurm users.

/Ole

> But let's say I go for the federated cluster option. I think my question
> still holds. Let's say, for clarity, that I have two clusters (CA and CB)
> and another machine (DB) where I will store the mysql database.
> As far as I can see, in terms of the daemons running in each machine, I can
> implement the whole thing in two ways:
>
> option 1)
>   CA: slurmd, slurmctld (AccountingStorageHost: DB)
>   CB: slurmd, slurmctld (AccountingStorageHost: DB)
>   DB: slurmdbd, mysqld
>
> option 2)
>   CA: slurmd, slurmctld, slurmdbd (StorageHost: DB)
>   CB: slurmd, slurmctld, slurmdbd (StorageHost: DB)
>   DB: mysqld
>
> By reading the documentation on multi-cluster and federated clusters I think
> option 1) is the preferred way, but I was just trying to understand why and
> what are the pros/cons of each option.
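For illustration, option 1 (a single slurmdbd next to mysqld on DB) might be configured roughly as follows. This is a minimal sketch, not a tested setup: the host name `db`, the cluster names `ca` and `cb`, the database user, and the password placeholder are assumptions made up for this example; only the parameter names are taken from the slurm.conf and slurmdbd.conf documentation.

```conf
# slurm.conf on cluster CA (CB would differ only in ClusterName=cb)
ClusterName=ca
AccountingStorageType=accounting_storage/slurmdbd
AccountingStorageHost=db          # assumed host name of the DB machine

# slurmdbd.conf on DB (the only slurmdbd in this layout)
DbdHost=db
SlurmUser=slurm
StorageType=accounting_storage/mysql
StorageHost=localhost             # mysqld runs on the same machine as slurmdbd
StorageUser=slurm                 # assumed database user
StoragePass=CHANGE_ME             # placeholder, not a real credential
StorageLoc=slurm_acct_db
```

In this layout both slurmctld daemons report to the same slurmdbd, which in turn means the DB machine needs the slurmdbd package installed.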
Re: [slurm-users] Several slurmdbds against one mysql server?
Hello,

Ole Holm Nielsen writes:

> If I read Brian's comments correctly, he's saying that Slurm already has a
> well-tested and documented solution for multi-cluster sites: Federated
> clusters.

Thanks Ole. Don't get me wrong, I have nothing against using Federated clusters, and I guess I will probably end up going for it, but my question remains the same (as far as I understand, nothing changes in that respect between a multi-cluster and a federated setting?): whether I should run just one slurmdbd daemon or several.

Cheers,
--
Ángel de Vicente
Research Software Engineer (Supercomputing and BigData)
Tel.: +34 922-605-747
Web.: http://research.iac.es/proyecto/polmag/
GPG: 0x8BDC390B69033F52
Re: [slurm-users] Several slurmdbds against one mysql server?
Hi Angel,

On 5/1/23 11:28, Angel de Vicente wrote:
> Ole Holm Nielsen writes:
>> If I read Brian's comments correctly, he's saying that Slurm already has a
>> well-tested and documented solution for multi-cluster sites: Federated
>> clusters.
>
> Thanks Ole. Don't get me wrong, I have nothing against using Federated
> clusters, and I guess I will probably end up going for it, but my question
> remains the same (as far as I understand nothing changes in that respect
> with multi-cluster or federated setting?): whether I should just run one
> slurmdbd daemon or several.

As Brian wrote:

> On a technical note: slurm keeps the detailed accounting data for each
> cluster in separate TABLES within a single database.

In the Federation page https://slurm.schedmd.com/federation.html it is implicitly assumed that the sacctmgr command talks only to a single slurmdbd instance. It is not, however, explicitly stated as an answer to your question.

You can see in another presentation that there is only a *single* slurmdbd in a federated multi-cluster scenario: https://slurm.schedmd.com/SLUG18/slurm_overview.pdf
Look at slide 28 "Typical Enterprise Architecture".

/Ole
Re: [slurm-users] Several slurmdbds against one mysql server?
Hello Ole,

Ole Holm Nielsen writes:

> As Brian wrote:
>
>> On a technical note: slurm keeps the detailed accounting data for each
>> cluster in separate TABLES within a single database.
>
> In the Federation page https://slurm.schedmd.com/federation.html
> it is implicitly assumed that the sacctmgr command talks only to a single
> slurmdbd instance. It is not, however, explicitly stated as an answer to your
> question.

And hence my question. As I was saying in a previous mail, from reading the documentation I understand that this is the standard way to do it, but right now I have it working the other way: in each cluster I have one slurmdbd daemon that connects to a single mysqld daemon on a third machine (option 2 from my question).

I have a single database with detailed accounting data for each cluster in separate tables, and from each cluster I can query the whole database, so as far as I can see all is working fine, but it is implemented differently from the standard approach.

I did it this way not because I wanted something special or outside of the standard, but simply because it was not very clear to me from the documentation which way to go, and this came naturally when implementing it (maybe simply because I don't have Slurm installed on the database machine). I have no problem with changing the installation to a single slurmdbd daemon if I need to. But this being my first time, I just hope to learn whether this is really a bad idea that is going to bite me in the near future when these machines go to production, so that I should change to the standard way, or more generally whether someone has a clear idea of the pros/cons of both ways.

Sorry if I'm being a pest with this.

Thanks,
--
Ángel de Vicente
Research Software Engineer (Supercomputing and BigData)
Tel.: +34 922-605-747
Web.: http://research.iac.es/proyecto/polmag/
GPG: 0x8BDC390B69033F52
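The layout described in this message (option 2) would correspond to something like the sketch below. The host names `ca-head`, `cb-head` and `db`, the database user, and the password placeholder are assumptions for illustration only. Whether running two active slurmdbd daemons against one database is advisable is exactly the open question of this thread, so this is a description of the setup, not a recommendation.

```conf
# slurm.conf on cluster CA
ClusterName=ca
AccountingStorageType=accounting_storage/slurmdbd
AccountingStorageHost=ca-head     # assumed: slurmdbd runs on CA's own head node

# slurmdbd.conf on CA's head node (CB keeps its own copy with DbdHost=cb-head)
DbdHost=ca-head
SlurmUser=slurm
StorageType=accounting_storage/mysql
StorageHost=db                    # remote mysqld on the DB machine
StorageUser=slurm                 # assumed database user
StoragePass=CHANGE_ME             # placeholder, not a real credential
StorageLoc=slurm_acct_db          # same database name on both clusters
```

Compared with option 1, the only structural differences are where slurmdbd runs and that `StorageHost` points across the network to DB instead of to localhost.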
Re: [slurm-users] Several slurmdbds against one mysql server?
On 5/1/23 12:08, Angel de Vicente wrote:
> Hello Ole,
>
> Ole Holm Nielsen writes:
>> As Brian wrote:
>>
>>> On a technical note: slurm keeps the detailed accounting data for each
>>> cluster in separate TABLES within a single database.
>>
>> In the Federation page https://slurm.schedmd.com/federation.html
>> it is implicitly assumed that the sacctmgr command talks only to a single
>> slurmdbd instance. It is not, however, explicitly stated as an answer to
>> your question.
>
> And hence my question.. because as I was saying in a previous mail, reading
> the documentation I understand that this is the standard way to do it, but
> right now I got it working the other way: in each cluster I have one
> slurmdbd daemon that connects with a single mysqld daemon in a third machine
> (option 2 from my question). I have a single database with detailed
> accounting data for each cluster in separate tables, and from each cluster I
> can query the whole database so as far as I can see all is working fine but
> it is implemented different to the standard approach. I did it this way not
> because I wanted something special or outside of the standard, simply
> because it was not very clear to me from the documentation which way to go
> and this came natural when implementing it (maybe simply because in the
> database machine I don't have Slurm installed). And I have no problem with
> changing the installation to a single slurmdbd daemon if I need to. But this
> being my first time I just hope to learn if this is really a bad idea that
> is going to bite me in the near future when these machines go to production
> and I should change to the standard way, or in general whether someone has a
> clear idea of the pros/cons of both ways.

If implementing Slurm for the first time, the slurm-users mailing list is probably the most helpful way to ask questions.

The official Slurm documentation is of course the place to start learning. Some people have found my Slurm Wiki page helpful: https://wiki.fysik.dtu.dk/Niflheim_system/SLURM/
However, I do not describe federated clusters because we don't use this aspect.

I also recommend SchedMD's paid support contracts, since they are the experts and give a fantastic service: https://www.schedmd.com/support.php

/Ole
Re: [slurm-users] Several slurmdbds against one mysql server?
Hello,

Ole Holm Nielsen writes:

> Some people have found my Slurm Wiki page helpful:
> https://wiki.fysik.dtu.dk/Niflheim_system/SLURM/

me being one of them :-) [in my bookmarks your wiki sits next to the official Slurm page]. This installation was done on Ubuntu machines, so I could only use the information in your wiki to a certain extent, but it is very clearly organized and very useful.

Thanks a lot,
--
Ángel de Vicente
Research Software Engineer (Supercomputing and BigData)
Tel.: +34 922-605-747
Web.: http://research.iac.es/proyecto/polmag/
GPG: 0x8BDC390B69033F52