The CPU limit using ulimit is pretty straightforward with pam_limits and /etc/security/limits.conf. On some of the login nodes we have a CPU time limit of 10 minutes, so heavy processes get killed once they exceed it.
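For reference, a minimal sketch of what that part can look like (the domain, value and PAM service file below are illustrative assumptions, not our exact configuration):

# /etc/security/limits.conf: the 'cpu' item is CPU time in minutes,
# so 10 corresponds to the 10-minute limit mentioned above
*    hard    cpu    10

# pam_limits must be in the session stack of the login service,
# e.g. in /etc/pam.d/sshd or the distro's common-session include:
session    required    pam_limits.so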
The memory was a bit more complicated (i.e. not pretty). We wanted a user not to be able to use more than, e.g., 1G for all of their processes combined. Using systemd we added the file /etc/systemd/system/user-.slice.d/20-memory.conf which contains:

[Slice]
MemoryLimit=1024M
MemoryAccounting=true

But we also wanted to restrict swap usage, and we're still on cgroup v1, so systemd didn't help there. The ugly part is a pam_exec call to a script that updates the memsw limit of the cgroup behind the above slice (memory.memsw.limit_in_bytes is memory + swap combined, so the 1126M below leaves roughly 100M of swap on top of the 1024M memory limit). The script does more things, but the swap section is more or less the following (a sketch of how this hook can be wired into the PAM stack is appended below, after the quoted thread):

if [ "x$PAM_TYPE" = 'xopen_session' ]; then
    _id=`id -u $PAM_USER`
    if [ -z "$_id" ]; then
        exit 1
    fi
    if [[ -e /sys/fs/cgroup/memory/user.slice/user-${_id}.slice/memory.memsw.limit_in_bytes ]]; then
        swap=$((1126 * 1024 * 1024))
        echo $swap > /sys/fs/cgroup/memory/user.slice/user-${_id}.slice/memory.memsw.limit_in_bytes
    fi
fi

On Sun, Oct 31, 2021 at 6:36 PM Brian Andrus <toomuc...@gmail.com> wrote:

> That is interesting to me.
>
> How do you use ulimit and systemd to limit user usage on the login nodes?
> This sounds like something very useful.
>
> Brian Andrus
>
> On 10/31/2021 1:08 AM, Yair Yarom wrote:
>
> Hi,
>
> If it helps, this is our setup:
> 6 clusters (actually a bit more)
> 1 mysql + slurmdbd on the same host
> 6 primary slurmctld on 3 hosts (need to make sure each has a distinct SlurmctldPort)
> 6 secondary slurmctld on an arbitrary node on the clusters themselves
> 1 login node per cluster (this is a very small VM, and the users are limited both in cpu time (with ulimit) and memory (with systemd))
> The slurm.conf's are shared over nfs to everyone in /path/to/nfs/<cluster name>/slurm.conf, with a symlink into /etc for the relevant cluster on each node.
>
> The -M option generally works; we can submit/query jobs from a login node of one cluster to another. But there's a caveat to notice when upgrading: slurmdbd must be upgraded first, but usually we have a not-so-small gap between upgrading the different clusters. This causes -M to stop working, because binaries of one version won't work against the other (I don't remember in which direction).
> We solved this by using an lmod module per cluster, which sets both the SLURM_CONF environment variable and the PATH to the correct slurm binaries (which we install in /usr/local/slurm/<version>/ so that they co-exist). So when -M won't work, users can use:
>
> module load slurm/clusterA
> squeue
> module load slurm/clusterB
> squeue
>
> BR,
>
> On Thu, Oct 28, 2021 at 7:39 PM navin srivastava <navin.alt...@gmail.com> wrote:
>
>> Thank you Tina.
>> It will really help.
>>
>> Regards
>> Navin
>>
>> On Thu, Oct 28, 2021, 22:01 Tina Friedrich <tina.friedr...@it.ox.ac.uk> wrote:
>>
>>> Hello,
>>>
>>> I have the database on a separate server (it runs the database and the database only). The login nodes run nothing SLURM related; they simply have the binaries installed & a SLURM config.
>>>
>>> I've never looked into having multiple databases & using AccountingStorageExternalHost (in fact I'd forgotten you could do that), so I can't comment on that (maybe someone else can); I think that works, yes, but as I said I never tested it (didn't see much point in running multiple databases if one would do the job).
>>>
>>> I actually have specific login nodes for both of my clusters, to make it easier for users (especially those with not much experience using the HPC environment); so I have one login node connecting to cluster 1 and one connecting to cluster 2.
>>>
>>> The relevant config entries on the login nodes (if I'm not mistaken), i.e. the differences in the slurm config files that haven't got to do with topology & nodes & scheduler tuning, are:
>>>
>>> ClusterName=cluster1
>>> ControlMachine=cluster1-slurm
>>> ControlAddr=/IP_OF_SLURM_CONTROLLER/
>>>
>>> ClusterName=cluster2
>>> ControlMachine=cluster2-slurm
>>> ControlAddr=/IP_OF_SLURM_CONTROLLER/
>>>
>>> (where IP_OF_SLURM_CONTROLLER is the IP address of host cluster1-slurm, same for cluster2)
>>>
>>> And then they have common entries for the accounting storage:
>>>
>>> AccountingStorageHost=slurm-db-prod
>>> AccountingStorageBackupHost=slurm-db-prod
>>> AccountingStoragePort=7030
>>> AccountingStorageType=accounting_storage/slurmdbd
>>>
>>> (slurm-db-prod is simply the hostname of the SLURM database server)
>>>
>>> Does that help?
>>>
>>> Tina
>>>
>>> On 28/10/2021 14:59, navin srivastava wrote:
>>> > Thank you Tina.
>>> >
>>> > So if I understood correctly, the database is global to both clusters and running on the login node?
>>> > Or is the database running on one of the master nodes and shared with the other master node?
>>> >
>>> > As far as I have read, the slurm database can also be separate on each master, using the parameter AccountingStorageExternalHost so that both databases are aware of each other.
>>> >
>>> > Also, which slurmctld does the slurm.conf file on the login node point to? Is it possible to share a sample slurm.conf file of a login node?
>>> >
>>> > Regards
>>> > Navin.
>>> >
>>> > On Thu, Oct 28, 2021 at 7:06 PM Tina Friedrich <tina.friedr...@it.ox.ac.uk> wrote:
>>> >
>>> > Hi Navin,
>>> >
>>> > well, I have two clusters & login nodes that allow access to both. Will that do? I don't think a third would make any difference to the setup.
>>> >
>>> > They need to share a database. As long as they share a database, the clusters have 'knowledge' of each other.
>>> >
>>> > So if you set up one database server (running slurmdbd), and then a SLURM controller for each cluster (running slurmctld) using that one central database, the '-M' option should work.
>>> >
>>> > Tina
>>> >
>>> > On 28/10/2021 10:54, navin srivastava wrote:
>>> > > Hi,
>>> > >
>>> > > I am looking for a stepwise guide to set up a multi-cluster implementation. We want to set up 3 clusters and one login node, and run jobs using the -M <cluster> option.
>>> > > Does anybody have such a setup who can share some insight into how it works, and whether it is really a stable solution?
>>> > >
>>> > > Regards
>>> > > Navin.
>>> > >>> > -- >>> > Tina Friedrich, Advanced Research Computing Snr HPC Systems >>> > Administrator >>> > >>> > Research Computing and Support Services >>> > IT Services, University of Oxford >>> > http://www.arc.ox.ac.uk <http://www.arc.ox.ac.uk> >>> > http://www.it.ox.ac.uk <http://www.it.ox.ac.uk> >>> > >>> >>> -- >>> Tina Friedrich, Advanced Research Computing Snr HPC Systems Administrator >>> >>> Research Computing and Support Services >>> IT Services, University of Oxford >>> http://www.arc.ox.ac.uk http://www.it.ox.ac.uk >>> >>> > > -- > > /| | > \/ | Yair Yarom | System Group (DevOps) > [] | The Rachel and Selim Benin School > [] /\ | of Computer Science and Engineering > []//\\/ | The Hebrew University of Jerusalem > [// \\ | T +972-2-5494522 | F +972-2-5494522 > // \ | ir...@cs.huji.ac.il > // | > > -- /| | \/ | Yair Yarom | System Group (DevOps) [] | The Rachel and Selim Benin School [] /\ | of Computer Science and Engineering []//\\/ | The Hebrew University of Jerusalem [// \\ | T +972-2-5494522 | F +972-2-5494522 // \ | ir...@cs.huji.ac.il // |