I completely agree with what Chris says regarding cgroups. Implement them, and you will not regret it.
I have worked with other simulation frameworks which work in a similar fashion to Trick, i.e. a master process which spawns off independent worker processes on compute nodes. I am thinking of an internal application we have and, dare I say it, Matlab.

In the Trick documentation:
<https://github.com/nasa/trick/wiki/UserGuide-Monte-Carlo#notes>

Notes
1. SSH <https://en.wikipedia.org/wiki/Secure_Shell> is used to launch slaves across the network
2. Each slave machine will work in parallel with other slaves, greatly reducing the computation time of a simulation

However, I must say that there must be plenty of folks at NASA who use this simulation framework on HPC clusters with batch systems. It would surprise me if there were not 'adaptation layers' available for Slurm, SGE, PBS etc. So in Slurm, you would do an sbatch which reserves your worker nodes, then run a series of sruns which launch the worker processes. (I hope I have that the right way round - I seem to recall doing srun then a series of sbatches in the past.) There is a rough sketch of this pattern at the bottom of this mail, below Chris's reply.

But looking quickly at the Trick wiki, I am wrong. It does seem to work on the model of "get a list of hosts allocated by your batch system", i.e. the SLURM_JOB_NODELIST, and then Trick will set up simulation queues which spawn off models using ssh (there is a sketch of expanding that node list at the bottom too). Looking at the Advanced Topics guide, this does seem to be so:
https://github.com/nasa/trick/blob/master/share/doc/trick/Trick_Advanced_Topics.pdf
The model is that you allocate up to 16 remote worker hosts for a long time. Then various modelling tasks are started on those hosts via ssh, and Trick expects those hosts to be available for more tasks during your simulation session. There is also discussion there about turning off irqbalance and cpuspeed, and disabling unnecessary system services.

As someone who has spent oodles of hours either killing orphaned processes on nodes, or seeing rogue-process alarms, or running ps --forest to trace connections into batch job nodes which bypass the pbs/slurm daemons, I despair slightly... I am probably very wrong, and NASA have excellent Slurm integration.

So I agree with Chris - implement cgroups, and try to make sure your ssh 'lands' in a cgroup. 'lscgroup' is a nice command to see what cgroups are active on a compute node. Also, run an interactive job, ssh into one of your allocated worker nodes, and cat /proc/self/cgroup to see which cgroups you have landed in. (Again, sketches at the bottom of this mail.)

On 5 March 2018 at 02:20, Christopher Samuel <ch...@csamuel.org> wrote:

> On 05/03/18 12:12, Dan Jordan wrote:
>
>> What is the /correct/ way to clean up processes across the nodes
>> given to my program by SLURM_JOB_NODELIST?
>
> I'd strongly suggest using cgroups in your Slurm config to ensure that
> processes are corralled and tracked correctly.
>
> You can use pam_slurm_adopt from the contrib directory to capture
> inbound SSH sessions into a running job on the node (and deny access to
> people who don't).
>
> Then Slurm should take care of everything for you without needing an
> epilog.
>
> Hope this helps!
> Chris
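
P.S. As promised, some rough sketches. First, the sbatch-then-srun pattern I was describing. This is only a sketch of the general shape, not anything Trick-specific - the worker binary name ./worker, the node count and the walltime are all made up for illustration:

    #!/bin/bash
    #SBATCH --job-name=sim-workers
    #SBATCH --nodes=4              # reserve the worker nodes
    #SBATCH --ntasks-per-node=1    # one worker process per node
    #SBATCH --time=04:00:00

    # launch a series of job steps, one worker per node, then wait for them all
    for i in $(seq 1 "$SLURM_JOB_NUM_NODES"); do
        srun --nodes=1 --ntasks=1 ./worker "$i" &
    done
    wait

Each srun here is a job step inside the allocation, so the workers stay inside the job's cgroups and get cleaned up with the job.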
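
If Trick (or any other ssh-based launcher) wants a plain list of hostnames rather than Slurm's compressed node-list syntax, something like this inside the job should produce one (hostlist.txt is just a file name I have invented):

    # expand e.g. node[01-04] into one hostname per line
    scontrol show hostnames "$SLURM_JOB_NODELIST" > hostlist.txt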
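
For "implement cgroups", the relevant knobs are roughly the following - do check them against the documentation for your Slurm version rather than taking my word for it:

    # slurm.conf
    ProctrackType=proctrack/cgroup
    TaskPlugin=task/cgroup

    # cgroup.conf
    ConstrainCores=yes
    ConstrainRAMSpace=yes

And to have inbound ssh sessions adopted into the job's cgroup (the pam_slurm_adopt Chris mentions), the usual approach is a line in the sshd PAM stack along these lines:

    # /etc/pam.d/sshd
    account    required    pam_slurm_adopt.so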
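
Finally, to check where your ssh session has ended up once you are on an allocated node (lscgroup comes from the libcgroup tools, so it may need installing):

    lscgroup | grep slurm     # cgroups currently active on the node
    cat /proc/self/cgroup     # which cgroups this shell has landed in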