Is this Northwestern’s Quest HPC or another one? I know at least a few of the people involved with Quest, and I wouldn’t have thought they’d be in dire need of coaching.
And to follow on with Davide’s point, this really sounds like a case for submitting multiple jobs with dependencies between them, as per [1, 2, 3].

[1] https://services.northwestern.edu/TDClient/30/Portal/KB/ArticleDet?ID=1795
[2] https://bioinformaticsworkbook.org/Appendix/HPC/SLURM/submitting-dependency-jobs-using-slurm.html#gsc.tab=0
[3] https://slurm.schedmd.com/sbatch.html#OPT_dependency

From: slurm-users <slurm-users-boun...@lists.schedmd.com> on behalf of Laurence Marks <laurence.ma...@gmail.com>
Date: Wednesday, December 20, 2023 at 1:40 PM
To: Slurm User Community List <slurm-users@lists.schedmd.com>
Subject: Re: [slurm-users] Reproducible irreproducible problem (timeout?)

It is a University "supercomputer", not a national facility. Hence they are not that expert, which is why I am asking here. I am pretty certain that it is some form of communication issue, but beyond that it is not clear. If I get suggestions such as "why don't they look for ABC in XYZ" then I may persuade them to look at specifics. They will need the coaching, alas.

On Wed, Dec 20, 2023 at 1:25 PM Gerhard Strangar <g...@arcor.de> wrote:

Laurence Marks wrote:
> After some (irreproducible) time, often one of the three slow tasks hangs.
> A symptom is that if I try and ssh into the main node of the subtask (which
> is running 128 mpi on the 4 nodes) I get "Authentication failed".

How about asking an admin to check why it hangs?

--
Emeritus Professor Laurence Marks (Laurie)
Northwestern University
Webpage <http://www.numis.northwestern.edu/> and Google Scholar link <http://scholar.google.com/citations?user=zmHhI9gAAAAJ&hl=en>
"Research is to see what everybody else has seen, and to think what nobody else has thought", Albert Szent-Györgyi
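
Returning to the job-dependency suggestion at the top of this message, here is a minimal sketch of what that could look like when driven from Python. It assumes the three slow tasks can each run as their own batch job and that a final step should start only once all of them succeed; the actual workflow is not described in the thread, so the script names and the `submit_chain` helper are hypothetical. The only Slurm features used are `sbatch --parsable` and `--dependency=afterok:...` from [3].

```python
import subprocess
from typing import Callable, Sequence


def sbatch(args: Sequence[str]) -> str:
    """Submit a job with `sbatch --parsable` and return its job ID."""
    out = subprocess.run(
        ["sbatch", "--parsable", *args],
        check=True, capture_output=True, text=True,
    ).stdout
    # --parsable prints "jobid" or "jobid;cluster"; keep only the job ID.
    return out.strip().split(";")[0]


def submit_chain(
    parallel_scripts: Sequence[str],
    final_script: str,
    submit: Callable[[Sequence[str]], str] = sbatch,
) -> str:
    """Submit independent jobs, then one job that waits for all of them.

    Hypothetical helper: the final job is held until every earlier job
    has finished successfully (afterok).
    """
    job_ids = [submit([script]) for script in parallel_scripts]
    dependency = "--dependency=afterok:" + ":".join(job_ids)
    return submit([dependency, final_script])
```

With placeholder script names, `submit_chain(["task1.sh", "task2.sh", "task3.sh"], "collect.sh")` would submit the three tasks as separate jobs and return the ID of the dependent final job. Splitting the work this way also means a hang in one task stalls only that job, which may make it easier for the admins to see where it stops.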