Hi Marcus,

This is a good point, thanks!  Maybe the salloc variant isn't such a good
general solution after all.
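To spell out the trade-off, the two variants look roughly like this (the
node name is just an example, and I am assuming Slurm's built-in forwarding
has been enabled, I believe via PrologFlags=X11 in slurm.conf):

  # Variant A: Slurm-native X11 forwarding
  srun --x11 --pty bash
  echo $SLURM_JOB_ID      # set: the shell runs as a job step
  xterm                   # display tunnelled by Slurm

  # Variant B: allocation plus plain ssh X11 forwarding (pam_slurm_adopt)
  salloc -N1
  ssh -X node001          # example node name; the session is adopted into
  echo $SLURM_JOB_ID      #   the job's cgroup, but no Slurm job environment
  xterm                   #   is set and usage lands in the 'extern' step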
Cheers,

Loris

Marcus Wagner <wag...@itc.rwth-aachen.de> writes:

> Hi Loris,
>
> I know it has been some time, but I have one additional remark.
> If you just use ssh -X to log in to the nodes, you will have a plain ssh
> session, which means none of SLURM's environment variables will be set.
> So if your X11 jobs need them, you will have to use X11 forwarding
> through SLURM.
>
> Best
> Marcus
>
>
> On 3/29/19 7:45 PM, Marcus Wagner wrote:
>> Hi Loris,
>>
>> On 29.03.2019 at 14:01, Loris Bennett wrote:
>>> Hi Marcus,
>>>
>>> Marcus Wagner <wag...@itc.rwth-aachen.de> writes:
>>>
>>>> Hi Loris,
>>>>
>>>> On 3/25/19 1:42 PM, Loris Bennett wrote:
>>>>>
>>>>>> 3. salloc works fine too without --x11, a subsequent srun with an
>>>>>> X11 app works great
>>>>>
>>>>> Doing 'salloc' followed by 'ssh -X' works for us too, which is
>>>>> surprising to me.
>>>>>
>>>>> This last option currently seems to me to be the best one for users,
>>>>> being slightly less confusing than logging into the login node again
>>>>> from the login node, which is our current workaround.
>>>>>
>>>>> Still, it's all a bit odd.
>>>>
>>>> I assume you use pam_slurm_adopt?
>>>
>>> Yes.
>>>
>>>> Then it is clear that this works and has nothing to do with the X11
>>>> forwarding feature of Slurm. This is plain ssh X11 forwarding in this
>>>> case.
>>>
>>> OK, I see that, but if I don't need --x11 with salloc, what is it for?
>>> Just to control on which nodes forwarding is done, viz.
>>> --x11[=<all|first|last>]? What might be a use case for not having
>>> X11 forwarding on all the nodes, which is the default?
>>>
>>
>> The default is (according to the man page) 'batch', which means the node
>> where the batch script will be executed (the first of the allocation, I
>> think). I do not know what first or last are intended for.
>> In fact, I do not have a use case for X11 forwarding to all nodes; I
>> might have to think a little more about that one.
>>
>>>> Please keep in mind that processes started within an adopted ssh
>>>> session are in the job's cgroup (good), but are accounted in the
>>>> 'extern' step of the job.
>>>>
>>>> e.g.
>>>> * sbatch --wrap "sleep 10m"
>>>> * ssh to the compute node
>>>> * do some work on the compute node
>>>> After the job is done:
>>>> * sacct -j <jobid> -o JobID,JobName,MaxRSS,CPUTime,TotalCPU
>>>>
>>>>        JobID    JobName     MaxRSS    CPUTime   TotalCPU
>>>> ------------ ---------- ---------- ---------- ----------
>>>> 1053837            wrap              00:01:42  02:00.159
>>>> 1053837.bat+      batch       412K   00:01:43  00:00.158
>>>> 1053837.ext+     extern    543880K   00:01:42  02:00.001
>>>
>>> That's interesting, although is there any advantage/difference compared
>>> with just doing
>>>
>>>   srun --x11 --pty bash
>>>
>>> ?
>>
>> With
>>
>>   srun --x11 --pty bash
>>
>> the accounting will be in the batch step of the job; that is the only
>> difference I'm aware of at the moment.
>> With LSF we used that kind of mechanism to start, e.g., vtune directly
>> out of the job. Without the X11 forwarding feature of Slurm you would
>> have to salloc some hosts and then ssh to the nodes with X11 forwarding
>> enabled in order to start vtune.
>> So it is a little more work for the user if you do not do X11 forwarding
>> the SLURM way.
>>
>>
>> Best
>> Marcus
>>>
>>> Cheers,
>>>
>>> Loris
>>>
>>

--
Dr. Loris Bennett (Mr.)
ZEDAT, Freie Universität Berlin         Email loris.benn...@fu-berlin.de
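P.S. On the --x11[=<all|first|last>] question above: my reading of the man
page is that the optional value just selects the node(s) whose display is
forwarded, roughly like this (the 4-node job is only an illustration):

  srun -N4 --x11=first --pty bash   # forward X11 only from the first node
  srun -N4 --x11=last  --pty bash   # forward X11 only from the last node
  srun -N4 --x11=all   --pty bash   # forward X11 from every allocated node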