On Thursday, 27 May 2021, at 08:19:14 (+0200), Loris Bennett wrote:
Thanks for the detailed explanations. I was obviously completely confused about what MUNGE does. Would it be possible to say, in very hand-waving terms, that MUNGE performs a similar role for the access of processes to nodes as SSH does for the access of users to nodes?
If you replace the word "processes" with the word "jobs," you've got it. :-) MUNGE is really just intended to be a simple, lightweight solution to allow for creating a single, global "credential domain" among all the hosts in an HPC cluster using a single shared secret. Without going into too much detail with the crypto stuff, it basically allows a trusted local entity to cryptographically prove to another that they're both part of the same trust/cred domain; having established this, they know they can trust each other to provide and/or validate credentials between hosts. But I want to emphasize the "single shared secret" part. That means there's a single trust domain. Think "root of trust" with nothing but the root of trust. So you can authenticate a single group of hosts to all the rest of the group such that all are equals, but that's it. There's no additional facility for authenticating different roles or anything like that. Either you have the same shared secret or you don't; nothing else is possible.
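To make that concrete: the entire "single shared secret" really is just one key file that every host holds an identical copy of, plus a daemon (munged) that uses it to encode and validate credentials. A rough sketch (paths are the usual defaults; "node01" is just a stand-in hostname):

    # Generate the one shared key (as root) and copy that *same* file to
    # every host in the cluster -- this file *is* the trust domain:
    dd if=/dev/urandom bs=1 count=1024 > /etc/munge/munge.key
    chown munge:munge /etc/munge/munge.key
    chmod 0400 /etc/munge/munge.key

    # With munged running everywhere, sanity-check the domain:
    munge -n | unmunge              # encode and decode a credential locally
    munge -n | ssh node01 unmunge   # prove node01 holds the same shared key

If that last command decodes cleanly on the remote side, the two hosts are in the same credential domain; if the keys differ, unmunge rejects it. That's the whole model -- nothing per-user or per-role about it.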
Regarding keys vs. host-based SSH, I see that host-based would be more elegant, but would involve more configuration. What exactly are the simplification gains you see? I just have a single cluster and naively I would think dropping a script into /etc/profile.d on the login node would be less work than re-configuring SSH for the login node and multiple compute node images.
I like to think of it as "one and done." At least in our case at LANL, and at LBNL previously, all nodes of the same type/group boot the same VNFS image. As long as I don't need to cryptographically differentiate among, say, compute nodes, I only have to set up a single set of credentials for all the hosts, and I'm done. It also saves overall support time in my experience. By taking the responsibility for inter-machine trust myself at the system level, I don't have to worry about (1) modifying a user's SSH config without their knowledge, (2) running the risk of them messing with their config and breaking it, or (3) any user support/services calls about "why can't I do any of the things on the stuff?!" :-) It is totally a personal/team choice, but I'll be honest: Once I "discovered" host-based authentication and all the headaches it saved our sysadmin and consulting teams, I was kicking myself for having done it the other way for so long! :-D
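For reference, the host-based pieces I'm talking about boil down to roughly this per image (the OpenSSH option names are real; treat the rest as a sketch, not a drop-in config):

    # Server side (sshd_config on the nodes users hop *to*):
    cat >> /etc/ssh/sshd_config <<'EOF'
    HostbasedAuthentication yes
    EOF

    # Client side (system-wide ssh_config on the nodes users hop *from*):
    cat >> /etc/ssh/ssh_config <<'EOF'
    HostbasedAuthentication yes
    EnableSSHKeysign yes
    EOF

    # Plus: list the trusted client hosts in /etc/shosts.equiv and gather
    # their public host keys into /etc/ssh/ssh_known_hosts on the servers.

Bake that into the image once, and every node that boots it is already trusted -- that's the "one and done" part.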
Regarding AuthorizedKeysCommand, I don't think we can use that, because users don't necessarily have existing SSH keys. What abuse scenarios were you thinking of in connection with in-homedir key pairs?
Users don't have to have existing keys for it to work; the command you specify can easily create a key pair, drop the private key, and output the public key. Or even simpler, you can specify a value for "AuthorizedKeysFile" that points to a directory users can't write to, and store a key pair for each user in that location. Lots of ways to do it. But if I'm being frank about it, if I had my druthers, we'd be using certificates for authentication, not key files. The advantages are, in my very humble opinion, well worth a little extra setup time! As far as abuse of keys goes: What's stopping your user from taking that private key you created for them (which is, as you recall, *unencrypted*) outside of your cluster to another host somewhere else on campus? Maybe somewhere that has tons of untrusted folks with root. Then any of those folks can SSH to your cluster as that user. Credential theft is a *huge* problem in HPC across the world, so I always recommend that sysadmins think of it as Public Enemy #1! The more direct and permanent control you have over user credentials, the better. :-)
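In sshd_config terms, either of these gets you out of users' home directories (the helper script name here is invented, just to show the shape of it):

    # Option 1: keep each user's authorized_keys under a root-owned
    # directory ("%u" expands to the username) that users can't write to:
    echo 'AuthorizedKeysFile /etc/ssh/authorized_keys/%u' >> /etc/ssh/sshd_config

    # Option 2: have sshd ask a program for the user's public key(s); that
    # program can look them up, or mint a pair on first use and print the
    # public half:
    cat >> /etc/ssh/sshd_config <<'EOF'
    AuthorizedKeysCommand /usr/local/sbin/lookup-user-pubkey %u
    AuthorizedKeysCommandUser nobody
    EOF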
Would it be correct to say that, if one were daft enough, one could build some sort of terminal server on top of MUNGE without using SSH, but which could then replicate basic SSH behaviour?
No; that would only provide a method to authenticate servers at best. You can't authenticate users for the reasons I noted above. Single shared key, single trust domain.
Your explanation is very clear, but it still seems like quite a few steps with various gotchas, like the fact that, as I understand it, shosts.equiv has to contain all the possible ways a host might be addressed (short name, long name, IP).
You are correct, though that's easy to automate with a teensy-weensy shell script (a rough sketch is in the P.S. below). But yes, there's more up-front configuration. Again, though, I truly believe it saves admin time in the long run (not to mention user support staff time and user pain). But again, that's a personal or team choice.

I'm not sure if I'm clearing things up or just muddying the waters, but hopefully at least *some* of that helped! :-D

Michael

-- 
Michael E. Jennings <m...@lanl.gov> - [PGPH: he/him/his/Mr] -- hpc.lanl.gov
HPC Systems Engineer -- Platforms Team -- HPC Systems Group (HPC-SYS)
Strategic Computing Complex, Bldg. 03-2327, Rm. 2341    W: +1 (505) 606-0605
Los Alamos National Laboratory, P.O. Box 1663, Los Alamos, NM 87545-0001
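P.S. The shosts.equiv-generating script might look something like this -- the node names and domain are made up, so adapt it to however your site enumerates its hosts:

    #!/bin/sh
    # Rebuild /etc/shosts.equiv with short name, FQDN, and IP for each node,
    # since sshd may see a client host under any of the three.
    for h in node01 node02 node03; do
        ip=$(getent hosts "$h.example.org" | awk '{print $1}')
        printf '%s\n%s\n%s\n' "$h" "$h.example.org" "$ip"
    done > /etc/shosts.equiv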