Hi,
How does the shepherd of my user ("dannym") get started (and the user
dbus--probably by shepherd)? I'm using guix home, in case that matters.
I ask because I just had to debug a race, ending in deadlock, between two
dannym shepherds (one of them started by me via
~/.config/autostart/shepherd.desktop, the same way I have been doing for
the last few years[1]). They were fighting over pipewire and wireplumber
and deadlocked very often--which was not fun. (At some point I was
debugging the kernel :P)
Let's please avoid breaking backward compat like this. Also, why could
I even start two shepherds at the same time? Did someone remove the
check?
Maybe a small overview of what the terms mean in elogind:
- USER means a user account like "dannym".
- SESSION means a logged-in instance of a user account "dannym".
There can be any number of sessions per user (logged in at the same
time!), or 0.
- SEAT means one specific set of keyboard, mouse and screen assigned to
a fixed physical location.
There can be any number of current sessions (and thus, implicitly,
users) per seat, or 0.
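To make the relationships above concrete, here is a minimal sketch in Python. It is not elogind code; the class names, the helper open_session and the session ids "c1"/"c2" are made up for illustration, and the seat-less second session (think of an ssh login) is an assumption on my part. It only shows that two sessions of the same user end up with the same per-user runtime directory.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Seat:
    name: str  # hypothetical seat name, e.g. "seat0"


@dataclass
class User:
    name: str
    uid: int
    # Excluded from repr/compare to avoid recursing through Session.user.
    sessions: list = field(default_factory=list, repr=False, compare=False)

    @property
    def runtime_dir(self) -> str:
        # One directory per USER, shared by all of that user's sessions.
        return f"/run/user/{self.uid}"


@dataclass
class Session:
    session_id: str
    user: User
    seat: Optional[Seat] = None  # assumption: a session may have no seat


def open_session(user: User, session_id: str, seat: Optional[Seat] = None) -> Session:
    """Register a new logged-in instance of `user` (illustrative helper)."""
    session = Session(session_id, user, seat)
    user.sessions.append(session)
    return session


dannym = User("dannym", 1000)
seat0 = Seat("seat0")
graphical = open_session(dannym, "c1", seat0)  # session attached to a seat
remote = open_session(dannym, "c2")            # second session, same user
```

Both sessions resolve to /run/user/1000, so anything keyed on that directory is per user, not per session.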
It seems that the directory /run/user/1000 exists once per user (uid
1000 in this case).
Since the user shepherd creates /run/user/1000/shepherd/socket, that
means we assume there's one shepherd per user (NOT per session).
In my opinion that makes sense for timers (and maybe log rotation) and
for ssh-agent. Not sure about pipewire (maybe?).
Likewise, since the user dbus uses /run/user/1000/bus, it is one dbus
instance per USER, not per session.
If you want a session dbus, that's something you can do--but then you
have to forward dbus messages between user dbus and session dbus (the
latter you have to start manually with a random socket, NOT with
/run/user/1000/bus).
Let's keep that in mind when designing services.
If we don't have tests for that, we should totally add tests where two
sessions of the same user are opened (without closing one in between),
then one of them is closed, and then the remaining session is used.
Note: elogind has a "rm -rf /run/user/1000" that runs once
UserStopDelaySec has elapsed after all of that user's sessions have
ended[2].
Note: elogind-configuration's "kill-user-processes?" actually kills all
that user's processes OF THAT CGROUP on SESSION exit. What cgroup do we
use in guix home? Which one for the user shepherd?
[1] shepherd has (had?) a check whether the socket was already there and
if so, it would (correctly) NOT remove it and then (correctly) would
fail startup. That way, there would be at most one shepherd running
per user (not per session).
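The kind of check described in [1] can be sketched as follows. This is not shepherd's actual code (shepherd is written in Guile); claim_socket is a made-up name, and the sketch relies on bind() failing with EADDRINUSE when the socket path already exists, which is how Unix domain sockets behave on Linux.

```python
import errno
import os
import socket


def claim_socket(path: str) -> socket.socket:
    """Bind a listening Unix socket at `path`.

    If the path already exists, do NOT remove it; fail startup instead,
    so that at most one instance can own the socket.
    """
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.bind(path)  # fails with EADDRINUSE if `path` already exists
    except OSError as err:
        sock.close()
        if err.errno == errno.EADDRINUSE:
            raise RuntimeError(
                f"{path} already exists; another instance may be running"
            ) from err
        raise
    sock.listen(1)
    return sock
```

A second call with the same path fails instead of stealing the socket from the first instance, which is the behaviour that prevents two shepherds per user.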
[2] elogind source has: "#if 0 /// elogind does not start a user service
manager, the delay is unneeded."
elogind not starting a user (not session) service manager seems
ill-advised.