Hi,

How do my user's ("dannym") shepherd and user dbus (the latter probably started by shepherd?) get started? I'm using guix home, in case that matters.

I ask because I just had to debug a race, with deadlock, between two dannym shepherds (one of which I was starting myself from ~/.config/autostart/shepherd.desktop, the same way I have been doing for the last few years[1]). They were fighting over pipewire and wireplumber and deadlocking very often, which was not fun. (At some point I was debugging the kernel :P)

Let's please avoid breaking backward compat like this. Also, why could I even start two shepherds at the same time? Did someone remove the check?

Maybe a small overview of what terms mean in elogind:
- USER means a user account like "dannym".
- SESSION means a logged-in instance of a user account "dannym".
There can be any number of sessions per user (logged in at the same time!), or 0.
- SEAT means one specific pair of keyboard, mouse and screen assigned to a fixed physical location. There can be any number of current sessions (and thus, implicitly, users) per seat, or 0.

It seems that the directory /run/user/1000 exists once per user (uid 1000 in this case).

Since the user shepherd creates /run/user/1000/shepherd/socket, that means we assume there's one shepherd per user (NOT per session). In my opinion that makes sense for timers (and maybe log rotation) and for ssh-agent. Not sure about pipewire (maybe?).

Likewise, since the user dbus uses /run/user/1000/bus , it is one dbus instance per USER, not session.
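
To make the "per USER, not per session" point concrete, here is a minimal sketch in Python (not what shepherd or dbus actually run). The helper names are made up for illustration; only the two paths come from the text above. The thing to notice is that the uid is the only input, so every session of the same user ends up at the same socket.

```python
import os


def user_runtime_dir(uid: int) -> str:
    # Hypothetical helper: the per-user runtime directory, keyed by uid only.
    return f"/run/user/{uid}"


def shepherd_socket_path(uid: int) -> str:
    # Path of the user shepherd's socket, as named in the text above.
    return os.path.join(user_runtime_dir(uid), "shepherd", "socket")


def user_bus_path(uid: int) -> str:
    # Path of the user dbus socket, as named in the text above.
    return os.path.join(user_runtime_dir(uid), "bus")


if __name__ == "__main__":
    # No session id appears anywhere: two sessions of uid 1000 share both paths.
    print(shepherd_socket_path(1000))
    print(user_bus_path(1000))
```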

If you want a session dbus, that's something you can do--but then you have to forward dbus messages between user dbus and session dbus (the latter you have to start manually with a random socket, NOT with /run/user/1000/bus).

Let's keep that in mind when designing services.

If we don't have tests for that, we should totally add tests where two sessions of the same user are opened (without closing in between), then one is closed, and then the remaining session is used.

Note: elogind has a "rm -rf /run/user/1000" that runs UserStopDelaySec after all of that user's sessions have ended[2].
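
The scenario such a test should cover can be sketched as a toy model in Python. This is not elogind code: the class and method names are invented, and the stop delay is ignored entirely. It only encodes the rule described above, that the runtime directory goes away once the user's last session has ended.

```python
class UserRuntime:
    """Toy model: one runtime directory per user, shared by all sessions."""

    def __init__(self, uid: int) -> None:
        self.uid = uid
        self.sessions: set[str] = set()
        self.runtime_dir_exists = False

    def open_session(self, session_id: str) -> None:
        # The first session creates the directory; later ones reuse it.
        self.sessions.add(session_id)
        self.runtime_dir_exists = True

    def close_session(self, session_id: str) -> None:
        # Only the end of the LAST session removes the directory
        # (the real thing may wait for a delay first; not modeled here).
        self.sessions.discard(session_id)
        if not self.sessions:
            self.runtime_dir_exists = False


if __name__ == "__main__":
    user = UserRuntime(1000)
    user.open_session("c1")
    user.open_session("c2")
    user.close_session("c1")
    # The remaining session must still find the directory (and the sockets in it).
    assert user.runtime_dir_exists
    user.close_session("c2")
    assert not user.runtime_dir_exists
```

A real test would open two actual sessions and check that the user shepherd and user dbus still answer after the first session is closed.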

Note: elogind-configuration's "kill-user-processes?" actually kills all that user's processes OF THAT CGROUP on SESSION exit. What cgroup do we use in guix home? Which one for the user shepherd?

[1] shepherd has (had?) a check whether the socket was already there and if so, it would (correctly) NOT remove it and then (correctly) would fail startup. That way, there can be at most one shepherd running per user (not per session).
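
The behaviour described in [1] can be sketched in Python as follows. This is not shepherd's actual code (shepherd is written in Guile), and start_listener is a made-up name. It relies only on the fact that bind() on a Unix socket fails if the path already exists, so never unlinking an existing socket file means a second instance cannot start.

```python
import os
import socket


def start_listener(path: str) -> socket.socket:
    """Bind a Unix socket at path; fail instead of replacing an existing one."""
    os.makedirs(os.path.dirname(path), exist_ok=True)
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        # Deliberately no os.unlink(path) first: if the file is already
        # there, bind() raises OSError (EADDRINUSE) and startup fails.
        server.bind(path)
    except OSError:
        server.close()
        raise
    server.listen(1)
    return server


if __name__ == "__main__":
    import tempfile

    path = os.path.join(tempfile.mkdtemp(), "shepherd", "socket")
    first = start_listener(path)
    try:
        start_listener(path)
    except OSError as err:
        print("second instance refused:", err)
    first.close()
```

The flip side of this design is that a socket file left behind by a crashed instance also blocks startup until someone removes it.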

[2] elogind source has: "#if 0 /// elogind does not start a user service manager, the delay is unneeded."

elogind not starting a user (not session) service manager seems ill-advised.
