On Thu, 2015-10-15 at 19:45 +0200, Michael Biebl wrote:
> Btw, in both cases (#770135 and #793814) the users have been restarting
> the dbus daemon. You mentioned that you did not do that.
>
> So I'm not sure if it's actually the same issue after all and your
> problem should be tracked as a separate issue.

I'm starting to think it's a logind robustness problem. I traced back
through the logs to the first instance after reboot and this is what I
find:

Oct 11 04:17:01 bedivere CRON[2185]: pam_unix(cron:session): session opened for user root by (uid=0)
Oct 11 04:17:01 bedivere CRON[2185]: pam_unix(cron:session): session closed for user root
Oct 11 04:17:40 bedivere sshd[2193]: Accepted publickey for clh15 from 184.11.141.41 port 38172 ssh2: DSA SHA256:UK5h9LRMtBthvW0Ncv1SG4WRmSFNs1hPcowPzzyt+iY
Oct 11 04:17:40 bedivere sshd[2193]: pam_unix(sshd:session): session opened for user clh15 by (uid=0)
Oct 11 04:17:40 bedivere systemd-logind[640]: New session 916 of user clh15.
Oct 11 04:17:40 bedivere sshd[2195]: Accepted publickey for clh15 from 184.11.141.41 port 38173 ssh2: DSA SHA256:UK5h9LRMtBthvW0Ncv1SG4WRmSFNs1hPcowPzzyt+iY
Oct 11 04:17:40 bedivere sshd[2195]: pam_unix(sshd:session): session opened for user clh15 by (uid=0)
Oct 11 04:17:40 bedivere systemd-logind[640]: New session 917 of user clh15.
Oct 11 04:17:40 bedivere sshd[2195]: pam_systemd(sshd:session): Failed to create session: Message recipient disconnected from message bus without replying
Oct 11 04:17:40 bedivere systemd-logind[640]: Failed to abandon session scope: Transport endpoint is not connected
Oct 11 04:17:40 bedivere sshd[2204]: Received disconnect from 184.11.141.41: 11: disconnected by user
Oct 11 04:17:40 bedivere sshd[2204]: Disconnected from 184.11.141.41
Oct 11 04:17:40 bedivere sshd[2193]: pam_unix(sshd:session): session closed for user clh15
Oct 11 04:17:40 bedivere sshd[2203]: Received disconnect from 184.11.141.41: 11: disconnected by user
Oct 11 04:17:40 bedivere sshd[2203]: Disconnected from 184.11.141.41
Oct 11 04:17:40 bedivere sshd[2195]: pam_unix(sshd:session): session closed for user clh15
Oct 11 04:18:05 bedivere sshd[2193]: pam_systemd(sshd:session): Failed to release session: Connection timed out
Oct 11 04:18:05 bedivere systemd-logind[640]: Failed to abandon session scope: Transport endpoint is not connected
Oct 11 04:18:05 bedivere dbus[698]: [system] Failed to activate service 'org.freedesktop.login1': timed out

Thereafter everything fails to activate org.freedesktop.login1.
However, it looks like the trouble is here:

Oct 11 04:17:40 bedivere sshd[2195]: pam_systemd(sshd:session): Failed to create session: Message recipient disconnected from message bus without replying

Logind is still active and replies later in the trace, so it looks like
dbus either dropped a message or did some type of unexpected disconnect.
After this, logind works until it can't abandon the session, then it
never replies on the bus again. So I suspect somewhere in the error
handling inside logind it doesn't cope with unexpected loss of dbus
messages.

James
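
For anyone who wants to check their own logs for the same sequence, the trace-back described above can be sketched in Python. This is only an illustration: the function name `first_failures` is hypothetical, and the two regular expressions are assumptions based solely on the wording of the log lines quoted above, so other pam_systemd or dbus versions may phrase these messages differently.

```python
import re
import sys

# Assumed patterns, copied from the wording of the quoted log lines.
CREATE_FAILED = re.compile(r"pam_systemd\(.*\): Failed to create session")
ACTIVATE_FAILED = re.compile(
    r"Failed to activate service 'org\.freedesktop\.login1'"
)


def first_failures(lines):
    """Return (first create-session failure, first later activation failure).

    Either element is None if no matching line was seen.
    """
    create = None
    activate = None
    for line in lines:
        if create is None:
            if CREATE_FAILED.search(line):
                create = line.rstrip("\n")
        elif ACTIVATE_FAILED.search(line):
            # Only count activation failures that follow the first
            # create-session failure, matching the order in the trace.
            activate = line.rstrip("\n")
            break
    return create, activate


if __name__ == "__main__":
    # Reads syslog-style text from standard input.
    create, activate = first_failures(sys.stdin)
    print("first create-session failure:", create)
    print("first login1 activation failure:", activate)
```

Feeding it the combined system log on standard input should print the two lines that bracket the point where logind stops answering, if the messages match the assumed patterns.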