On 02.03.26 at 12:52, Nicolas George wrote:
Paul Leiber (HE12026-03-02):
There are no open processes of the user after logout. (The line <logout
wait="2000" hup="0" term="1" kill="1"/> should have taken care of that
anyway, if I am not mistaken.)
You might think that. But I have a slew of computers where KDE, and
GNOME before that, leaves processes lingering long after the user has
logged out, keeping the session nominally open. And filling up /var
because we log running processes regularly.
I checked with ps -u username -U username after logging out, and it gave an 
empty result. Is there something else I could check?

It is on Ubuntu 22, it does not seem to happen on Trixie, but I have not
yet deployed it.

I still think that the problem lies with pmvarrun not being called to
decrease the entry for a user at logout.
I have a slightly different diagnosis. See below.

Here is an excerpt of my journal:
Two pieces of advice about it:

- Avoid localizing the dates in your journal, especially if it causes
   non-ASCII characters to be present. If you got that from journalctl,
   then it was localized only when printing, so you need to worry about
   it only when sharing excerpts like that.
Thank you for the advice. I'll try to find a way to not localize the date in 
the future.
   As a general principle, localizing system messages means you can get
   confusing information from the message itself or from its
   localization, not just from the message itself.
- Avoid line-rewrapping on log messages. I had to
   `%s/> Mär.*\zs\n> \ze[^M]/` multiple times to read it properly.
I think this was Thunderbird; I believe I have switched off line wrapping now.
Mär 02 10:22:44 computer login[13770]: command: '/usr/sbin/pmvarrun' '-u' 'xxx'
Mär 02 10:22:44 computer login[13770]: (pam_mount.c:441): pmvarrun says login count is 17
Mär 02 10:22:47 computer login[13770]: command: '/usr/sbin/pmvarrun' '-u' 'xxx'
Mär 02 10:22:47 computer login[13770]: (pam_mount.c:441): pmvarrun says login count is 18
What I am seeing is pam called from login calling pmvarrun twice the
same way with options `-u xxx`.

But in the source code, I see:

        modify_pm_count(&Config, Config.user, "1");

in pam_sm_open_session() and

        if (modify_pm_count(&Config, Config.user, "-1") > 0)

in pam_sm_close_session(), and it connects to:

        {CMD_PMVARRUN,   NULL,     {"pmvarrun", "-u", "%(USER)", "-o", "%(OPERATION)", NULL}},

So what we should be seeing is:

        command: '/usr/sbin/pmvarrun' '-u' 'xxx' '-o' '1'
        command: '/usr/sbin/pmvarrun' '-u' 'xxx' '-o' '-1'

If pam-mount forgets the -o option, then it defaults to 1, and it would
explain the mount count increasing each time.

But it needs to be confirmed, because it could also just be a bug /
misfeature of the code that logs the command. There is an easy way to
check: just before you validate your password, from another terminal,
run as root:

strace -f -s 10000 -e execve -p $(pidof login | tr ' ' ,) -o /tmp/strace_login

Translation:
strace → dump the system calls done by processes;
-f → including subprocesses after a fork;
-s 10000 → with a very high limit before strings and lists get truncated;
-e execve → only the system call that starts a new program;
-p … → trace these processes;
$(pidof login …) → these processes = the login processes;
| tr ' ' , → replace spaces with commas in case there are several;
-o … → save the output in that file.

Then you look at the file to see how pmvarrun is invoked.
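
If the file is long, the relevant lines can be filtered out. The
following is only a sketch: check_pmvarrun is a made-up helper name, and
it assumes the output path /tmp/strace_login from the command above and
the /usr/sbin/pmvarrun path shown in the journal.

```shell
#!/bin/sh
# Hypothetical helper (not part of pam_mount or strace): list every
# execve of pmvarrun found in an strace log and say whether "-o" was
# among its arguments.
check_pmvarrun() {
    # Assumed default: the file written by the strace command above.
    log="${1:-/tmp/strace_login}"
    # Keep only the execve lines that start pmvarrun.
    grep -F 'execve("/usr/sbin/pmvarrun"' "$log" | while IFS= read -r line; do
        case "$line" in
            *'"-o"'*) echo "with -o: $line" ;;
            *)        echo "without -o: $line" ;;
        esac
    done
}
```

Calling check_pmvarrun without an argument reads /tmp/strace_login; a
different file can be given as the first argument.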
For login:

9414  execve("/usr/sbin/pmvarrun", ["/usr/sbin/pmvarrun", "-u", "xxx"], 0x55ec223822c0 /* 11 vars */) = 0

For logout:

9483  execve("/usr/sbin/pmvarrun", ["/usr/sbin/pmvarrun", "-u", "xxx"], 0x55ec223822c0 /* 17 vars */) = 0

If you see it invoked with the -o option, then it is a false lead and
the issue is somewhere else.

If you see it invoked as shown in the logs, then we have narrowed down
the issue, and somebody needs to debug the source code to see where the
-o option gets lost. (But we have reached the limit of the investigation
that I was willing to do.)
The output matches the second description. I am not sure whether I will be able
to find the bug myself, given my limited knowledge; we'll see. In any case,
would an appropriate next step be filing a bug report in the Debian bugtracker?

Thanks for your efforts, Nicolas, you helped me a lot!

Paul
