The root cause of this problem is indeed plymouth.

The 'chvt N' command blocks if the VT/tty is in VT_AUTO + KD_GRAPHICS
state.

In this state the kernel bails out early in the ioctl(VT_ACTIVATE) syscall
and does not post the VT_EVENT_SWITCH event the ioctl(VT_WAITACTIVE) syscall
will be waiting for -- causing chvt to block.

The function path is:

  vt_ioctl(tty0, VT_ACTIVATE, ...)
  -> set_console()
     -> if (... VT_AUTO && KD_GRAPHICS ...) return -EINVAL; <<-- bails out.
     -> schedule_console_callback(); return 0; <<-- continue to send event.
        -> console_callback()
           -> change_console()
              -> complete_change_console()
                 -> vt_event_post(VT_EVENT_SWITCH, ...)

  vt_ioctl(tty0, VT_WAITACTIVE, ...)
  -> vt_waitactive()
     -> __vt_event_wait(VT_EVENT_SWITCH) <<-- blocks/wait to receive event.     
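
A blocked chvt can be recognized from its kernel stack. The following is
only a minimal sketch: vt_wait_state is a hypothetical helper name, it
expects text in the /proc/PID/stack format on stdin, and it merely looks
for the __vt_event_wait frame from the second path above.

```shell
# vt_wait_state (hypothetical helper): reads a kernel stack dump in the
# /proc/PID/stack format on stdin and reports whether any frame is in
# __vt_event_wait, i.e. the task is waiting for a VT switch event.
vt_wait_state() {
    if grep -q '__vt_event_wait'; then
        echo blocked-in-vt-wait
    else
        echo not-in-vt-wait
    fi
}

# Possible use on a live system (reading the stack file needs root):
#   sudo cat "/proc/$(pidof chvt)/stack" | vt_wait_state
```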
  

gdm properly sets the VT out of VT_AUTO mode (which causes chvt not to block)
after it tells plymouth to deactivate.

BUT plymouth can set it back to VT_AUTO mode afterward, regardless, while it
handles the udev event of the DRM/DRI graphics card addition, as that causes
the VT/tty to be reconfigured.

This can be verified with plymouth debugging, e.g., kernel boot option
'plymouth.debug=file:/run/plymouth.debug', plus source code inspection:

1) gdm calls 'plymouth deactivate', which calls
   ply_terminal_close()
    -> ply_terminal_stop_watching_for_vt_changes()
       -> if (terminal->is_watching_for_vt_changes == true) ioctl(VT_SETMODE, VT_AUTO)
       -> terminal->is_watching_for_vt_changes = false

 [ply-boot-server.c:LINE] print_connection_process_identity:connection is from pid PID (/bin/plymouth deactivate) with parent pid PID (/usr/sbin/gdm)
...
 [ply-terminal.c:LINE] ply_terminal_close:restoring color palette
 [ply-terminal.c:LINE] ply_terminal_close:stop watching tty fd
...

2) plymouth udev event timeout expires, it notices the DRM/DRI devices,
   and re-enables the VT watching while processing those; in the calls:

   ply_terminal_open()
   -> ply_terminal_watch_for_vt_changes()
      -> terminal->is_watching_for_vt_changes = true;

 [ply-device-manager.c:LINE] create_devices_from_udev:Timeout elapsed, looking for devices from udev
 ...
 [ply-device-manager.c:LINE] create_devices_for_terminal_and_renderer_type:creating devices for /dev/dri/card0 (renderer type: 1) (terminal: /dev/tty1)
 ...
 [./plugin.c:LINE] load_driver:Opening '/dev/dri/card0'
 [ply-terminal.c:LINE] ply_terminal_open:trying to open terminal '/dev/tty1'


3) init calls 'plymouth quit --retain-splash' which goes into
   ply_terminal_close() again, and since watching is true, it
   sets the VT into VT_AUTO again (see calls in #1 above) ...
   (*after* gdm had already set the VT up out of VT_AUTO).

 [ply-boot-server.c:LINE] print_connection_process_identity:connection is from pid PID (/bin/plymouth quit --retain-splash) with parent pid PID (/sbin/init splash)
...
 [ply-terminal.c:LINE] ply_terminal_close:restoring color palette
 [ply-terminal.c:LINE] ply_terminal_close:stop watching tty fd
....
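
The ordering in steps 1 and 2 can be looked for in a plymouth debug log.
This is a rough sketch, not a definitive check: the function name is
hypothetical, it assumes each message sits on a single line in the log
file, and it only matches the message names quoted above.

```shell
# vt_reopened_after_deactivate (hypothetical helper): reads a plymouth
# debug log on stdin and prints "reopened" if a ply_terminal_open message
# appears after a 'plymouth deactivate' client connection, otherwise
# "not-reopened". Assumes one log message per line.
vt_reopened_after_deactivate() {
    awk '
        /print_connection_process_identity/ && /deactivate/ { deactivated = 1 }
        deactivated && /ply_terminal_open:/ { reopened = 1 }
        END { print (reopened ? "reopened" : "not-reopened") }
    '
}

# Possible use, with the debug file path from the boot option above:
#   vt_reopened_after_deactivate < /run/plymouth.debug
```

A "reopened" result only shows that the terminal was opened again after
the deactivate request; whether VT_AUTO is then restored still depends on
the later ply_terminal_close() call described in step 3.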


This sequence depends on timing (plymouth udev event watch timeout + device
detection), which probably explains why the problem does not happen every
single time (though it does happen often; the problem reproduces most of the
time in this KVM guest with Ubuntu 18.04.2 Desktop).

After understanding that this behavior / code path is responsible for the
problem, I found there's an upstream fix for this in plymouth, which makes
it stop reacting to udev events after 'plymouth deactivate', and that
prevents it from setting VT_AUTO mode again.

Interestingly, this fix is already applied in Ubuntu Cosmic and later, for
LP: #1795637, due to a different problem (it causes wayland/xorg to fail).

The patch needed just a small refresh to apply to Bionic, and a test build
with it applied passes both 'chvt' tests below, repeatedly:
1) While gdm is in the login screen, 'ssh <guest> -- sudo chvt 4'
2) With gdm autologin, try the same.
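
To keep such a test from hanging the shell, chvt can be run under
timeout(1). This is a sketch under assumptions: run_with_deadline is a
hypothetical helper name, and it assumes GNU coreutils timeout, which
exits with 124 when the time limit expires (or 137 if the follow-up
SIGKILL from -k was needed).

```shell
# run_with_deadline SECS CMD... (hypothetical helper): runs CMD and prints
# "finished" if it exits within SECS seconds, or "blocked" if timeout(1)
# had to stop it. CMD's own output is discarded.
run_with_deadline() {
    secs=$1
    shift
    if timeout -k 2 "$secs" "$@" >/dev/null 2>&1; then
        status=0
    else
        status=$?
    fi
    # 124: time limit expired; 137: SIGKILL was needed after the -k grace period.
    if [ "$status" -eq 124 ] || [ "$status" -eq 137 ]; then
        echo blocked
    else
        echo finished
    fi
}

# Possible use as root on the guest:
#   run_with_deadline 10 chvt 4
```

Note that "finished" only means the command exited in time, not that it
succeeded, so a failing chvt (for example without root) also prints it.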

Possible workarounds for this are disabling plymouth (remove the 'splash'
option from kernel/grub boot options) OR setting the kernel console to
a device other than tty0/tty1 (check it with 'dmesg | grep console'),
for example, console=ttyS0 or console=ttyS1 (serial/non-graphic consoles).
This causes plymouth not to mess with the VT used by gdm (a graphic one).
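
Whether a system is in the affected configuration can be estimated from
its kernel command line. This is only a heuristic built from the
description above, not plymouth's actual logic, and cmdline_verdict is a
hypothetical helper name: it reports "likely-affected" when 'splash' is
present and no console= argument names a device other than tty0/tty1.

```shell
# cmdline_verdict "CMDLINE" (hypothetical helper): a heuristic, not
# plymouth's real decision logic. Prints "likely-affected" when 'splash'
# is present and every console= argument (if any) is tty0 or tty1, and
# "likely-unaffected" otherwise.
cmdline_verdict() {
    splash=no
    other_console=no
    for arg in $1; do    # intentionally unquoted: split on whitespace
        case $arg in
            splash) splash=yes ;;
            console=tty0|console=tty1) ;;
            console=*) other_console=yes ;;
        esac
    done
    if [ "$splash" = yes ] && [ "$other_console" = no ]; then
        echo likely-affected
    else
        echo likely-unaffected
    fi
}

# Possible use:
#   cmdline_verdict "$(cat /proc/cmdline)"
```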

** Description changed:

  [Impact]
  
- When AutomaticLogin is enable in gdm3. The "chvt" command hangs forever,
- preventing from changing foreground virtual terminal.
+ When AutomaticLogin is enabled in gdm3, or it is showing the login screen,
+ the "chvt" command blocks indefinitely (usually resumes with gdm3 restart).
+ 
+ This prevents users from changing the foreground virtual terminal, and
+ it can also prevent pm-suspend from completing (as it invokes chvt).
+ 
+ This problem happens in Bionic; it's already fixed in Cosmic and later.
+ 
+ This patch to plymouth stops it from reverting the VT/tty to VT_AUTO
+ (after gdm calls 'plymouth deactivate' and changes it to VT_PROCESS);
+ that revert causes the ioctl(VT_ACTIVATE) not to generate the event
+ that the ioctl(VT_WAITACTIVE) will block/wait on just afterward.
+ 
+ Workarounds are to either disable plymouth / remove 'splash' from the
+ kernel command line or change it to use a different/non-graphical VT
+ for console (console=tty0 [default] or console=tty1 [equivalent] are
+ affected, but console=ttyS0 or console=ttyS1 are not, being serial).
  
  [Test case]
  
- 1) Install Bionic/18.04LTS Desktop
+ 1) Install Bionic/18.04 LTS Desktop
  
- 2) Enable AutomaticLogin
-  2.1) Modify /etc/gdm3/custom.conf
- # Enabling automatic login
-   AutomaticLoginEnable = true
-   AutomaticLogin = <YOUR_USER>
+ 2) Ensure plymouth / 'splash' is enabled (default)
  
- 3) Reboot your system and make sure AutoLogin works by not requesting
+ $ grep splash /proc/cmdline
+ BOOT_IMAGE=... root=... splash ...
+ 
+ 3) Ensure console is tty0 (default) or tty1 
+ $ dmesg | grep console
+ [    0.004000] console [tty0] enabled
+ 
+ 
+ A) Login screen, regardless of automatic login
+ 
+    4) Ensure the login screen (tty1) is being displayed
+       (i.e., it's the foreground/active VT) or change to it:
+ 
+       $ sudo chvt 1 # this works/finishes.
+       $
+ 
+    5) $ ssh <SYSTEM> 'sudo fgconsole' # check tty1 is foreground VT
+       1
+ 
+    6) $ ssh <SYSTEM> 'sudo chvt 4' # this blocks/doesn't finish
+ 
+ 
+ B) Automatic login, regardless of login screen
+ 
+    4) Enable AutomaticLogin in /etc/gdm3/custom.conf
+      [daemon]
+      AutomaticLoginEnable = true
+      AutomaticLogin = <YOUR_USER>
+ 
+    5) Reboot your system and make sure AutoLogin works by not requesting
  password before opening the <YOUR_USER> session.
  
- 4) Print active VT
- $ sudo fgconsole
+    6) Print active VT
+       (in Bionic, autologin user session runs on tty1)
  
- Without the fix, it will be "1". # BAD
- With the fix, it will be "2". # GOOD
+       $ sudo fgconsole
+       1
  
- 5) sudo chvt 4 ## chvt will hang here.
+    7) sudo chvt 4 # this blocks/doesn't finish
  
- Verification can be made from a 2nd terminal, run :
+ 
+ From SSH one can check that chvt is blocked waiting
+ on the new VT to become active, which doesn't happen in
+ this case (old VT in VT_AUTO + KD_GRAPHICS mode):
+ 
  $ cat /proc/$(pidof chvt)/stack
  [<0>] __vt_event_wait.isra.2.part.3+0x40/0x90
  [<0>] vt_waitactive+0x80/0xd0
  [<0>] vt_ioctl+0xd26/0x1140
  [<0>] tty_ioctl+0xf6/0x8c0
  [<0>] do_vfs_ioctl+0xa8/0x630
  [<0>] SyS_ioctl+0x79/0x90
  [<0>] do_syscall_64+0x73/0x130
  [<0>] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
  [<0>] 0xffffffffffffffff
  
- It's basically waiting for the VT to be activated, but it never happens.
- 
  [Potential regression]
  
- Low.
+ Low.  This plymouth patch is upstream and it's already applied
+ in Cosmic and later for ~6 months (0.9.3-1ubuntu10 / Oct 2018)
+ for LP: #1795637 (different problem/effect, same root cause).
  
- Current gdm3 run autologin display on tty1. tty1 is really meant for the
- login screen. This commit changes autologin to not use the initial vt.
- 
- If one switch to tty1, the next VT switch attempt will hangs again and
- one won't be able to switch it. tty1 is really the problem here, so by
- forcing the autologin to not use tty1 we improve the current behaviour
- where we can't switch VTs at all when autologin is enabled.
- 
- The tty1 behaviour will still need (normal behaviour or not ???) to be
- investigated, but not mandatory required for the sake of this SRU IMHO.
- 
- I suspect systemd-logind to be the reason of the tty1 behaviour:
- 
- # ps
- root      1350     1  0 Mar03 ?        00:00:03 /lib/systemd/systemd-logind
- 
- #lsof
- systemd-l  1350                   root   24u      CHR                4,1       0t0         45 /dev/tty1
- 
- But I haven't dig much in it for now.
- 
- So the fix will works as long as one doesn't do run on tty1.
- 
- Exactly like when autologin isn't enable.
- 
- * From a machine with autologin enable:
- 
- /etc/gdm3/customer.conf
- # Enabling automatic login
-   AutomaticLoginEnable = true
-   AutomaticLogin = user1
- 
- $ sudo fgconsole
- 1
- 
- * From a machine with autologin disable:
- 
- /etc/gdm3/customer.conf
- # Enabling automatic login
- #  AutomaticLoginEnable = true
- #  AutomaticLogin = user1
- 
- $ sudo fgconsole
- 2
- 
- [Other information]
- 
- * Upstream fix:
- https://github.com/GNOME/gdm/commit/39fb4ff6
- 
- $ git describe --contains 39fb4ff6
- 3.30.1~2^2~3
- 
- $ rmadision gdm3
-  ==> gdm3 | 3.28.3-0ubuntu18.04.4       | bionic-updates   | ...
-      gdm3 | 3.30.1-1ubuntu5             | cosmic           | ...
-      gdm3 | 3.30.1-1ubuntu5             | disco            | ...
-      gdm3 | 3.30.1-1ubuntu5.1           | cosmic-security  | ...
-      gdm3 | 3.30.1-1ubuntu5.1           | cosmic-updates   | ...
-      gdm3 | 3.31.4+git20190225-1ubuntu1 | disco-proposed   | ...
+ Besides, it's conservative in nature, and its spirit makes a
+ lot of sense (stop handling more udev events after deactivate).
+ There are no additional fixes to its code changes upstream.
  
  [Original Description]
- sudo strace chvt 4
- execve("/bin/chvt", ["chvt", "4"], 0x7ffd63e5c758 /* 17 vars */) = 0
- brk(NULL) = 0x561e18430000
- access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
- access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
- openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
- fstat(3, {st_mode=S_IFREG|0644, st_size=74655, ...}) = 0
- mmap(NULL, 74655, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f5059e7d000
- close(3) = 0
- access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
- openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
- read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\260\34\2\0\0\0\0\0"..., 832) = 832
- fstat(3, {st_mode=S_IFREG|0755, st_size=2030544, ...}) = 0
- mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5059e7b000
- mmap(NULL, 4131552, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f5059878000
- mprotect(0x7f5059a5f000, 2097152, PROT_NONE) = 0
- mmap(0x7f5059c5f000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1e7000) = 0x7f5059c5f000
- mmap(0x7f5059c65000, 15072, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f5059c65000
- close(3) = 0
- arch_prctl(ARCH_SET_FS, 0x7f5059e7c500) = 0
- mprotect(0x7f5059c5f000, 16384, PROT_READ) = 0
- mprotect(0x561e17e87000, 4096, PROT_READ) = 0
- mprotect(0x7f5059e90000, 4096, PROT_READ) = 0
- munmap(0x7f5059e7d000, 74655) = 0
- brk(NULL) = 0x561e18430000
- brk(0x561e18451000) = 0x561e18451000
- openat(AT_FDCWD, "/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 3
- fstat(3, {st_mode=S_IFREG|0644, st_size=10281936, ...}) = 0
- mmap(NULL, 10281936, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f5058ea9000
- close(3) = 0
- openat(AT_FDCWD, "/proc/self/fd/0", O_RDWR) = 3
- ioctl(3, TCGETS, {B38400 opost isig icanon echo ...}) = 0
- ioctl(3, KDGKBTYPE, 0x7ffdcdb0efa7) = -1 ENOTTY (Inappropriate ioctl for device)
- close(3) = 0
- openat(AT_FDCWD, "/dev/tty", O_RDWR) = 3
- ioctl(3, TCGETS, {B38400 opost isig icanon echo ...}) = 0
- ioctl(3, KDGKBTYPE, 0x7ffdcdb0efa7) = -1 ENOTTY (Inappropriate ioctl for device)
- close(3) = 0
+ 
+ $ sudo strace chvt 4
+ <...>
  openat(AT_FDCWD, "/dev/tty0", O_RDWR) = 3
  ioctl(3, TCGETS, {B38400 opost isig icanon echo ...}) = 0
  ioctl(3, KDGKBTYPE, 0x7ffdcdb0efa7) = 0
  ioctl(3, VT_ACTIVATE, 0x4) = 0
  ioctl(3, VT_WAITACTIVE, 0x4
  
  VT_ACTIVATE will cause a switch to VT number.
  VT_WAITACTIVE will sleep/wait until the specified VT has been activated.
  
  $ sudo cat /proc/$(pidof chvt)/stack
  [<0>] __vt_event_wait.isra.2.part.3+0x40/0x90
  [<0>] vt_waitactive+0x80/0xd0
  [<0>] vt_ioctl+0xd26/0x1140
  [<0>] tty_ioctl+0xf6/0x8c0
  [<0>] do_vfs_ioctl+0xa8/0x630
  [<0>] SyS_ioctl+0x79/0x90
  [<0>] do_syscall_64+0x73/0x130
  [<0>] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
  [<0>] 0xffffffffffffffff
- 
- Enable debuglogs doesn't provide additional details.
- 
- As soon as auto-login is turned off, chvt is back to normal.
- 
- The above has been reproduced on Ubuntu:
- - Ubuntu Bionic w/ gdm3 3.28.3 & kbd 2.0.4

-- 
You received this bug notification because you are a member of Ubuntu
Desktop Bugs, which is subscribed to gdm3 in Ubuntu.
https://bugs.launchpad.net/bugs/1817738

Title:
  Can't change virtual terminal on login screen or when auto-login is
  enabled

