The root cause of this problem is plymouth, indeed. The 'chvt N' command blocks if the VT/tty is in VT_AUTO + KD_GRAPHICS state.
In this state the kernel bails out early in the ioctl(VT_ACTIVATE) syscall and does not post the VT_EVENT_SWITCH event that the ioctl(VT_WAITACTIVE) syscall will be waiting for, which causes chvt to block.

The function path is:

  vt_ioctl(tty0, VT_ACTIVATE, ...)
  -> set_console()
     -> if (... VT_AUTO && KD_GRAPHICS ...) return -EINVAL; <<-- bails out.
     -> schedule_console_callback(); return 0; <<-- continue to send event.
  -> console_callback()
  -> change_console()
  -> complete_change_console()
  -> vt_event_post(VT_EVENT_SWITCH, ...)

  vt_ioctl(tty0, VT_WAITACTIVE, ...)
  -> vt_waitactive()
  -> __vt_event_wait(VT_EVENT_SWITCH) <<-- blocks/waits to receive event.

gdm properly sets the VT out of VT_AUTO mode (which causes chvt not to block) after it tells plymouth to deactivate.

BUT plymouth can set it back to VT_AUTO mode afterward, regardless, while it handles the udev event of the DRM/DRI graphics card addition, as that causes the VT/tty to be reconfigured.

This can be verified with plymouth debugging, e.g., kernel boot option 'plymouth.debug=file:/run/plymouth.debug', plus source code inspection:

1) gdm calls 'plymouth deactivate', which calls

   ply_terminal_close()
   -> ply_terminal_stop_watching_for_vt_changes()
      -> if (terminal->is_watching_for_vt_changes == true) ioctl(VT_SETMODE, VT_AUTO)
      -> terminal->is_watching_for_vt_changes = false

   [ply-boot-server.c:LINE] print_connection_process_identity:connection is from pid PID (/bin/plymouth deactivate) with parent pid PID (/usr/sbin/gdm)
   ...
   [ply-terminal.c:LINE] ply_terminal_close:restoring color palette
   [ply-terminal.c:LINE] ply_terminal_close:stop watching tty fd
   ...

2) plymouth udev event timeout expires, it notices the DRM/DRI devices, and re-enables the VT watching while processing those; in the calls:

   ply_terminal_open()
   -> ply_terminal_watch_for_vt_changes()
      -> terminal->is_watching_for_vt_changes = true;

   [ply-device-manager.c:LINE] create_devices_from_udev:Timeout elapsed, looking for devices from udev
   ...
   [ply-device-manager.c:LINE] create_devices_for_terminal_and_renderer_type:creating devices for /dev/dri/card0 (renderer type: 1) (terminal: /dev/tty1)
   ...
   [./plugin.c:LINE] load_driver:Opening '/dev/dri/card0'
   [ply-terminal.c:LINE] ply_terminal_open:trying to open terminal '/dev/tty1'

3) init calls 'plymouth quit --retain-splash' which goes into ply_terminal_close() again, and since watching is true, it sets the VT into VT_AUTO again (see calls in #1 above) ... (*after* gdm had already set the VT up out of VT_AUTO).

   [ply-boot-server.c:LINE] print_connection_process_identity:connection is from pid PID (/bin/plymouth quit --retain-splash) with parent pid PID (/sbin/init splash)
   ...
   [ply-terminal.c:LINE] ply_terminal_close:restoring color palette
   [ply-terminal.c:LINE] ply_terminal_close:stop watching tty fd
   ....

That depends on timing (plymouth udev event watch timeout + device detection), and this probably explains why the problem does not happen every single time (but apparently it's often the case; the problem reproduces most of the time in this KVM guest with Ubuntu 18.04.2 Desktop).

After understanding that this behavior / code path is responsible for the problem, I found there's an upstream fix for this in plymouth, which realizes that after 'plymouth deactivate' the udev events should not be reacted to, which prevents re-setting the VT_AUTO mode.

Interestingly, this fix is already applied in Ubuntu Cosmic and later, for LP: #1795637, due to a different problem (it causes wayland/xorg to fail).

The patch needed just a small refresh to apply to Bionic, and a test kernel with it applied successfully passes all 'chvt' tests, multiple times.

1) While gdm is in the login screen, 'ssh <guest> -- sudo chvt 4'
2) With gdm autologin, try the same.
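To check whether a system is currently in the state described above before running chvt, something like the following could be used. This is only a minimal sketch and not part of the plymouth patch: the helper names (switch_would_be_refused, read_vt_state) are hypothetical, the ioctl numbers and mode values are copied from <linux/vt.h> and <linux/kd.h>, and it assumes /dev/tty0 refers to the foreground VT and is readable (typically as root).

```python
import fcntl
import struct

# ioctl request numbers and mode values, as defined in <linux/vt.h>
# and <linux/kd.h>.
VT_GETMODE = 0x5601
KDGETMODE = 0x4B3B
VT_AUTO = 0x00
VT_PROCESS = 0x01
KD_TEXT = 0x00
KD_GRAPHICS = 0x01

# struct vt_mode { char mode; char waitv; short relsig, acqsig, frsig; }
VT_MODE_STRUCT = struct.Struct("bbhhh")


def switch_would_be_refused(vt_mode, kd_mode):
    """Mirror the VT_AUTO && KD_GRAPHICS check quoted from set_console()."""
    return vt_mode == VT_AUTO and kd_mode == KD_GRAPHICS


def read_vt_state(path="/dev/tty0"):
    """Return (vt_mode, kd_mode) for the VT behind 'path'."""
    with open(path, "rb", buffering=0) as tty:
        raw = fcntl.ioctl(tty, VT_GETMODE, bytes(VT_MODE_STRUCT.size))
        vt_mode = VT_MODE_STRUCT.unpack(raw)[0]
        raw = fcntl.ioctl(tty, KDGETMODE, bytes(struct.calcsize("i")))
        kd_mode = struct.unpack("i", raw)[0]
    return vt_mode, kd_mode


if __name__ == "__main__":
    try:
        vt_mode, kd_mode = read_vt_state()
    except OSError as err:
        print("cannot query the VT state:", err)
    else:
        print("vt_mode=%d kd_mode=%d" % (vt_mode, kd_mode))
        if switch_would_be_refused(vt_mode, kd_mode):
            print("VT_AUTO + KD_GRAPHICS: 'chvt N' is expected to block")
        else:
            print("'chvt N' is not expected to block")
```

Run over SSH on an affected guest, this should report VT_AUTO + KD_GRAPHICS while gdm is on the foreground VT; with the patched plymouth the VT mode is expected to stay at VT_PROCESS.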
Possible workarounds for this are disabling plymouth (remove the 'splash' option from kernel/grub boot options) OR setting the kernel console to a device other than tty0/tty1 (check it with 'dmesg | grep console'), for example, console=ttyS0 or console=ttyS1 (serial/non-graphical consoles). This causes plymouth not to mess with the VT used by gdm (a graphical one).

** Description changed:

  [Impact]

- When AutomaticLogin is enable in gdm3. The "chvt" command hangs forever,
- preventing from changing foreground virtual terminal.
+ When AutomaticLogin is enabled in gdm3, or it is showing the login screen,
+ the "chvt" command blocks indefinitely (usually resumes with gdm3 restart).
+ 
+ This prevents users from changing the foreground virtual terminal, and it
+ can also prevent pm-suspend from completing (as it invokes chvt).
+ 
+ This problem happens in Bionic; it's already fixed in Cosmic and later.
+ 
+ This patch to plymouth helps it not to revert the VT/tty to VT_AUTO
+ (after gdm calls 'plymouth deactivate' and changes it to VT_PROCESS)
+ which causes the ioctl(VT_ACTIVATE) not to generate the event that
+ the ioctl(VT_WAITACTIVE) will block/wait on just afterward.
+ 
+ Workarounds are to either disable plymouth / remove 'splash' from the
+ kernel command line or change it to use a different/non-graphical VT
+ for console (console=tty0 [default] or console=tty1 [equivalent] are
+ affected, but console=ttyS0 or console=ttyS1 are not, being serial).

  [Test case]

- 1) Install Bionic/18.04LTS Desktop
+ 1) Install Bionic/18.04 LTS Desktop

- 2) Enable AutomaticLogin
- 2.1) Modify /etc/gdm3/custom.conf
- # Enabling automatic login
- AutomaticLoginEnable = true
- AutomaticLogin = <YOUR_USER>
+ 2) Ensure plymouth / 'splash' is enabled (default)

- 3) Reboot your system and make sure AutoLogin works by not requesting
+ $ grep splash /proc/cmdline
+ BOOT_IMAGE=... root=... splash ...
+ 
+ 3) Ensure console is tty0 (default) or tty1
+ $ dmesg | grep console
+ [ 0.004000] console [tty0] enabled
+ 
+ 
+ A) Login screen, regardless of automatic login
+ 
+ 4) Ensure the login screen/tty 1 is displaying
+ (i.e., it's the foreground/active VT) or change to it:
+ 
+ $ sudo chvt 1 # this works/finishes.
+ $
+ 
+ 5) $ ssh <SYSTEM> 'sudo fgconsole' # check tty1 is foreground VT
+ 1
+ 
+ 6) $ ssh <SYSTEM> 'sudo chvt 4' # this blocks/doesn't finish
+ 
+ 
+ B) Automatic login, regardless of login screen
+ 
+ 4) Enable AutomaticLogin in /etc/gdm3/custom.conf
+ [daemon]
+ AutomaticLoginEnable = true
+ AutomaticLogin = <YOUR_USER>
+ 
+ 5) Reboot your system and make sure AutoLogin works by not requesting
  password before opening the <YOUR_USER> session.

- 4) Print active VT
- $ sudo fgconsole
+ 6) Print active VT
+ (in Bionic, autologin user session runs on tty1)

- Without the fix, it will be "1". # BAD
- With the fix, it will be "2". # GOOD
+ $ sudo fgconsole
+ 1

- 5) sudo chvt 4 ## chvt will hang here.
+ 7) sudo chvt 4 # this blocks/doesn't finish

- Verification can be made from a 2nd terminal, run :
+ 
+ From SSH one can check that chvt is blocked waiting
+ on new VT to become active, which doesn't happen in
+ this case (old VT in VT_AUTO + KD_GRAPHICS mode):
+ $ cat /proc/$(pidof chvt)/stack
  [<0>] __vt_event_wait.isra.2.part.3+0x40/0x90
  [<0>] vt_waitactive+0x80/0xd0
  [<0>] vt_ioctl+0xd26/0x1140
  [<0>] tty_ioctl+0xf6/0x8c0
  [<0>] do_vfs_ioctl+0xa8/0x630
  [<0>] SyS_ioctl+0x79/0x90
  [<0>] do_syscall_64+0x73/0x130
  [<0>] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
  [<0>] 0xffffffffffffffff

- It's basically waiting for the VT to be activated, but it never happens.
- [Potential regression]
- Low.
+ Low. This plymouth patch is upstream and it's already applied
+ in Cosmic and later for ~6 months (0.9.3-1ubuntu10 / Oct 2018)
+ for LP: #1795637 (different problem/effect, same root cause).

- Current gdm3 run autologin display on tty1. tty1 is really meant for the
- login screen. This commit changes autologin to not use the initial vt.
- 
- If one switch to tty1, the next VT switch attempt will hangs again and
- one won't be able to switch it. tty1 is really the problem here, so by
- forcing the autologin to not use tty1 we improve the current behaviour
- where we can't switch VTs at all when autologin is enabled.
- 
- The tty1 behaviour will still need (normal behaviour or not ???) to be
- investigated, but not mandatory required for the sake of this SRU IMHO.
- 
- I suspect systemd-logind to be the reason of the tty1 behaviour:
- 
- # ps
- root 1350 1 0 Mar03 ? 00:00:03 /lib/systemd/systemd-logind
- 
- #lsof
- systemd-l 1350 root 24u CHR 4,1 0t0 45 /dev/tty1
- 
- But I haven't dig much in it for now.
- 
- So the fix will works as long as one doesn't do run on tty1.
- 
- Exactly like when autologin isn't enable.
- 
- * From a machine with autologin enable:
- 
- /etc/gdm3/customer.conf
- # Enabling automatic login
- AutomaticLoginEnable = true
- AutomaticLogin = user1
- 
- $ sudo fgconsole
- 1
- 
- * From a machine with autologin disable:
- 
- /etc/gdm3/customer.conf
- # Enabling automatic login
- # AutomaticLoginEnable = true
- # AutomaticLogin = user1
- 
- $ sudo fgconsole
- 2
- 
- [Other information]
- 
- * Upstream fix:
- https://github.com/GNOME/gdm/commit/39fb4ff6
- 
- $ git describe --contains 39fb4ff6
- 3.30.1~2^2~3
- 
- $ rmadision gdm3
- ==> gdm3 | 3.28.3-0ubuntu18.04.4 | bionic-updates | ...
- gdm3 | 3.30.1-1ubuntu5 | cosmic | ...
- gdm3 | 3.30.1-1ubuntu5 | disco | ...
- gdm3 | 3.30.1-1ubuntu5.1 | cosmic-security | ...
- gdm3 | 3.30.1-1ubuntu5.1 | cosmic-updates | ...
- gdm3 | 3.31.4+git20190225-1ubuntu1 | disco-proposed | ...
+ Besides, it's conservative in nature, and its spirit makes a
+ lot of sense (stop handling more udev events after deactivate).
+ There are no additional fixes to its code changes upstream.

  [Original Description]

- sudo strace chvt 4
- execve("/bin/chvt", ["chvt", "4"], 0x7ffd63e5c758 /* 17 vars */) = 0
- brk(NULL) = 0x561e18430000
- access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
- access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
- openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
- fstat(3, {st_mode=S_IFREG|0644, st_size=74655, ...}) = 0
- mmap(NULL, 74655, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f5059e7d000
- close(3) = 0
- access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
- openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
- read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\260\34\2\0\0\0\0\0"..., 832) = 832
- fstat(3, {st_mode=S_IFREG|0755, st_size=2030544, ...}) = 0
- mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5059e7b000
- mmap(NULL, 4131552, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f5059878000
- mprotect(0x7f5059a5f000, 2097152, PROT_NONE) = 0
- mmap(0x7f5059c5f000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1e7000) = 0x7f5059c5f000
- mmap(0x7f5059c65000, 15072, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f5059c65000
- close(3) = 0
- arch_prctl(ARCH_SET_FS, 0x7f5059e7c500) = 0
- mprotect(0x7f5059c5f000, 16384, PROT_READ) = 0
- mprotect(0x561e17e87000, 4096, PROT_READ) = 0
- mprotect(0x7f5059e90000, 4096, PROT_READ) = 0
- munmap(0x7f5059e7d000, 74655) = 0
- brk(NULL) = 0x561e18430000
- brk(0x561e18451000) = 0x561e18451000
- openat(AT_FDCWD, "/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 3
- fstat(3, {st_mode=S_IFREG|0644, st_size=10281936, ...}) = 0
- mmap(NULL, 10281936, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f5058ea9000
- close(3) = 0
- openat(AT_FDCWD, "/proc/self/fd/0", O_RDWR) = 3
- ioctl(3, TCGETS, {B38400 opost isig icanon echo ...}) = 0
- ioctl(3, KDGKBTYPE, 0x7ffdcdb0efa7) = -1 ENOTTY (Inappropriate ioctl for
- device)
- close(3) = 0
- openat(AT_FDCWD, "/dev/tty", O_RDWR) = 3
- ioctl(3, TCGETS, {B38400 opost isig icanon echo ...}) = 0
- ioctl(3, KDGKBTYPE, 0x7ffdcdb0efa7) = -1 ENOTTY (Inappropriate ioctl for device)
- close(3) = 0
+ 
+ $ sudo strace chvt 4
+ <...>
  openat(AT_FDCWD, "/dev/tty0", O_RDWR) = 3
  ioctl(3, TCGETS, {B38400 opost isig icanon echo ...}) = 0
  ioctl(3, KDGKBTYPE, 0x7ffdcdb0efa7) = 0
  ioctl(3, VT_ACTIVATE, 0x4) = 0
  ioctl(3, VT_WAITACTIVE, 0x4

  VT_ACTIVATE will cause a switch to VT number.
  VT_WAITACTIVE will sleep/wait until the specified VT has been activated.

  $ sudo cat /proc/$(pidof chvt)/stack
  [<0>] __vt_event_wait.isra.2.part.3+0x40/0x90
  [<0>] vt_waitactive+0x80/0xd0
  [<0>] vt_ioctl+0xd26/0x1140
  [<0>] tty_ioctl+0xf6/0x8c0
  [<0>] do_vfs_ioctl+0xa8/0x630
  [<0>] SyS_ioctl+0x79/0x90
  [<0>] do_syscall_64+0x73/0x130
  [<0>] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
  [<0>] 0xffffffffffffffff
- 
- Enable debuglogs doesn't provide additional details.
- 
- As soon as auto-login is turned off, chvt is back to normal.
- 
- The above has been reproduced on Ubuntu:
- 
- Ubuntu Bionic w/ gdm3 3.28.3 & kbd 2.0.4

-- 
You received this bug notification because you are a member of Ubuntu Desktop Bugs, which is subscribed to gdm3 in Ubuntu.
https://bugs.launchpad.net/bugs/1817738

Title:
  Can't change virtual terminal on login screen or when auto-login is enabled