A day of debugging gave more insight. Apport is not directly the culprit: if reading the core dump from stdin takes too long, the input is truncated. I attached a patch for apport that writes the core dump first and then quits. The patched apport can be slowed down by sleeping between the 1 MB blocks, and the longer the sleep, the smaller the resulting file. In my VM, sleeping for 0.001 s per block truncates the file slightly, to ~400 MB; sleeping for 0.005 s truncates it to 42 MB. I verified that /proc/<pid> is still there afterwards.
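The slowed-down read described above could look roughly like the following. This is only a minimal sketch, not the attached patch: the function name `copy_core`, its parameters, and the idea of passing the delay as an argument are assumptions made for illustration.

```python
import time

BLOCK_SIZE = 1024 * 1024  # 1 MB blocks, matching the block size mentioned above


def copy_core(src, dst, delay=0.0, block_size=BLOCK_SIZE):
    """Copy a core dump from src to dst block by block.

    Sleeps `delay` seconds after every block to simulate a slow reader.
    Returns the number of bytes written.
    """
    written = 0
    while True:
        block = src.read(block_size)
        if not block:
            # EOF: either the dump is complete or the writing side gave up early.
            break
        dst.write(block)
        written += len(block)
        if delay:
            time.sleep(delay)
    return written
```

In a pipe handler, `src` would be `sys.stdin.buffer` and `dst` a file opened in binary mode. Comparing the returned byte count for delays such as 0.001 and 0.005 against an undelayed run would show how much of the dump was lost.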
So I would say that the kernel team should have a look next.

** Patch added: "0001-apport-Write-coredump-at-beginning-and-quit.patch"
   https://bugs.launchpad.net/ubuntu/+source/apport/+bug/2015857/+attachment/5665206/+files/0001-apport-Write-coredump-at-beginning-and-quit.patch

** Also affects: linux (Ubuntu)
   Importance: Undecided
       Status: New

** Description changed:

  Repeatedly unusable truncated crash files:
  Bug 2012974, bug 2015842, bug 2015140, bug 2012075

  Based on my own testing, the problem seems to happen if multiple
  binaries crash simultaneously (like at logout). One crash file gets
  fully written and the other is incomplete.

  If I remove apport from the equation and just get the kernel to dump
  core files then the core files are always written reliably (should be
  several hundred MB in the case of gnome-shell).

  Running `apport-retrace -g $crash` on the .crash files will fail with
  "/tmp/apport_core_[...] is not a core dump: file format not recognized"
+
+ Steps to reproduce
+ ==================
+
+ 1. Use Ubuntu 23.04 (lunar) desktop with GNOME shell
+
+ 2. Downgrade gjs and libgjs0g to 1.76.0-1 (for triggering bug 1974293)
+
+ 3. Remove all crashes: `sudo rm -f /var/crash/*`
+
+ 4. Interact with the desktop, icon grid and calendar for 30 seconds.
+
+ 5. Log out.
+
+ 6. Crash files are written.
+
+ To test the core file, either use `apport-unpack` on the
+ /var/crash/*.crash file to extract the CoreDump file or use
+ `apport-retrace -g` on the .crash file. It will fail to print the
+ backtrace:
+
+ ```
+ $ apport-retrace -g /var/crash/_usr_bin_gnome-shell.1000.crash
+ [...]
+ Warnung: Error reading shared library list entry at 0x646c747200000000
+ Failed to read a valid object file image from memory.
+ Core was generated by `/usr/bin/gnome-shell'.
+ Program terminated with signal SIGSEGV, Segmentation fault.
+ Warnung: Section `.reg-xstate/10527' in core file too small.
+ (gdb) bt
+ #0 0x00007fe1c8090ffb in ?? ()
+ Backtrace stopped: Cannot access memory at address 0x7ffd59a40fe0
+ ```

** Description changed:

  Repeatedly unusable truncated crash files:
  Bug 2012974, bug 2015842, bug 2015140, bug 2012075

  Based on my own testing, the problem seems to happen if multiple
  binaries crash simultaneously (like at logout). One crash file gets
  fully written and the other is incomplete.

  If I remove apport from the equation and just get the kernel to dump
  core files then the core files are always written reliably (should be
  several hundred MB in the case of gnome-shell).

  Running `apport-retrace -g $crash` on the .crash files will fail with
  "/tmp/apport_core_[...] is not a core dump: file format not recognized"

  Steps to reproduce
  ==================

  1. Use Ubuntu 23.04 (lunar) desktop with GNOME shell

  2. Downgrade gjs and libgjs0g to 1.76.0-1 (for triggering bug 1974293)

  3. Remove all crashes: `sudo rm -f /var/crash/*`

  4. Interact with the desktop, icon grid and calendar for 30 seconds.

  5. Log out.

  6. Crash files are written.

  To test the core file, either use `apport-unpack` on the
  /var/crash/*.crash file to extract the CoreDump file or use
  `apport-retrace -g` on the .crash file. It will fail to print the
  backtrace:

  ```
  $ apport-retrace -g /var/crash/_usr_bin_gnome-shell.1000.crash
  [...]
  Warnung: Error reading shared library list entry at 0x646c747200000000
  Failed to read a valid object file image from memory.
  Core was generated by `/usr/bin/gnome-shell'.
  Program terminated with signal SIGSEGV, Segmentation fault.
  Warnung: Section `.reg-xstate/10527' in core file too small.
  (gdb) bt
  #0 0x00007fe1c8090ffb in ?? ()
  Backtrace stopped: Cannot access memory at address 0x7ffd59a40fe0
  ```
+
+ You can use the 0001-apport-Write-coredump-at-beginning-and-quit.patch
+ on apport to only write the coredump and adjust the slowdown to
+ reproduce the behavior on log-out 100% of the time.

--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2015857

Title:
  Repeatedly unusable truncated crash files (on log-out)

Status in apport package in Ubuntu:
  Triaged
Status in linux package in Ubuntu:
  Incomplete

Bug description:
  Repeatedly unusable truncated crash files:
  Bug 2012974, bug 2015842, bug 2015140, bug 2012075

  Based on my own testing, the problem seems to happen if multiple
  binaries crash simultaneously (like at logout). One crash file gets
  fully written and the other is incomplete.

  If I remove apport from the equation and just get the kernel to dump
  core files then the core files are always written reliably (should be
  several hundred MB in the case of gnome-shell).

  Running `apport-retrace -g $crash` on the .crash files will fail with
  "/tmp/apport_core_[...] is not a core dump: file format not recognized"

  Steps to reproduce
  ==================

  1. Use Ubuntu 23.04 (lunar) desktop with GNOME shell

  2. Downgrade gjs and libgjs0g to 1.76.0-1 (for triggering bug 1974293)

  3. Remove all crashes: `sudo rm -f /var/crash/*`

  4. Interact with the desktop, icon grid and calendar for 30 seconds.

  5. Log out.

  6. Crash files are written.

  To test the core file, either use `apport-unpack` on the
  /var/crash/*.crash file to extract the CoreDump file or use
  `apport-retrace -g` on the .crash file. It will fail to print the
  backtrace:

  ```
  $ apport-retrace -g /var/crash/_usr_bin_gnome-shell.1000.crash
  [...]
  Warnung: Error reading shared library list entry at 0x646c747200000000
  Failed to read a valid object file image from memory.
  Core was generated by `/usr/bin/gnome-shell'.
  Program terminated with signal SIGSEGV, Segmentation fault.
  Warnung: Section `.reg-xstate/10527' in core file too small.
  (gdb) bt
  #0 0x00007fe1c8090ffb in ?? ()
  Backtrace stopped: Cannot access memory at address 0x7ffd59a40fe0
  ```

  You can use the 0001-apport-Write-coredump-at-beginning-and-quit.patch
  on apport to only write the coredump and adjust the slowdown to
  reproduce the behavior on log-out 100% of the time.
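As a quicker check than a full `apport-retrace -g` run, the extracted CoreDump file can be compared against the size its own ELF program headers promise. The sketch below is an assumption-laden illustration, not something apport or gdb ships: the helper names `expected_core_size` and `is_truncated` are hypothetical, and it handles only little-endian ELF64 cores with fewer than 65535 segments.

```python
import os
import struct

# ELF64 file header and program header layouts (little-endian).
ELF_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")
PROGRAM_HEADER = struct.Struct("<IIQQQQQQ")
ET_CORE = 4  # e_type value for core files


def expected_core_size(path):
    """Return the minimum file size implied by the core's program headers."""
    with open(path, "rb") as core:
        raw = core.read(ELF_HEADER.size)
        if len(raw) < ELF_HEADER.size or raw[:4] != b"\x7fELF":
            raise ValueError("not an ELF file")
        if raw[4] != 2 or raw[5] != 1:
            raise ValueError("only little-endian ELF64 is handled in this sketch")
        fields = ELF_HEADER.unpack(raw)
        e_type, e_phoff, e_phentsize, e_phnum = (
            fields[1], fields[5], fields[9], fields[10])
        if e_type != ET_CORE:
            raise ValueError("not a core file")
        # The program header table itself must fit into the file.
        end = e_phoff + e_phentsize * e_phnum
        for index in range(e_phnum):
            core.seek(e_phoff + index * e_phentsize)
            entry = core.read(PROGRAM_HEADER.size)
            if len(entry) < PROGRAM_HEADER.size:
                break  # the header table is already cut off
            _, _, p_offset, _, _, p_filesz, _, _ = PROGRAM_HEADER.unpack(entry)
            end = max(end, p_offset + p_filesz)
    return end


def is_truncated(path):
    """True if the file is shorter than its program headers require."""
    return os.path.getsize(path) < expected_core_size(path)
```

Calling `is_truncated()` on the CoreDump file produced by `apport-unpack` would then flag a dump that ends before its last segment, without starting gdb.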