retitle 924891 glibc: misc/tst-pkey fails due to cleared PKRU register after signal in amd64 32-bit compat mode
thanks

* Lucas Nussbaum:

> On 27/03/19 at 08:48 +0100, Florian Weimer wrote:
>> > If that's useful, I can easily provide access to an AWS VM to debug this
>> > issue.
>>
>> Oh, that would be quite helpful indeed.
>
> Can you send your SSH key? (I thought there was a way to get the SSH key
> for a DD, but I cannot find it anymore)
>
> Then you will be able to ssh to root@18.184.55.40.
> There's sbuild and schroot setup on the VM.
>
> When you are done, please 'poweroff' the machine, which will terminate
> it.

The issue reproduces outside the chroot, with the stretch userland.

What happens is that once we get out of the SIGUSR1 signal handler, the
PKRU register has value zero. This happens around this code in the test:

  /* Check that in a signal handler, there is no access. */
  xsignal (SIGUSR1, &sigusr1_handler);
  xraise (SIGUSR1);
  xsignal (SIGUSR1, SIG_DFL);
  TEST_COMPARE (sigusr1_handler_ran, 1);

I checked the following (via a breakpoint in pkey_get; I don't think GDB
can read the PKRU register directly): Inside the SIGUSR1 signal handler,
PKRU has value 0x55555554, as expected for this kernel, but after the
return, we get zero.

This is the first time a signal is delivered on the main thread, so it's
consistent with fairly broken signal handling as far as the PKRU
register is concerned. I guess clearing PKRU in this way might even
constitute a minor security bug (because the zero value means no
restrictions).

This commit looks highly relevant:

commit a4455082dc6f0b5d51a23523f77600e8ede47c79
Author: Dave Hansen <dave.han...@linux.intel.com>
Date:   Wed Jun 8 10:25:33 2016 -0700

    x86/signals: Add missing signal_compat code for x86 features

    The 32-bit siginfo is a different binary format than the 64-bit one.
    So, when running 32-bit binaries on 64-bit kernels, we have to convert
    the kernel's 64-bit version to a 32-bit version that userspace can grok.

If the siginfo_t layout is incorrect (with regard to what the hardware
writes), I expect that we might end up copying back the wrong PKRU
value.

I'm not sure what to do here. This really looks like a kernel bug.
Maybe we should just verify that this is fixed in the buster kernel and
move on?

Lucas, can you run your rebuild tests on newer kernels?
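
For reference, here is a rough sketch of how the two PKRU values above decode, assuming the usual x86 layout of two bits per protection key (bit 2*key disables access, bit 2*key+1 disables writes). The sketch_* names are made up for this illustration and are not part of glibc or the test; the bit values merely mirror PKEY_DISABLE_ACCESS and PKEY_DISABLE_WRITE.

```c
#include <assert.h>

/* Meaning of the two bits in each per-key PKRU field.  The values
   mirror PKEY_DISABLE_ACCESS and PKEY_DISABLE_WRITE; the names are
   local to this sketch.  */
#define SKETCH_DISABLE_ACCESS 0x1u
#define SKETCH_DISABLE_WRITE  0x2u

/* Number of protection keys covered by the 32-bit PKRU register.  */
#define SKETCH_KEY_COUNT 16

/* Return the 2-bit rights field for KEY in the given PKRU value.
   A result of zero means the key imposes no restrictions.  */
unsigned int
sketch_pkru_rights (unsigned int pkru, int key)
{
  assert (key >= 0 && key < SKETCH_KEY_COUNT);
  return (pkru >> (2 * key)) & 0x3u;
}

/* Count the keys that carry any restriction (access or write).  */
int
sketch_pkru_count_restricted (unsigned int pkru)
{
  int count = 0;
  for (int key = 0; key < SKETCH_KEY_COUNT; ++key)
    if (sketch_pkru_rights (pkru, key) != 0)
      ++count;
  return count;
}
```

Under that assumption, 0x55555554 (the value seen inside the handler) leaves key 0 unrestricted and disables access for keys 1 through 15, while zero (the value seen after the handler returns) leaves every key unrestricted, which is why the cleared register looks like a loss of protection.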