Launchpad has imported 9 comments from the remote bug at https://bugs.freedesktop.org/show_bug.cgi?id=109695.
If you reply to an imported comment from within Launchpad, your comment
will be sent to the remote bug automatically. Read more about
Launchpad's inter-bugtracker facilities at
https://help.launchpad.net/InterBugTracking.

------------------------------------------------------------------------
On 2019-02-20T20:12:32+00:00 Ahzo wrote:

Since upgrading Mesa from 18.2 to 18.3, launching a QEMU virtual machine
with Spice OpenGL enabled (for virgl), causes QEMU to crash with SIGSYS
inside the radeonsi driver.
The reason for this is that the QEMU sandbox option
'resourcecontrol=deny' disables the sched_setaffinity syscall called in
pthread_setaffinity_np, which is now used by the radeonsi driver.

A simple way to reproduce this problem is:
$ gdb --batch --ex run --ex bt --args qemu-system-x86_64 -spice gl=on -sandbox on,resourcecontrol=deny
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff45aa700 (LWP 23432)]
[New Thread 0x7ffff08e5700 (LWP 23433)]
[New Thread 0x7fffe3fff700 (LWP 23434)]
[New Thread 0x7fffe37fe700 (LWP 23435)]

Thread 4 "qemu-system-x86" received signal SIGSYS, Bad system call.
[Switching to Thread 0x7fffe3fff700 (LWP 23434)]
0x00007ffff68cc9cf in __pthread_setaffinity_new (th=<optimized out>, cpusetsize=cpusetsize@entry=128, cpuset=cpuset@entry=0x7fffe3ffe680) at ../sysdeps/unix/sysv/linux/pthread_setaffinity.c:34
34	../sysdeps/unix/sysv/linux/pthread_setaffinity.c: No such file or directory.
#0  0x00007ffff68cc9cf in __pthread_setaffinity_new (th=<optimized out>, cpusetsize=cpusetsize@entry=128, cpuset=cpuset@entry=0x7fffe3ffe680) at ../sysdeps/unix/sysv/linux/pthread_setaffinity.c:34
#1  0x00007ffff12ba2b3 in util_queue_thread_func (input=input@entry=0x55555640b1f0) at ../src/util/u_queue.c:252
#2  0x00007ffff12b9c17 in impl_thrd_routine (p=<optimized out>) at ../src/../include/c11/threads_posix.h:87
#3  0x00007ffff68c1fa3 in start_thread (arg=<optimized out>) at pthread_create.c:486
#4  0x00007ffff67f280f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

The problematic code at src/util/u_queue.c:252 was added in the
following commit:
commit d877451b48a59ab0f9a4210fc736f51da5851c9a
Author: Marek Olšák <marek.ol...@amd.com>
Date:   Mon Oct 1 15:51:06 2018 -0400

    util/u_queue: add UTIL_QUEUE_INIT_SET_FULL_THREAD_AFFINITY

    Initial version discussed with Rob Clark under a different patch name.
    This approach leaves his driver unaffected.


Since setting the thread affinity seems non-essential here, the failing
syscall should be handled gracefully, for example by setting a signal
handler to ignore the SIGSYS signal.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1815889/comments/12

------------------------------------------------------------------------
On 2019-02-20T21:30:42+00:00 Marek Olšák wrote:

Mesa needs a way to query that it can't set thread affinity.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1815889/comments/13

------------------------------------------------------------------------
On 2019-02-21T17:15:56+00:00 Ahzo wrote:

To check for the availability of the syscall, one can try it in a child
process and see if the child is terminated by a signal, e.g. like this:

#include <stdbool.h>
#include <unistd.h>
#include <sys/resource.h>
#include <sys/syscall.h>
#include <sys/wait.h>

static bool can_set_affinity()
{
    pid_t pid = fork();
    int status = 0;
    if (!pid) {
        /* Disable coredumps, because a SIGSYS crash is expected.
         */
        struct rlimit limit = { 0 };
        limit.rlim_cur = 1;
        limit.rlim_max = 1;
        setrlimit(RLIMIT_CORE, &limit);
        /* Test the syscall in the child process. */
        syscall(SYS_sched_setaffinity, 0, 0, 0);
        _exit(0);
    } else if (pid < 0) {
        return false;
    }
    if (waitpid(pid, &status, 0) < 0) {
        return false;
    }
    if (WIFSIGNALED(status)) {
        /* The child process was terminated by a signal,
         * thus the syscall cannot be used.
         */
        return false;
    }
    return true;
}

Reply at:
https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1815889/comments/14

------------------------------------------------------------------------
On 2019-02-27T14:42:21+00:00 Dan-freedesktop wrote:

(In reply to Ahzo from comment #2)
> To check for the availability of the syscall, one can try it in a child
> process and see if the child is terminated by a signal, e.g. like this:

Afraid not, QEMU's seccomp filter blocks use of fork() too :-)

Reply at:
https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1815889/comments/27

------------------------------------------------------------------------
On 2019-02-27T14:45:24+00:00 Dan-freedesktop wrote:

(In reply to Ahzo from comment #0)
> The problematic code at src/util/u_queue.c:252 was added in the following
> commit:
> commit d877451b48a59ab0f9a4210fc736f51da5851c9a
> Author: Marek Olšák <marek.ol...@amd.com>
> Date:   Mon Oct 1 15:51:06 2018 -0400
>
>     util/u_queue: add UTIL_QUEUE_INIT_SET_FULL_THREAD_AFFINITY
>
>     Initial version discussed with Rob Clark under a different patch name.
>     This approach leaves his driver unaffected.
>
>
> Since setting the thread affinity seems non-essential here, the failing
> syscall should be handled gracefully, for example by setting a signal
> handler to ignore the SIGSYS signal.

I'm curious what motivated this change to start with ? Even if QEMU was
not enforcing seccomp filters, I think I'd consider it a bug for mesa to
be setting its process affinity in this way.
The mgmt application or sysadmin has decided that the process must have
a certain affinity, based on how it/they want the host CPUs utilized.
Why is mesa wanting to override this administrative policy decision to
restrict CPU usage ?

Reply at:
https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1815889/comments/28

------------------------------------------------------------------------
On 2019-02-27T14:54:09+00:00 Alexdeucher wrote:

(In reply to Daniel P. Berrange from comment #4)
>
> I'm curious what motivated this change to start with ? Even if QEMU was not
> enforcing seccomp filters, I think I'd consider it a bug for mesa to be
> setting its process affinity in this way. The mgmt application or sysadmin
> has decided that the process must have a certain affinity, based on how
> it/they want the host CPUs utilized. Why is mesa wanting to override this
> administrative policy decision to restrict CPU usage ?

To improve performance on modern multi-core NUMA architectures.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1815889/comments/29

------------------------------------------------------------------------
On 2019-02-27T23:14:50+00:00 elmarco wrote:

Sent a quick RFC for an env variable workaround on the ML "[PATCH] RFC:
Workaround for pthread_setaffinity_np() seccomp filtering".

Reply at:
https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1815889/comments/30

------------------------------------------------------------------------
On 2019-02-28T00:15:48+00:00 Marek Olšák wrote:

(In reply to Daniel P. Berrange from comment #4)
> I'm curious what motivated this change to start with ? Even if QEMU was not
> enforcing seccomp filters, I think I'd consider it a bug for mesa to be
> setting its process affinity in this way. The mgmt application or sysadmin
> has decided that the process must have a certain affinity, based on how
> it/they want the host CPUs utilized.
> Why is mesa wanting to override this
> administrative policy decision to restrict CPU usage ?

The correct solution is to fix pthread_setaffinity such that it returns
an error code instead of crashing.

An even better solution would be to have a virtual thread affinity that
only the application can see and change, which should be silently masked
by administrative policies not visible to the application.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1815889/comments/31

------------------------------------------------------------------------
On 2019-02-28T10:21:44+00:00 Michel Dänzer wrote:

(In reply to Marek Olšák from comment #7)
> An even better solution would be to have a virtual thread affinity that only
> the application can see and change, which should be silently masked by
> administrative policies not visible to the application.

Mesa doesn't really need explicit thread affinity at all. All it wants
is that certain sets of threads run on the same CPU module; it doesn't
care which particular CPU module that is.

What's really needed is an API to express this affinity between threads,
instead of to specific CPU cores.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1815889/comments/34


** Changed in: mesa
       Status: Unknown => Confirmed

** Changed in: mesa
   Importance: Unknown => High

-- 
You received this bug notification because you are a member of
qemu-devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1815889

Title:
  qemu-system-x86_64 crashed with signal 31 in
  __pthread_setaffinity_new()

Status in Mesa:
  Confirmed
Status in QEMU:
  New
Status in mesa package in Ubuntu:
  New
Status in qemu package in Ubuntu:
  Triaged

Bug description:
  Unable to launch Default Fedora 29 images in gnome-boxes

  ProblemType: Crash
  DistroRelease: Ubuntu 19.04
  Package: qemu-system-x86 1:3.1+dfsg-2ubuntu1
  ProcVersionSignature: Ubuntu 4.19.0-12.13-generic 4.19.18
  Uname: Linux 4.19.0-12-generic x86_64
  ApportVersion: 2.20.10-0ubuntu20
  Architecture: amd64
  Date: Thu Feb 14 11:00:45 2019
  ExecutablePath: /usr/bin/qemu-system-x86_64
  KvmCmdLine: COMMAND STAT EUID RUID PID PPID %CPU COMMAND
  MachineType: Dell Inc. Precision T3610
  ProcEnviron: PATH=(custom, user)
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.19.0-12-generic root=UUID=939b509b-d627-4642-a655-979b44972d17 ro splash quiet vt.handoff=1
  Signal: 31
  SourcePackage: qemu
  StacktraceTop:
   __pthread_setaffinity_new (th=<optimized out>, cpusetsize=128, cpuset=0x7f5771fbf680) at ../sysdeps/unix/sysv/linux/pthread_setaffinity.c:34
   () at /usr/lib/x86_64-linux-gnu/dri/radeonsi_dri.so
   () at /usr/lib/x86_64-linux-gnu/dri/radeonsi_dri.so
   start_thread (arg=<optimized out>) at pthread_create.c:486
   clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
  Title: qemu-system-x86_64 crashed with signal 31 in __pthread_setaffinity_new()
  UpgradeStatus: Upgraded to disco on 2018-11-14 (91 days ago)
  UserGroups: adm cdrom dip lpadmin plugdev sambashare sudo video
  dmi.bios.date: 11/14/2018
  dmi.bios.vendor: Dell Inc.
  dmi.bios.version: A18
  dmi.board.name: 09M8Y8
  dmi.board.vendor: Dell Inc.
  dmi.board.version: A01
  dmi.chassis.type: 7
  dmi.chassis.vendor: Dell Inc.
  dmi.modalias: dmi:bvnDellInc.:bvrA18:bd11/14/2018:svnDellInc.:pnPrecisionT3610:pvr00:rvnDellInc.:rn09M8Y8:rvrA01:cvnDellInc.:ct7:cvr:
  dmi.product.name: Precision T3610
  dmi.product.sku: 05D2
  dmi.product.version: 00
  dmi.sys.vendor: Dell Inc.
To manage notifications about this bug go to: https://bugs.launchpad.net/mesa/+bug/1815889/+subscriptions
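
Note on the "return an error code instead of crashing" idea raised in the thread: the following is only a minimal sketch of what graceful handling could look like on the caller's side, not Mesa's actual fix. The helper name try_set_thread_affinity is hypothetical. It assumes the sandbox reports the blocked syscall as an ordinary error (for example EPERM); if the filter raises SIGSYS, as in the backtrace in this report, control never returns to the caller and this check cannot help.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper (not Mesa code): try to restrict the calling
 * thread to the CPUs in `cpuset`. Returns true on success. On failure
 * it stores the errno value in *err_out (if non-NULL) and returns
 * false, so the caller can simply keep the default affinity.
 *
 * Assumption: a blocked syscall fails with an error code. A filter
 * that delivers SIGSYS instead terminates the thread before this
 * function can return.
 */
static bool try_set_thread_affinity(const cpu_set_t *cpuset, int *err_out)
{
    assert(cpuset != NULL);

    /* A pid of 0 makes sched_setaffinity() act on the calling thread. */
    if (sched_setaffinity(0, sizeof(*cpuset), cpuset) == 0) {
        if (err_out != NULL)
            *err_out = 0;
        return true;
    }

    if (err_out != NULL)
        *err_out = errno;
    return false;
}
```

A caller would treat a false return as "affinity unavailable", skip the optimisation, and continue running the thread unpinned.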