On 09/30/16 18:38, Hans de Goede wrote:
> Hi,
>
> On 30-09-16 17:33, Laszlo Ersek wrote:
>> On 09/30/16 16:59, Hans de Goede wrote:
>>> Hi,
>>>
>>> On 30-09-16 16:51, Laszlo Ersek wrote:
>>>> On 09/30/16 12:35, Hans de Goede wrote:
>>>>
>>>>> Attached are 2 patches against the xserver which should fix this,
>>>>> please give them a try.
>>>>
>>>> Sorry about the delay.
>>>>
>>>> The patches don't seem to fix the issue for me. Please see the Xorg log
>>>> attached.
>>>>
>>>> I tested the patches as follows. Given that my bisection had been done
>>>> in a Fedora 24 guest, using
>>>>
>>>>   xorg-x11-server-1.18.4-4.fc24
>>>>   http://koji.fedoraproject.org/koji/buildinfo?buildID=794494
>>>>
>>>> I now rebuilt the guest kernel exactly at the failing commit (a325725
>>>> "drm: Lobotomize set_busid nonsense for !pci drivers"), and first
>>>> reproduced the issue with the above X server.
>>>>
>>>> Then, I ported your patches to "xorg-server-1.18.4" (using the upstream
>>>> xserver tree), and rebuilt the Fedora package with the backport. For
>>>> the backport, I had to cherry-pick the following two patches from
>>>> master first:
>>>>
>>>> 1 ca8d88e50310 xfree86: recognize primary BUS_PCI device in
>>>>   xf86IsPrimaryPlatform()
>>>> 2 ea91db4b8331 config: fix GPUDevice fail when AutoAddGPU off + BusID
>>>>
>>>> This way your patches applied cleanly. (Cherry pick #1 above is
>>>> actually necessary for semantics, while cherry pick #2 is needed for a
>>>> clean context only, and has no impact for this test.)
>>>>
>>>> That is, in total, I added the following four patches to the Fedora 24
>>>> package:
>>>>
>>>> 1 xfree86: recognize primary BUS_PCI device in xf86IsPrimaryPlatform()
>>>> 2 config: fix GPUDevice fail when AutoAddGPU off + BusID
>>>> 3 xfree86: Make adding unclaimed devices as GPU devices a separate step
>>>> 4 xfree86: Try harder to find atleast 1 non GPU Screen
>>>>
>>>> You can find the scratch build that I used for testing here:
>>>>
>>>>   xorg-x11-server-1.18.4-4.hans_bz1366842_2.fc24
>>>>   http://koji.fedoraproject.org/koji/taskinfo?taskID=15875087
>>>>
>>>> Another reason I used F24's X server as basis, rather than upstream
>>>> HEAD, is that Fedora 24 is pretty young, and it's already on kernel
>>>> 4.7.4, and I believe it will soon move to kernel 4.8, without
>>>> (necessarily) rebasing its X server package to upstream. IOW the kernel
>>>> upgrade to 4.8 will break X in Fedora 24 too, and then I expect the
>>>> Fedora X maintainers would have to cherry pick those two patches as
>>>> dependencies just the same.
>>>>
>>>> To summarize, the patches don't seem to help. I shall nonetheless thank
>>>> you for spending your Friday on this!
>>>
>>> Hmm, do you have an xorg.conf file lying around somewhere, the message
>>> about the xserver not being able to find an entry for screen 0 does
>>> not make sense ...
>>
>> Good catch, I actually had two files under "/etc/X11/xorg.conf.d/":
>>
>> * "00-keyboard.conf", from package "systemd-229-13.fc24.x86_64", with
>>   contents
>>
>> ------------
>> # Read and parsed by systemd-localed. It's probably wise not to edit this file
>> # manually too freely.
>> Section "InputClass"
>>         Identifier "system-keyboard"
>>         MatchIsKeyboard "on"
>>         Option "XkbLayout" "us"
>> EndSection
>> ------------
>>
>> * "01-resolution.conf", which I had created, in order to set the
>>   preferred display resolution:
>>
>> ------------
>> Section "Screen"
>>         Identifier "Default Screen"
>>         Device "Default Device"
>>         Monitor "Default Monitor"
>> EndSection
>>
>> Section "Device"
>>         Identifier "Default Device"
>>         Driver "modesetting"
>> EndSection
>>
>> Section "Monitor"
>>         Identifier "Default Monitor"
>>         Option "PreferredMode" "640x480"
>> #       Option "PreferredMode" "1440x900"
>> EndSection
>> ------------
>>
>> I removed these files now, and repeated the test. Again, the X server
>> wouldn't start, but I think the log file looks a bit different now.
>> Attached.
>
> Ah, ok so it seems that my initial analysis is wrong, the problem
> is not a recurrence of the device getting identified as a GPU screen,
> libdrm sorta depends on bus-ids and the lack of one is causing the
> server to misbehave. I guess that even with an xorg.conf things
> will fail with the troublesome kernel version (might be worth
> trying).
>
> Emil's analysis seems to be spot on. This does not seem easily
> fixable in userspace / does seem like a real regression as it
> even breaks things when specifying the device through xorg.conf
> (or so I believe) which is something which used to work ...

In order to check this hypothesis, I did the following:

- I downgraded my xorg-x11-server installation to the most recent
  official F24 packages, that is, "1.18.4-4.fc24",

- I kept the kernel that I built exactly at the regressive commit
  (a325725633c2)

- I modified "01-resolution.conf" (see it above in the context) like
  this:

  ----
  Section "Device"
          Identifier "Default Device"
          Driver "modesetting"
          BusID "PCI:00:02:0"     <------------ new option added
  EndSection
  ----

  where BusID matches the B/D/F of the virtio-vga device from "lspci".

This setup (modulo the kernel of course) was known to work, but now the X
server actually segfaults (apparently in the
xf86PlatformDeviceCheckBusID() function). Please find the logfile
attached.

(NB: this is unrelated to upstream commit de9ce6757c2e -- which the
pristine FC24 build lacks -- because I don't set AutoAddGPU to "off" -- it
is left at its default "on" value.)

Therefore, you are right. :)

Thanks
Laszlo

> I made the mistake of thinking the kernel change was re-triggering
> the old problem Laszlo fixed, but that does not seem to be the
> case.
>
> Regards,
>
> Hans

-------------- next part --------------
A non-text attachment was scrubbed...
Name: Xorg.0.log
Type: text/x-log
Size: 4610 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/dri-devel/attachments/20160930/920def26/attachment-0001.bin>
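
P.S. For anyone repeating the BusID test on a different device: a minimal sketch of turning an lspci slot into the string for the BusID option. It assumes PCI domain 0 and the plain "PCI:bus:device:function" form. It also assumes that lspci prints hexadecimal while xorg.conf expects decimal; for 00:02.0 the two happen to coincide. The helper name lspci_slot_to_busid is made up for illustration, it is not part of any X or pciutils tooling.

```python
import re

# Matches lspci slots such as "00:02.0" or "0000:00:02.0" (hex fields).
_SLOT_RE = re.compile(
    r"(?:([0-9a-fA-F]{4}):)?([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.([0-7])"
)


def lspci_slot_to_busid(slot: str) -> str:
    """Convert an lspci slot to an xorg.conf style "PCI:b:d:f" string.

    Hypothetical helper. Assumption: the PCI domain is 0, so it is
    parsed but not emitted in the result.
    """
    match = _SLOT_RE.fullmatch(slot.strip())
    if match is None:
        raise ValueError(f"not an lspci slot: {slot!r}")
    _domain, bus, dev, func = match.groups()
    # lspci fields are hexadecimal; emit decimal numbers for BusID.
    return f"PCI:{int(bus, 16)}:{int(dev, 16)}:{int(func, 16)}"


if __name__ == "__main__":
    # The virtio-vga device from this thread sits at 00:02.0.
    print(lspci_slot_to_busid("00:02.0"))
```

The hex to decimal step only matters once the bus or device number reaches 0a or above, which is why "PCI:00:02:0" in my test matches the lspci output digit for digit.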