Re: Panic on boot after upgrade from r320827 -> r320869
On 07/11/17 19:53, Michael Butler wrote:
> On 07/11/17 13:13, I wrote:
>> Take sdhci out of the kernel and try again. If that works, it tells
>> us one thing (need to troubleshoot sdhci stuff more). If not it tells
>> us another (need to troubleshoot CAM more), do we get errors with the
>> ATA_IDENTIFY command? Does it try multiple times per AHCI port? What
>> AHCI device do you have? You may need to scroll back with the
>> screen-lock / pageup keys to see these messages.
>
> [ .. snip .. ]
>
> I'll try this tonight when I'm back at home. The laptop concerned uses
> the ICH-7M part in "legacy mode" so it doesn't do AHCI at all :-(
>
> Without sdhci and mmc, it actually boots but everything KDE aborts with
> signal 6 :-(
>
> I'm not prepared to rebuild the ~1900 ports on this box to pursue this
> further,

Something about SVN r320844 causes almost all KDE applications to fail on
a signal 6.

I've recompiled KDE and other components obviously dependent on kernel
structures (e.g. everything dbus-related). I still get core-files with a
back-trace that looks like:

(gdb) bt
#0 0x000804232f6a in thr_kill () from /lib/libc.so.7
#1 0x000804232f34 in raise () from /lib/libc.so.7
#2 0x000804232ea9 in abort () from /lib/libc.so.7
#3 0x0008188597af in ?? () from /usr/local/lib/libdbus-1.so.3
#4 0x00081884ef2c in _dbus_warn_check_failed () from /usr/local/lib/libdbus-1.so.3
#5 0x00081883f539 in dbus_message_new_method_call () from /usr/local/lib/libdbus-1.so.3
#6 0x000801bddfe8 in ?? () from /usr/local/lib/qt4/libQtDBus.so.4
#7 0x000801bd591e in ?? () from /usr/local/lib/qt4/libQtDBus.so.4
#8 0x000801bd9af6 in ?? () from /usr/local/lib/qt4/libQtDBus.so.4
#9 0x000801be656d in ?? () from /usr/local/lib/qt4/libQtDBus.so.4
#10 0x000801be6807 in QDBusInterface::QDBusInterface(QString const&, QString const&, QString const&, QDBusConnection const&, QObject*) () from /usr/local/lib/qt4/libQtDBus.so.4
#11 0x00080d12728e in ?? () from /usr/local/lib/libsolid.so.4
#12 0x00080d11e68c in ?? () from /usr/local/lib/libsolid.so.4
#13 0x00080d12a525 in ?? () from /usr/local/lib/libsolid.so.4
#14 0x00080d0e7aac in Solid::Device::listFromType(Solid::DeviceInterface::Type const&, QString const&) () from /usr/local/lib/libsolid.so.4
#15 0x00080e7e889a in ?? () from /usr/local/lib/libplasma.so.3
#16 0x00080e7e6094 in Plasma::RunnerManager::RunnerManager(QObject*) () from /usr/local/lib/libplasma.so.3
#17 0x0008172fab42 in ?? () from /usr/local/lib/libkdeinit4_krunner.so
#18 0x0008172fa9b4 in ?? () from /usr/local/lib/libkdeinit4_krunner.so
#19 0x0008172fd303 in kdemain () from /usr/local/lib/libkdeinit4_krunner.so
#20 0x0040a015 in ?? ()
#21 0x0040aec0 in ?? ()

SVN r320843 works, r320844 doesn't :-(

imb
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: Panic on boot after upgrade from r320827 -> r320869
On Sat, Jul 15, 2017 at 1:32 PM, Michael Butler wrote:

> On 07/11/17 19:53, Michael Butler wrote:
>
>> On 07/11/17 13:13, I wrote:
>>
>>> Take sdhci out of the kernel and try again. If that works, it tells
>>> us one thing (need to troubleshoot sdhci stuff more). If not it tells
>>> us another (need to troubleshoot CAM more), do we get errors with the
>>> ATA_IDENTIFY command? Does it try multiple times per AHCI port? What
>>> AHCI device do you have? You may need to scroll back with the
>>> screen-lock / pageup keys to see these messages.
>>
>> [ .. snip .. ]
>>
>> I'll try this tonight when I'm back at home. The laptop concerned uses
>> the ICH-7M part in "legacy mode" so it doesn't do AHCI at all :-(
>>
>> Without sdhci and mmc, it actually boots but everything KDE aborts with
>> signal 6 :-(
>>
>> I'm not prepared to rebuild the ~1900 ports on this box to pursue this
>> further,
>
> Something about SVN r320844 causes almost all KDE applications to fail on
> a signal 6.

I don't think that's possible, unless (a) your build hit the 'not
everything in the kernel rebuilt' bug or (b) KDE is issuing raw CAM
requests. Since I don't know KDE, don't run KDE or have any clue about KDE,
I can't help you trace it down further.

> I've recompiled KDE and other components obviously dependent on kernel
> structures (e.g. everything dbus-related). I still get core-files with a
> back-trace that looks like:
>
> (gdb) bt
> #0 0x000804232f6a in thr_kill () from /lib/libc.so.7
> #1 0x000804232f34 in raise () from /lib/libc.so.7
> #2 0x000804232ea9 in abort () from /lib/libc.so.7
> #3 0x0008188597af in ?? () from /usr/local/lib/libdbus-1.so.3
> #4 0x00081884ef2c in _dbus_warn_check_failed () from /usr/local/lib/libdbus-1.so.3
> #5 0x00081883f539 in dbus_message_new_method_call () from /usr/local/lib/libdbus-1.so.3
> #6 0x000801bddfe8 in ?? () from /usr/local/lib/qt4/libQtDBus.so.4
> #7 0x000801bd591e in ?? () from /usr/local/lib/qt4/libQtDBus.so.4
> #8 0x000801bd9af6 in ?? () from /usr/local/lib/qt4/libQtDBus.so.4
> #9 0x000801be656d in ?? () from /usr/local/lib/qt4/libQtDBus.so.4
> #10 0x000801be6807 in QDBusInterface::QDBusInterface(QString const&, QString const&, QString const&, QDBusConnection const&, QObject*) () from /usr/local/lib/qt4/libQtDBus.so.4
> #11 0x00080d12728e in ?? () from /usr/local/lib/libsolid.so.4
> #12 0x00080d11e68c in ?? () from /usr/local/lib/libsolid.so.4
> #13 0x00080d12a525 in ?? () from /usr/local/lib/libsolid.so.4
> #14 0x00080d0e7aac in Solid::Device::listFromType(Solid::DeviceInterface::Type const&, QString const&) () from /usr/local/lib/libsolid.so.4
> #15 0x00080e7e889a in ?? () from /usr/local/lib/libplasma.so.3
> #16 0x00080e7e6094 in Plasma::RunnerManager::RunnerManager(QObject*) () from /usr/local/lib/libplasma.so.3
> #17 0x0008172fab42 in ?? () from /usr/local/lib/libkdeinit4_krunner.so
> #18 0x0008172fa9b4 in ?? () from /usr/local/lib/libkdeinit4_krunner.so
> #19 0x0008172fd303 in kdemain () from /usr/local/lib/libkdeinit4_krunner.so
> #20 0x0040a015 in ?? ()
> #21 0x0040aec0 in ?? ()
>
> SVN r320843 works, r320844 doesn't :-(

I'd look to see if any of that software uses a CAM CCB for any reason.
That's the only thing I can think of that might have been affected. Perhaps
it's doing an identify? There was one CCB that changed size (and did so in
an incompatible way between rev 320844 and 320878), I didn't think it was
user visible. Does camcontrol identify or camcontrol inquiry work?

Warner
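Warner's guess in the message above is that one CCB changed size in a way
that userland might see. As a rough illustration of why that kind of
mismatch matters, here is a minimal Python sketch using ctypes. OldRequest
and NewRequest are hypothetical stand-ins, not the real CAM structures, and
the field names and sizes are invented for the example. It only shows the
general mechanism: when a member grows, the total size and the offsets of
later members change, so code built against the old layout reads the wrong
bytes.

```python
import ctypes


class OldRequest(ctypes.Structure):
    # Hypothetical "before" layout: header, 8-byte data area, status.
    _fields_ = [
        ("header", ctypes.c_uint32),
        ("data", ctypes.c_uint8 * 8),
        ("status", ctypes.c_uint32),
    ]


class NewRequest(ctypes.Structure):
    # Hypothetical "after" layout: the data area grew, so status moved.
    _fields_ = [
        ("header", ctypes.c_uint32),
        ("data", ctypes.c_uint8 * 16),
        ("status", ctypes.c_uint32),
    ]


def layout_report():
    """Return the sizes and status offsets of both hypothetical layouts."""
    return {
        "old_size": ctypes.sizeof(OldRequest),
        "new_size": ctypes.sizeof(NewRequest),
        "old_status_offset": OldRequest.status.offset,
        "new_status_offset": NewRequest.status.offset,
    }


def read_status_with_old_layout(new_req):
    """Read 'status' from a NewRequest as if it still had the old layout.

    This mimics a binary compiled against the old definition: it only
    looks at the first sizeof(OldRequest) bytes and uses the old offset.
    """
    raw = bytes(new_req)[: ctypes.sizeof(OldRequest)]
    return OldRequest.from_buffer_copy(raw).status


if __name__ == "__main__":
    print(layout_report())
    req = NewRequest(header=1, status=0xDEADBEEF)
    print(hex(req.status), hex(read_status_with_old_layout(req)))
```

Whether anything in the KDE stack actually passes such a structure to the
kernel is exactly the open question in this thread; the sketch does not
answer it.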
Re: Panic on boot after upgrade from r320827 -> r320869
On 07/15/17 20:39, Mark Millard wrote:
> FYI for Michael B.: the incomplete kernel rebuild problem has a fix:
> -r320919 .
> See the fix (to the building problem that was created in -r320220 ):
>
> https://lists.freebsd.org/pipermail/svn-src-head/2017-July/102622.html
>
> If the KDE problem persists based on a -r320919 or later build, it would
> be appropriate to report it again as a separate issue.
>
> Unfortunately various odd problems have shown up over -r320220 through
> -r320918 from incorrect rebuilds (and other oddities overlapping in the
> time frame).
>
> Of course if you built (or build) -r320844 based on an empty directory in
> the first place so that it was a full-build but the KDE problem persisted
> when using the rebuilt kernel then the above material does not apply. In
> such a case reporting that about the context for the KDE problem would be
> appropriate.
>
> You may well have other things to be doing instead of what the above
> suggests. If so, just take the above as background information.

Prior to testing this, I did 'rm -rf /usr/obj/*' so it is a clean build. I
can run with user-land at SVN r321021 but any kernel at or after r320844
fails :-(

imb
Re: Panic on boot after upgrade from r320827 -> r320869
On Sat, Jul 15, 2017 at 6:49 PM, Michael Butler wrote:

> On 07/15/17 20:39, Mark Millard wrote:
>
>> FYI for Michael B.: the incomplete kernel rebuild problem has a fix:
>> -r320919 .
>> See the fix (to the building problem that was created in -r320220 ):
>>
>> https://lists.freebsd.org/pipermail/svn-src-head/2017-July/102622.html
>>
>> If the KDE problem persists based on a -r320919 or later build, it would
>> be appropriate to report it again as a separate issue.
>>
>> Unfortunately various odd problems have shown up over -r320220 through
>> -r320918 from incorrect rebuilds (and other oddities overlapping in the
>> time frame).
>>
>> Of course if you built (or build) -r320844 based on an empty directory in
>> the first place so that it was a full-build but the KDE problem persisted
>> when using the rebuilt kernel then the above material does not apply. In
>> such a case reporting that about the context for the KDE problem would be
>> appropriate.
>>
>> You may well have other things to be doing instead of what the above
>> suggests. If so, just take the above as background information.
>
> Prior to testing this, I did 'rm -rf /usr/obj/*' so it is a clean build. I
> can run with user-land at SVN r321021 but any kernel at or after r320844
> fails :-(

Right. We need to find out what, exactly, is failing to make progress. I
have exactly one guess as to what might be going on, and it's a long shot
at best. To gather more evidence, I need to know if the kde thing that's
segfaulting is accessing /dev/pass* or /dev/xpt*. If you can confirm that
it is, then we'll need to see how to fix that. Also, you'll need an
installworld as well as an installkernel so the new headers are installed
prior to running kde. If that fixes it, then my guess goes from a long shot
to close to a sure thing.

Warner
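One way to answer the /dev/pass* or /dev/xpt* question is to capture a
system-call trace of the failing program to a text file and search it for
those device nodes. The Python sketch below does only the search step. It
assumes nothing about the trace format beyond the device path appearing
literally in a line; the function name cam_device_hits and the sample lines
are made up for illustration and are not real trace output.

```python
import re

# Matches CAM passthrough and transport-layer device nodes such as
# /dev/pass0 or /dev/xpt0 anywhere in a line of text.
CAM_NODE = re.compile(r"/dev/(?:pass|xpt)\d+")


def cam_device_hits(trace_lines):
    """Return (line, [device paths]) for every line naming a CAM node."""
    hits = []
    for line in trace_lines:
        found = CAM_NODE.findall(line)
        if found:
            hits.append((line.rstrip("\n"), found))
    return hits


if __name__ == "__main__":
    # Hypothetical trace lines, only to exercise the filter.
    sample = [
        'open("/dev/pass0",O_RDWR)',
        'open("/etc/passwd",O_RDONLY)',
        'openat(AT_FDCWD,"/dev/xpt0",O_RDWR)',
    ]
    for line, devices in cam_device_hits(sample):
        print(devices, "in", line)
```

An empty result would suggest the program never touches the CAM device
nodes directly, which would argue against the CCB-size guess; a non-empty
one would point at the lines worth reading in full.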
Re: Panic on boot after upgrade from r320827 -> r320869
Warner Losh imp at bsdimp.com wrote on
Sat Jul 15 23:22:22 UTC 2017 :

> On Sat, Jul 15, 2017 at 1:32 PM, Michael Butler protected-networks.net>
> wrote:
>
> > On 07/11/17 19:53, Michael Butler wrote:
> >
> . . .
> >
> > Something about SVN r320844 causes almost all KDE applications to fail on
> > a signal 6.
> >
>
> I don't think that's possible, unless (a) your build hit the 'not
> everything in the kernel rebuilt' bug or (b) KDE is issuing raw CAM
> requests. Since I don't know KDE, don't run KDE or have any clue about KDE,
> I can't help you trace it down further.

FYI for Michael B.: the incomplete kernel rebuild problem has a fix:
-r320919 .
See the fix (to the building problem that was created in -r320220 ):

https://lists.freebsd.org/pipermail/svn-src-head/2017-July/102622.html

If the KDE problem persists based on a -r320919 or later build, it would
be appropriate to report it again as a separate issue.

Unfortunately various odd problems have shown up over -r320220 through
-r320918 from incorrect rebuilds (and other oddities overlapping in the
time frame).

Of course if you built (or build) -r320844 based on an empty directory in
the first place so that it was a full-build but the KDE problem persisted
when using the rebuilt kernel then the above material does not apply. In
such a case reporting that about the context for the KDE problem would be
appropriate.

You may well have other things to be doing instead of what the above
suggests. If so, just take the above as background information.

===
Mark Millard
markmi at dsl-only.net