On Fri, Jul 06, 2018 at 09:52:24AM +0200, Niclas Zeising wrote:
> On 07/06/18 00:02, Warner Losh wrote:
> >
> >
> > On Thu, Jul 5, 2018 at 1:44 PM, John Baldwin <j...@freebsd.org> wrote:
> >
> > On 7/5/18 12:36 PM, Konstantin Belousov wrote:
> > > On Thu, Jul 05, 2018 at 09:12:24PM +0200, Hans Petter Selasky wrote:
> > >> On 07/05/18 20:59, Hans Petter Selasky wrote:
> > >>> On 07/05/18 19:48, Pete Wright wrote:
> > >>>>
> > >>>>
> > >>>> On 07/05/2018 10:10, John Baldwin wrote:
> > >>>>> On 7/3/18 5:10 PM, Pete Wright wrote:
> > >>>>>>
> > >>>>>> On 07/03/2018 15:56, John Baldwin wrote:
> > >>>>>>> On 7/3/18 3:34 PM, Pete Wright wrote:
> > >>>>>>>> On 07/03/2018 15:29, John Baldwin wrote:
> > >>>>>>>>> That seems like kgdb is looking at the wrong CPU. Can you use
> > >>>>>>>>> 'info threads' and look for threads not stopped in 'sched_switch'
> > >>>>>>>>> and get their backtraces? You could also just do 'thread apply
> > >>>>>>>>> all bt' and put that file at a URL if that is easiest.
> > >>>>>>>>>
> > >>>>>>>> sure thing John - here's a gist of "thread apply all bt"
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> https://gist.github.com/gem-pete/d8d7ab220dc8781f0827f965f09d43ed
> > >>>>>>> That doesn't look right at all. Are you sure the kernel matches the
> > >>>>>>> vmcore? Also, which kgdb version are you using?
> > >>>>>>>
> > >>>>>> yea i agree that doesn't look right at all. here is my setup:
> > >>>>>>
> > >>>>>> $ which kgdb
> > >>>>>> /usr/bin/kgdb
> > >>>>>> $ kgdb
> > >>>>>> GNU gdb 6.1.1 [FreeBSD]
> > >>>>>> $ ls -lh /var/crash/vmcore.1
> > >>>>>> -rw------- 1 root wheel 1.6G Jul 3 15:03 /var/crash/vmcore.1
> > >>>>>> $ ls -l /usr/lib/debug/boot/kernel/kernel.debug
> > >>>>>> -r-xr-xr-x 1 root wheel 87840496 Jul 3 13:54
> > >>>>>> /usr/lib/debug/boot/kernel/kernel.debug
> > >>>>>>
> > >>>>>> and i invoke kgdb like so:
> > >>>>>> $ sudo kgdb /usr/lib/debug/boot/kernel/kernel.debug /var/crash/vmcore.1
> > >>>>>>
> > >>>>>> here's a gist of my full gdb session:
> > >>>>>> http://termbin.com/krsn
> > >>>>>>
> > >>>>>> dunno - maybe i have a bad core dump? regardless, more than happy to
> > >>>>>> help so let me know if i should try anything else or patches etc..
> > >>>>> Can you try installing gdb from ports and using /usr/local/bin/kgdb?
> > >>>>>
> > >>>>
> > >>>> that seems to have done the trick, at least the output looks more
> > >>>> encouraging.
> > >>>>
> > >>>> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
> > >>>> KDB: enter: panic
> > >>>>
> > >>>> __curthread () at ./machine/pcpu.h:231
> > >>>> 231 __asm("movq %%gs:%1,%0" : "=r" (td)
> > >>>>
> > >>>>
> > >>>> here's my full kgdb session:
> > >>>> http://termbin.com/qa4f
> > >>>>
> > >>>> i don't see any threads not in "sched_switch" though :(
> > >>>
> > >>> Hi,
> > >>>
> > >>> The problem may be that the patch to enable atomic inlining of all
> > >>> macros forgot to set the SMP keyword which means SMP is not defined at
> > >>> all for KLD's so all non-kernel atomic usage is with MPLOCKED empty!
> > > Problem is that out-of-tree modules build does not have opt*.h files
> > > from the kernel. UP config is a valid one, flipping some option's
> > > default value does not solve the problem.
> >
> > Yes, but using the lock prefix in a generic module is ok (it will still
> > work, just not quite as fast) whereas the lack of lock is fatal on
> > SMP. I would amend Hans' patch slightly to honor the opt_* setting
> > for KLD_TIED (but that is only true if KLD_TIED means "built as part of
> > a kernel build, so has valid opt_foo.h headers" and not
> > 'a standalone module where someone put MODULES_TIED=1 on the command line
> > to make').
> >
> >
> > I agree with this default. It's sensible to default to (a) the most
> > popular thing and (b) thing that always works, especially when (a) and
> > (b) are identical.
> >
> > Don't make me start the "Do we really need an SMP option, why not make
> > it always on" thread :) The number of relevant uniprocessor x86 boxes
> > that benefit from omitting SMP is so small as to be irrelevant, IMHO. A
> > MP kernel runs just fine on them...
> >
> > Warner
>
> Where are we on this?
> It is important to get it fixed, it's already been 4 days, which means 4
> days of all modern FreeBSD desktop systems being broken, and possibly
> other systems with kernel modules from ports as well.
>
>
> Another question, how hard would it be to expose how the kernel was
> built to modules built from ports, so that they can figure out stuff
> like SMP and others, that might affect the module build?
Point the KERNBUILDDIR variable to the directory of the kernel build.
This is the directory where *.o and opt*.h are located. Then everything
would just work.
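
As a rough illustration of the suggestion above, the sketch below checks
that a directory contains opt_*.h headers before it is used as
KERNBUILDDIR. The helper name has_kernel_opts and the default object
directory path are assumptions made for this example, not something taken
from the thread; adjust the path to wherever the local kernel was built.
The script only prints the make command it would run.

```shell
#!/bin/sh
# Hypothetical helper (not part of the FreeBSD build system): succeed if
# the given directory holds at least one opt_*.h header, which is what a
# kernel build directory is expected to contain.
has_kernel_opts() {
    dir=$1
    for f in "$dir"/opt_*.h; do
        # If the glob matched nothing, $f is the literal pattern and
        # the -f test fails.
        [ -f "$f" ] && return 0
    done
    return 1
}

# Assumed default location; override by setting KERNBUILDDIR beforehand.
KERNBUILDDIR=${KERNBUILDDIR:-/usr/obj/usr/src/amd64.amd64/sys/GENERIC}

if has_kernel_opts "$KERNBUILDDIR"; then
    # Run from the module's source directory.
    echo "make KERNBUILDDIR=$KERNBUILDDIR"
else
    echo "no opt_*.h found in $KERNBUILDDIR" >&2
fi
```

With the variable set this way, a module build should pick up the same
option headers as the kernel it will be loaded into, rather than falling
back to defaults.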