On Mon, Nov 09, 2020 at 07:39:09PM +0000, Anthony Richardby wrote: > On 2020-11-08 22:45:21 +0000 Jonathan Gray <j...@jsg.id.au> wrote: > > > On Sun, Nov 08, 2020 at 03:21:33PM +0000, Anthony Richardby wrote: > > > On 2020-11-08 14:06:00 +0000 Jonathan Gray <j...@jsg.id.au> wrote: > > > > > > On Sun, Nov 08, 2020 at 01:09:40PM +0000, Anthony Richardby wrote: > > > > > On 2020-11-08 11:14:50 +0000 Jonathan Gray <j...@jsg.id.au> wrote: > > > > > > > On Sun, Nov 08, 2020 at 10:51:11AM +0000, Anthony Richardby > > > > wrote: > > > > > > > On 2020-11-08 02:40:26 +0000 Jonathan Gray <j...@jsg.id.au> > > > > wrote: > > > > > > > > > On Sat, Nov 07, 2020 at 04:01:34PM +0000, Anthony > > > > Richardby > > > > > > wrote: > > > > > > > > > Hey, > > > > > > > > > I'm experiencing a kernel panic whenever I run glxgears, > > > > or it > > > > > > > > would > > > > > > > > > seem when I try to run other gpu-related tasks. I first > > > > > > noticed > > > > > > > > this > > > > > > > > > when playing video via the application 'mpv', after a > > > > couple > > > > > > of > > > > > > > > > seconds of video the system freezes!! I've got the trace > > > > > > output > > > > > > > > for > > > > > > > > > that as the time delay allows me to switch out of X11 to a > > > > > > console > > > > > > > > > before the system crashes. Anyway after this I tried > > > > > > 'glxgears' > > > > > > > > and > > > > > > > > > that crashes immediately (so I'm unable to get the trace, > > > > but > > > > > > it > > > > > > > > > sounds like it's related?). > > > > > > > > > > Just wondering if anyone else is able to reproduce this > > > > on > > > > > > their > > > > > > > > > powerbooks / other g4 macs, does glxgears work for you? > > > > > > > > > > Pic of the trace output: https://postimg.cc/wRLGLKdF > > > > > > > > > What is the actual panic message? > > > > > > > > > For the archives the trace was: > > > > > > > > > panic(b6ae0c) at panic+0x154 > > > > > > > > mtx_enter(0) at mtx_enter+0x8c > > > > > > > > drm_update_vblank_count(2a0fcdf8,e000a000,b6ae78) at > > > > > > > > drm_update_vblank_count+0x4b8 > > > > > > > > drm_handle_vblank(2000611,82720e) at drm_handle_vblank+0xd8 > > > > > > > > r100_irq_process(a19ca0) at r100_irq_process+0x110 > > > > > > > > radeon_driver_irq_handler_kms(0) at > > > > > > > > radeon_driver_irq_handler_kms+0x28 > > > > > > > > openpic_ext_intr() at openpic_ext_intr+0x274 > > > > > > > > extint_call() at extint_call > > > > > > > > --- interrupt --- > > > > > > > > at 0xe7055e1c > > > > > > > > drm_wait_vblank_ioctl(e0732f00,e06b9b84,e7055c60) at > > > > > > > > drm_wait_vblank_ioctl+0xb2 > > > > > > > > drm_do_ioctl(251571d0) at drm_do_ioctl+0x298 > > > > > > > > drmioctl(25157100,e7055ca8,990000,c010643a,3a033428) at > > > > > > > > drmioctl+0xbc > > > > > > > > > > > > > > > > > The panic message is: > > > > > > > panic: mtx 0xe024cc50: locking against myself > > > > > > > > I forgot to mention that this is new in 6.8, I can run > > > > > > glxgears/play > > > > > > > video with mpv fine using 6.7. > > > > > > > The seqlock is used from an interrupt context > > > > > > drm_update_vblank_count() -> store_vblank() > > > > > > > Can you try a kernel with the following diff? > > > > > > Patch against 6.8 branch. > > > > > > > Index: sys/dev/pci/drm/drm_vblank.c > > > > > > > > > > =================================================================== > > > > > > RCS file: /cvs/src/sys/dev/pci/drm/drm_vblank.c,v > > > > > > retrieving revision 1.6 > > > > > > diff -u -p -r1.6 drm_vblank.c > > > > > > --- sys/dev/pci/drm/drm_vblank.c 29 Jul 2020 09:52:21 -0000 > > > > > > 1.6 > > > > > > +++ sys/dev/pci/drm/drm_vblank.c 8 Nov 2020 11:04:58 -0000 > > > > > > @@ -481,7 +481,7 @@ int drm_vblank_init(struct drm_device *d > > > > > > init_waitqueue_head(&vblank->queue); > > > > > > setup_timer(&vblank->disable_timer, vblank_disable_fn, > > > > > > (unsigned long)vblank); > > > > > > - seqlock_init(&vblank->seqlock, IPL_NONE); > > > > > > + seqlock_init(&vblank->seqlock, IPL_TTY); > > > > > > } > > > > > > > DRM_INFO("Supports vblank timestamp caching Rev 2 > > > > > > (21.10.2013).\n"); > > > > > > > > > > > > Thanks for the time taken to look into this and offer this diff! > > > > > Unfortunately its still crashing, the trace and panic message > > > > look to > > > > > be the same: > > > > > > https://postimg.cc/G4KNQ9c6 > > > > > I've managed to use ports clang and lldb to build and inspect a > > > > powerpc > > > > drm_vblank.o and it points to > > > > src/sys/dev/pci/drm/include/linux/atomic.h:165:2 > > > > which is the ilp32 version of atomic64_read(). > > > > > The following patch makes sure the lock involved is initialised. > > > > > Index: sys/dev/pci/drm/drm_vblank.c > > > > =================================================================== > > > > RCS file: /cvs/src/sys/dev/pci/drm/drm_vblank.c,v > > > > retrieving revision 1.6 > > > > diff -u -p -r1.6 drm_vblank.c > > > > --- sys/dev/pci/drm/drm_vblank.c 29 Jul 2020 09:52:21 -0000 > > > > 1.6 > > > > +++ sys/dev/pci/drm/drm_vblank.c 8 Nov 2020 13:51:27 -0000 > > > > @@ -481,7 +481,8 @@ int drm_vblank_init(struct drm_device *d > > > > init_waitqueue_head(&vblank->queue); > > > > setup_timer(&vblank->disable_timer, vblank_disable_fn, > > > > (unsigned long)vblank); > > > > - seqlock_init(&vblank->seqlock, IPL_NONE); > > > > + seqlock_init(&vblank->seqlock, IPL_TTY); > > > > + atomic64_set(&vblank->count, 0); > > > > } > > > > > DRM_INFO("Supports vblank timestamp caching Rev 2 > > > > (21.10.2013).\n"); > > > > > > > > > Absolute rockstar! That looks to have done the trick, I've been > > > watching a video in mpv while running glxgears for a good 10 minutes > > > and everything it's still up and running. > > > > > Many thanks for your help, I'm looking forward to updating my other > > > machine to 6.8 now that this is working. > > > > There are other atomic64 uses where this could come up. > > Can you try this diff to use a single preinitialised mutex? > > > > Index: sys/dev/pci/drm/drm_linux.c > > =================================================================== > > RCS file: /cvs/src/sys/dev/pci/drm/drm_linux.c,v > > retrieving revision 1.63 > > diff -u -p -r1.63 drm_linux.c > > --- sys/dev/pci/drm/drm_linux.c 26 Aug 2020 03:29:06 -0000 1.63 > > +++ sys/dev/pci/drm/drm_linux.c 8 Nov 2020 22:13:24 -0000 > > @@ -67,6 +67,7 @@ tasklet_run(void *arg) > > } > > } > > > > +struct mutex atomic64_mtx = MUTEX_INITIALIZER(IPL_HIGH); > > struct mutex sch_mtx = MUTEX_INITIALIZER(IPL_SCHED); > > volatile struct proc *sch_proc; > > volatile void *sch_ident; > > Index: sys/dev/pci/drm/drm_vblank.c > > =================================================================== > > RCS file: /cvs/src/sys/dev/pci/drm/drm_vblank.c,v > > retrieving revision 1.6 > > diff -u -p -r1.6 drm_vblank.c > > --- sys/dev/pci/drm/drm_vblank.c 29 Jul 2020 09:52:21 -0000 1.6 > > +++ sys/dev/pci/drm/drm_vblank.c 8 Nov 2020 22:13:42 -0000 > > @@ -481,7 +481,7 @@ int drm_vblank_init(struct drm_device *d > > init_waitqueue_head(&vblank->queue); > > setup_timer(&vblank->disable_timer, vblank_disable_fn, > > (unsigned long)vblank); > > - seqlock_init(&vblank->seqlock, IPL_NONE); > > + seqlock_init(&vblank->seqlock, IPL_TTY); > > } > > > > DRM_INFO("Supports vblank timestamp caching Rev 2 (21.10.2013).\n"); > > Index: sys/dev/pci/drm/include/linux/atomic.h > > =================================================================== > > RCS file: /cvs/src/sys/dev/pci/drm/include/linux/atomic.h,v > > retrieving revision 1.10 > > diff -u -p -r1.10 atomic.h > > --- sys/dev/pci/drm/include/linux/atomic.h 17 Jun 2020 01:03:57 > > -0000 1.10 > > +++ sys/dev/pci/drm/include/linux/atomic.h 8 Nov 2020 22:12:00 -0000 > > @@ -143,18 +143,20 @@ atomic64_xchg(volatile int64_t *v, int64 > > > > #else > > > > +extern struct mutex atomic64_mtx; > > + > > typedef struct { > > volatile int64_t val; > > - struct mutex lock; > > } atomic64_t; > > > > -#define ATOMIC64_INIT(x) { (x), .lock = MUTEX_INITIALIZER(IPL_HIGH) } > > +#define ATOMIC64_INIT(x) { (x) } > > > > static inline void > > atomic64_set(atomic64_t *v, int64_t i) > > { > > - mtx_init(&v->lock, IPL_HIGH); > > + mtx_enter(&atomic64_mtx); > > v->val = i; > > + mtx_leave(&atomic64_mtx); > > } > > > > static inline int64_t > > @@ -162,9 +164,9 @@ atomic64_read(atomic64_t *v) > > { > > int64_t val; > > > > - mtx_enter(&v->lock); > > + mtx_enter(&atomic64_mtx); > > val = v->val; > > - mtx_leave(&v->lock); > > + mtx_leave(&atomic64_mtx); > > > > return val; > > } > > @@ -174,10 +176,10 @@ atomic64_xchg(atomic64_t *v, int64_t n) > > { > > int64_t val; > > > > - mtx_enter(&v->lock); > > + mtx_enter(&atomic64_mtx); > > val = v->val; > > v->val = n; > > - mtx_leave(&v->lock); > > + mtx_leave(&atomic64_mtx); > > > > return val; > > } > > @@ -185,9 +187,9 @@ atomic64_xchg(atomic64_t *v, int64_t n) > > static inline void > > atomic64_add(int i, atomic64_t *v) > > { > > - mtx_enter(&v->lock); > > + mtx_enter(&atomic64_mtx); > > v->val += i; > > - mtx_leave(&v->lock); > > + mtx_leave(&atomic64_mtx); > > } > > > > #define atomic64_inc(p) atomic64_add(p, 1) > > @@ -197,10 +199,10 @@ atomic64_add_return(int i, atomic64_t *v > > { > > int64_t val; > > > > - mtx_enter(&v->lock); > > + mtx_enter(&atomic64_mtx); > > val = v->val + i; > > v->val = val; > > - mtx_leave(&v->lock); > > + mtx_leave(&atomic64_mtx); > > > > return val; > > } > > @@ -210,9 +212,9 @@ atomic64_add_return(int i, atomic64_t *v > > static inline void > > atomic64_sub(int i, atomic64_t *v) > > { > > - mtx_enter(&v->lock); > > + mtx_enter(&atomic64_mtx); > > v->val -= i; > > - mtx_leave(&v->lock); > > + mtx_leave(&atomic64_mtx); > > } > > #endif > > > > > > > > > > I've just given this latest diff a shot. > I'm testing it the same was as I did the last, running a video + > glxgears - and everything looks to be working well.
Thanks for the report and testing. I've committed this to -current.