On Sun, Nov 08, 2020 at 03:21:33PM +0000, Anthony Richardby wrote:
> On 2020-11-08 14:06:00 +0000 Jonathan Gray <j...@jsg.id.au> wrote:
> 
> > On Sun, Nov 08, 2020 at 01:09:40PM +0000, Anthony Richardby wrote:
> > > On 2020-11-08 11:14:50 +0000 Jonathan Gray <j...@jsg.id.au> wrote:
> > 
> > > > On Sun, Nov 08, 2020 at 10:51:11AM +0000, Anthony Richardby wrote:
> > > > > On 2020-11-08 02:40:26 +0000 Jonathan Gray <j...@jsg.id.au> wrote:
> > > > > > > On Sat, Nov 07, 2020 at 04:01:34PM +0000, Anthony Richardby
> > > > wrote:
> > > > > > > Hey,
> > > > > > > I'm experiencing a kernel panic whenever I run glxgears, or it
> > > > > > would
> > > > > > > seem when I try to run other gpu-related tasks. I first
> > > > noticed
> > > > > > this
> > > > > > > when playing video via the application 'mpv', after a couple
> > > > of
> > > > > > > seconds of video the system freezes!! I've got the trace
> > > > output
> > > > > > for
> > > > > > > that as the time delay allows me to switch out of X11 to a
> > > > console
> > > > > > > before the system crashes. Anyway after this I tried
> > > > 'glxgears'
> > > > > > and
> > > > > > > that crashes immediately (so I'm unable to get the trace, but
> > > > it
> > > > > > > sounds like it's related?).
> > > > > > > > Just wondering if anyone else is able to reproduce this on
> > > > their
> > > > > > > powerbooks / other g4 macs, does glxgears work for you?
> > > > > > > > Pic of the trace output: https://postimg.cc/wRLGLKdF
> > > > > > > What is the actual panic message?
> > > > > > > For the archives the trace was:
> > > > > > > panic(b6ae0c) at panic+0x154
> > > > > > mtx_enter(0) at mtx_enter+0x8c
> > > > > > drm_update_vblank_count(2a0fcdf8,e000a000,b6ae78) at
> > > > > > drm_update_vblank_count+0x4b8
> > > > > > drm_handle_vblank(2000611,82720e) at drm_handle_vblank+0xd8
> > > > > > r100_irq_process(a19ca0) at r100_irq_process+0x110
> > > > > > radeon_driver_irq_handler_kms(0) at
> > > > > > radeon_driver_irq_handler_kms+0x28
> > > > > > openpic_ext_intr() at openpic_ext_intr+0x274
> > > > > > extint_call() at extint_call
> > > > > > --- interrupt ---
> > > > > > at 0xe7055e1c
> > > > > > drm_wait_vblank_ioctl(e0732f00,e06b9b84,e7055c60) at
> > > > > > drm_wait_vblank_ioctl+0xb2
> > > > > > drm_do_ioctl(251571d0) at drm_do_ioctl+0x298
> > > > > > drmioctl(25157100,e7055ca8,990000,c010643a,3a033428) at
> > > > > > drmioctl+0xbc
> > > > > > > >
> > > > > The panic message is:
> > > > > panic: mtx 0xe024cc50: locking against myself
> > > > > > I forgot to mention that this is new in 6.8, I can run
> > > > glxgears/play
> > > > > video with mpv fine using 6.7.
> > > > > The seqlock is used from an interrupt context
> > > > drm_update_vblank_count() -> store_vblank()
> > > > > Can you try a kernel with the following diff?
> > > > Patch against 6.8 branch.
> > > > > Index: sys/dev/pci/drm/drm_vblank.c
> > > > ===================================================================
> > > > RCS file: /cvs/src/sys/dev/pci/drm/drm_vblank.c,v
> > > > retrieving revision 1.6
> > > > diff -u -p -r1.6 drm_vblank.c
> > > > --- sys/dev/pci/drm/drm_vblank.c        29 Jul 2020 09:52:21 -0000      
> > > > 1.6
> > > > +++ sys/dev/pci/drm/drm_vblank.c        8 Nov 2020 11:04:58 -0000
> > > > @@ -481,7 +481,7 @@ int drm_vblank_init(struct drm_device *d
> > > >                 init_waitqueue_head(&vblank->queue);
> > > >                 setup_timer(&vblank->disable_timer, vblank_disable_fn,
> > > >                     (unsigned long)vblank);
> > > > -               seqlock_init(&vblank->seqlock, IPL_NONE);
> > > > +               seqlock_init(&vblank->seqlock, IPL_TTY);
> > > >         }
> > > > >       DRM_INFO("Supports vblank timestamp caching Rev 2
> > > > (21.10.2013).\n");
> > > > >
> > > Thanks for the time taken to look into this and offer this diff!
> > > Unfortunately its still crashing, the trace and panic message look to
> > > be the same:
> > 
> > > https://postimg.cc/G4KNQ9c6
> > 
> > I've managed to use ports clang and lldb to build and inspect a
> > powerpc
> > drm_vblank.o and it points to
> > src/sys/dev/pci/drm/include/linux/atomic.h:165:2
> > which is the ilp32 version of atomic64_read().
> > 
> > The following patch makes sure the lock involved is initialised.
> > 
> > Index: sys/dev/pci/drm/drm_vblank.c
> > ===================================================================
> > RCS file: /cvs/src/sys/dev/pci/drm/drm_vblank.c,v
> > retrieving revision 1.6
> > diff -u -p -r1.6 drm_vblank.c
> > --- sys/dev/pci/drm/drm_vblank.c    29 Jul 2020 09:52:21 -0000      1.6
> > +++ sys/dev/pci/drm/drm_vblank.c    8 Nov 2020 13:51:27 -0000
> > @@ -481,7 +481,8 @@ int drm_vblank_init(struct drm_device *d
> >             init_waitqueue_head(&vblank->queue);
> >             setup_timer(&vblank->disable_timer, vblank_disable_fn,
> >                 (unsigned long)vblank);
> > -           seqlock_init(&vblank->seqlock, IPL_NONE);
> > +           seqlock_init(&vblank->seqlock, IPL_TTY);
> > +           atomic64_set(&vblank->count, 0);
> >     }
> > 
> >     DRM_INFO("Supports vblank timestamp caching Rev 2 (21.10.2013).\n");
> > 
> > 
> > 
> 
> Absolute rockstar! That looks to have done the trick, I've been
> watching a video in mpv while running glxgears for a good 10 minutes
> and everything it's still up and running.
> 
> Many thanks for your help, I'm looking forward to updating my other
> machine to 6.8 now that this is working.

There are other atomic64 uses where this could come up.
Can you try this diff to use a single preinitialised mutex?

Index: sys/dev/pci/drm/drm_linux.c
===================================================================
RCS file: /cvs/src/sys/dev/pci/drm/drm_linux.c,v
retrieving revision 1.63
diff -u -p -r1.63 drm_linux.c
--- sys/dev/pci/drm/drm_linux.c 26 Aug 2020 03:29:06 -0000      1.63
+++ sys/dev/pci/drm/drm_linux.c 8 Nov 2020 22:13:24 -0000
@@ -67,6 +67,7 @@ tasklet_run(void *arg)
        }
 }
 
+struct mutex atomic64_mtx = MUTEX_INITIALIZER(IPL_HIGH);
 struct mutex sch_mtx = MUTEX_INITIALIZER(IPL_SCHED);
 volatile struct proc *sch_proc;
 volatile void *sch_ident;
Index: sys/dev/pci/drm/drm_vblank.c
===================================================================
RCS file: /cvs/src/sys/dev/pci/drm/drm_vblank.c,v
retrieving revision 1.6
diff -u -p -r1.6 drm_vblank.c
--- sys/dev/pci/drm/drm_vblank.c        29 Jul 2020 09:52:21 -0000      1.6
+++ sys/dev/pci/drm/drm_vblank.c        8 Nov 2020 22:13:42 -0000
@@ -481,7 +481,7 @@ int drm_vblank_init(struct drm_device *d
                init_waitqueue_head(&vblank->queue);
                setup_timer(&vblank->disable_timer, vblank_disable_fn,
                    (unsigned long)vblank);
-               seqlock_init(&vblank->seqlock, IPL_NONE);
+               seqlock_init(&vblank->seqlock, IPL_TTY);
        }
 
        DRM_INFO("Supports vblank timestamp caching Rev 2 (21.10.2013).\n");
Index: sys/dev/pci/drm/include/linux/atomic.h
===================================================================
RCS file: /cvs/src/sys/dev/pci/drm/include/linux/atomic.h,v
retrieving revision 1.10
diff -u -p -r1.10 atomic.h
--- sys/dev/pci/drm/include/linux/atomic.h      17 Jun 2020 01:03:57 -0000      
1.10
+++ sys/dev/pci/drm/include/linux/atomic.h      8 Nov 2020 22:12:00 -0000
@@ -143,18 +143,20 @@ atomic64_xchg(volatile int64_t *v, int64
 
 #else
 
+extern struct mutex atomic64_mtx;
+
 typedef struct {
        volatile int64_t val;
-       struct mutex lock;
 } atomic64_t;
 
-#define ATOMIC64_INIT(x)       { (x), .lock = MUTEX_INITIALIZER(IPL_HIGH) }
+#define ATOMIC64_INIT(x)       { (x) }
 
 static inline void
 atomic64_set(atomic64_t *v, int64_t i)
 {
-       mtx_init(&v->lock, IPL_HIGH);
+       mtx_enter(&atomic64_mtx);
        v->val = i;
+       mtx_leave(&atomic64_mtx);
 }
 
 static inline int64_t
@@ -162,9 +164,9 @@ atomic64_read(atomic64_t *v)
 {
        int64_t val;
 
-       mtx_enter(&v->lock);
+       mtx_enter(&atomic64_mtx);
        val = v->val;
-       mtx_leave(&v->lock);
+       mtx_leave(&atomic64_mtx);
 
        return val;
 }
@@ -174,10 +176,10 @@ atomic64_xchg(atomic64_t *v, int64_t n)
 {
        int64_t val;
 
-       mtx_enter(&v->lock);
+       mtx_enter(&atomic64_mtx);
        val = v->val;
        v->val = n;
-       mtx_leave(&v->lock);
+       mtx_leave(&atomic64_mtx);
 
        return val;
 }
@@ -185,9 +187,9 @@ atomic64_xchg(atomic64_t *v, int64_t n)
 static inline void
 atomic64_add(int i, atomic64_t *v)
 {
-       mtx_enter(&v->lock);
+       mtx_enter(&atomic64_mtx);
        v->val += i;
-       mtx_leave(&v->lock);
+       mtx_leave(&atomic64_mtx);
 }
 
 #define atomic64_inc(p)                atomic64_add(p, 1)
@@ -197,10 +199,10 @@ atomic64_add_return(int i, atomic64_t *v
 {
        int64_t val;
 
-       mtx_enter(&v->lock);
+       mtx_enter(&atomic64_mtx);
        val = v->val + i;
        v->val = val;
-       mtx_leave(&v->lock);
+       mtx_leave(&atomic64_mtx);
 
        return val;
 }
@@ -210,9 +212,9 @@ atomic64_add_return(int i, atomic64_t *v
 static inline void
 atomic64_sub(int i, atomic64_t *v)
 {
-       mtx_enter(&v->lock);
+       mtx_enter(&atomic64_mtx);
        v->val -= i;
-       mtx_leave(&v->lock);
+       mtx_leave(&atomic64_mtx);
 }
 #endif
 

Reply via email to