On 2020-11-08 22:45:21 +0000 Jonathan Gray <j...@jsg.id.au> wrote:

On Sun, Nov 08, 2020 at 03:21:33PM +0000, Anthony Richardby wrote:
On 2020-11-08 14:06:00 +0000 Jonathan Gray <j...@jsg.id.au> wrote:

On Sun, Nov 08, 2020 at 01:09:40PM +0000, Anthony Richardby wrote:
> On 2020-11-08 11:14:50 +0000 Jonathan Gray <j...@jsg.id.au> wrote:
> > > On Sun, Nov 08, 2020 at 10:51:11AM +0000, Anthony Richardby
wrote:
> > > On 2020-11-08 02:40:26 +0000 Jonathan Gray <j...@jsg.id.au>
wrote:
> > > > > On Sat, Nov 07, 2020 at 04:01:34PM +0000, Anthony
Richardby
> > wrote:
> > > > > Hey,
> > > > > I'm experiencing a kernel panic whenever I run glxgears,
or it
> > > > would
> > > > > seem when I try to run other gpu-related tasks. I first
> > noticed
> > > > this
> > > > > when playing video via the application 'mpv', after a
couple
> > of
> > > > > seconds of video the system freezes!! I've got the trace
> > output
> > > > for
> > > > > that as the time delay allows me to switch out of X11 to a
> > console
> > > > > before the system crashes. Anyway after this I tried
> > 'glxgears'
> > > > and
> > > > > that crashes immediately (so I'm unable to get the trace,
but
> > it
> > > > > sounds like it's related?).
> > > > > > Just wondering if anyone else is able to reproduce this
on
> > their
> > > > > powerbooks / other g4 macs, does glxgears work for you?
> > > > > > Pic of the trace output: https://postimg.cc/wRLGLKdF
> > > > > What is the actual panic message?
> > > > > For the archives the trace was:
> > > > > panic(b6ae0c) at panic+0x154
> > > > mtx_enter(0) at mtx_enter+0x8c
> > > > drm_update_vblank_count(2a0fcdf8,e000a000,b6ae78) at
> > > > drm_update_vblank_count+0x4b8
> > > > drm_handle_vblank(2000611,82720e) at drm_handle_vblank+0xd8
> > > > r100_irq_process(a19ca0) at r100_irq_process+0x110
> > > > radeon_driver_irq_handler_kms(0) at
> > > > radeon_driver_irq_handler_kms+0x28
> > > > openpic_ext_intr() at openpic_ext_intr+0x274
> > > > extint_call() at extint_call
> > > > --- interrupt ---
> > > > at 0xe7055e1c
> > > > drm_wait_vblank_ioctl(e0732f00,e06b9b84,e7055c60) at
> > > > drm_wait_vblank_ioctl+0xb2
> > > > drm_do_ioctl(251571d0) at drm_do_ioctl+0x298
> > > > drmioctl(25157100,e7055ca8,990000,c010643a,3a033428) at
> > > > drmioctl+0xbc
> > > > > >
> > > The panic message is:
> > > panic: mtx 0xe024cc50: locking against myself
> > > > I forgot to mention that this is new in 6.8, I can run
> > glxgears/play
> > > video with mpv fine using 6.7.
> > > The seqlock is used from an interrupt context
> > drm_update_vblank_count() -> store_vblank()
> > > Can you try a kernel with the following diff?
> > Patch against 6.8 branch.
> > > Index: sys/dev/pci/drm/drm_vblank.c
> >
===================================================================
> > RCS file: /cvs/src/sys/dev/pci/drm/drm_vblank.c,v
> > retrieving revision 1.6
> > diff -u -p -r1.6 drm_vblank.c
> > --- sys/dev/pci/drm/drm_vblank.c      29 Jul 2020 09:52:21 -0000      1.6
> > +++ sys/dev/pci/drm/drm_vblank.c      8 Nov 2020 11:04:58 -0000
> > @@ -481,7 +481,7 @@ int drm_vblank_init(struct drm_device *d
> >               init_waitqueue_head(&vblank->queue);
> >               setup_timer(&vblank->disable_timer, vblank_disable_fn,
> >                   (unsigned long)vblank);
> > -             seqlock_init(&vblank->seqlock, IPL_NONE);
> > +             seqlock_init(&vblank->seqlock, IPL_TTY);
> >       }
> > >  DRM_INFO("Supports vblank timestamp caching Rev 2
> > (21.10.2013).\n");
> > >
> Thanks for the time taken to look into this and offer this diff!
> Unfortunately its still crashing, the trace and panic message
look to
> be the same:
> > https://postimg.cc/G4KNQ9c6
> I've managed to use ports clang and lldb to build and inspect a
powerpc
drm_vblank.o and it points to
src/sys/dev/pci/drm/include/linux/atomic.h:165:2
which is the ilp32 version of atomic64_read().
> The following patch makes sure the lock involved is initialised.
> Index: sys/dev/pci/drm/drm_vblank.c
===================================================================
RCS file: /cvs/src/sys/dev/pci/drm/drm_vblank.c,v
retrieving revision 1.6
diff -u -p -r1.6 drm_vblank.c
--- sys/dev/pci/drm/drm_vblank.c        29 Jul 2020 09:52:21 -0000      1.6
+++ sys/dev/pci/drm/drm_vblank.c        8 Nov 2020 13:51:27 -0000
@@ -481,7 +481,8 @@ int drm_vblank_init(struct drm_device *d
                init_waitqueue_head(&vblank->queue);
                setup_timer(&vblank->disable_timer, vblank_disable_fn,
                    (unsigned long)vblank);
-               seqlock_init(&vblank->seqlock, IPL_NONE);
+               seqlock_init(&vblank->seqlock, IPL_TTY);
+               atomic64_set(&vblank->count, 0);
        }
>    DRM_INFO("Supports vblank timestamp caching Rev 2
(21.10.2013).\n");
> >
Absolute rockstar! That looks to have done the trick, I've been
watching a video in mpv while running glxgears for a good 10 minutes
and everything it's still up and running.

Many thanks for your help, I'm looking forward to updating my other
machine to 6.8 now that this is working.

There are other atomic64 uses where this could come up.
Can you try this diff to use a single preinitialised mutex?

Index: sys/dev/pci/drm/drm_linux.c
===================================================================
RCS file: /cvs/src/sys/dev/pci/drm/drm_linux.c,v
retrieving revision 1.63
diff -u -p -r1.63 drm_linux.c
--- sys/dev/pci/drm/drm_linux.c 26 Aug 2020 03:29:06 -0000      1.63
+++ sys/dev/pci/drm/drm_linux.c 8 Nov 2020 22:13:24 -0000
@@ -67,6 +67,7 @@ tasklet_run(void *arg)
        }
}

+struct mutex atomic64_mtx = MUTEX_INITIALIZER(IPL_HIGH);
struct mutex sch_mtx = MUTEX_INITIALIZER(IPL_SCHED);
volatile struct proc *sch_proc;
volatile void *sch_ident;
Index: sys/dev/pci/drm/drm_vblank.c
===================================================================
RCS file: /cvs/src/sys/dev/pci/drm/drm_vblank.c,v
retrieving revision 1.6
diff -u -p -r1.6 drm_vblank.c
--- sys/dev/pci/drm/drm_vblank.c        29 Jul 2020 09:52:21 -0000      1.6
+++ sys/dev/pci/drm/drm_vblank.c        8 Nov 2020 22:13:42 -0000
@@ -481,7 +481,7 @@ int drm_vblank_init(struct drm_device *d
                init_waitqueue_head(&vblank->queue);
                setup_timer(&vblank->disable_timer, vblank_disable_fn,
                    (unsigned long)vblank);
-               seqlock_init(&vblank->seqlock, IPL_NONE);
+               seqlock_init(&vblank->seqlock, IPL_TTY);
        }

        DRM_INFO("Supports vblank timestamp caching Rev 2 (21.10.2013).\n");
Index: sys/dev/pci/drm/include/linux/atomic.h
===================================================================
RCS file: /cvs/src/sys/dev/pci/drm/include/linux/atomic.h,v
retrieving revision 1.10
diff -u -p -r1.10 atomic.h
--- sys/dev/pci/drm/include/linux/atomic.h      17 Jun 2020 01:03:57
-0000   1.10
+++ sys/dev/pci/drm/include/linux/atomic.h      8 Nov 2020 22:12:00 -0000
@@ -143,18 +143,20 @@ atomic64_xchg(volatile int64_t *v, int64

#else

+extern struct mutex atomic64_mtx;
+
typedef struct {
        volatile int64_t val;
-       struct mutex lock;
} atomic64_t;

-#define ATOMIC64_INIT(x)       { (x), .lock = MUTEX_INITIALIZER(IPL_HIGH) }
+#define ATOMIC64_INIT(x)       { (x) }

static inline void
atomic64_set(atomic64_t *v, int64_t i)
{
-       mtx_init(&v->lock, IPL_HIGH);
+       mtx_enter(&atomic64_mtx);
        v->val = i;
+       mtx_leave(&atomic64_mtx);
}

static inline int64_t
@@ -162,9 +164,9 @@ atomic64_read(atomic64_t *v)
{
        int64_t val;

-       mtx_enter(&v->lock);
+       mtx_enter(&atomic64_mtx);
        val = v->val;
-       mtx_leave(&v->lock);
+       mtx_leave(&atomic64_mtx);

        return val;
}
@@ -174,10 +176,10 @@ atomic64_xchg(atomic64_t *v, int64_t n)
{
        int64_t val;

-       mtx_enter(&v->lock);
+       mtx_enter(&atomic64_mtx);
        val = v->val;
        v->val = n;
-       mtx_leave(&v->lock);
+       mtx_leave(&atomic64_mtx);

        return val;
}
@@ -185,9 +187,9 @@ atomic64_xchg(atomic64_t *v, int64_t n)
static inline void
atomic64_add(int i, atomic64_t *v)
{
-       mtx_enter(&v->lock);
+       mtx_enter(&atomic64_mtx);
        v->val += i;
-       mtx_leave(&v->lock);
+       mtx_leave(&atomic64_mtx);
}

#define atomic64_inc(p)         atomic64_add(p, 1)
@@ -197,10 +199,10 @@ atomic64_add_return(int i, atomic64_t *v
{
        int64_t val;

-       mtx_enter(&v->lock);
+       mtx_enter(&atomic64_mtx);
        val = v->val + i;
        v->val = val;
-       mtx_leave(&v->lock);
+       mtx_leave(&atomic64_mtx);

        return val;
}
@@ -210,9 +212,9 @@ atomic64_add_return(int i, atomic64_t *v
static inline void
atomic64_sub(int i, atomic64_t *v)
{
-       mtx_enter(&v->lock);
+       mtx_enter(&atomic64_mtx);
        v->val -= i;
-       mtx_leave(&v->lock);
+       mtx_leave(&atomic64_mtx);
}
#endif





I've just given this latest diff a shot.
I'm testing it the same was as I did the last, running a video +
glxgears - and everything looks to be working well.

Reply via email to