On Mon Jun 2, 2025 at 3:25 AM PDT, Philipp Reisner wrote:
> Hi Christopher,
>
> Thanks for following up. The bug still annoys me from time to time.
> It triggered last on May 8, May 12, and May 18.
> The crash on May 18 was already with the 6.14.5 kernel.
>
>> Could this sleep wake issue also be ca
Hi Christopher,
Thanks for following up. The bug still annoys me from time to time.
It triggered last on May 8, May 12, and May 18.
The crash on May 18 was already with the 6.14.5 kernel.
> Could this sleep wake issue also be caused by a similar thing to the
> panics and SMU hangs I was experienc
On Mon Jan 13, 2025 at 1:55 AM PST, Christian König wrote:
> Am 13.01.25 um 09:43 schrieb Philipp Stanner:
>> [SNIP]
The handling of NULL values is half-baked.
In my opinion, you should define if drm_sched_pick_best() may put a NULL
into rq. If your answer is yes, it might
Am 13.01.25 um 09:43 schrieb Philipp Stanner:
[SNIP]
The handling of NULL values is half-baked.
In my opinion, you should define if drm_sched_pick_best() may put a NULL
into rq. If your answer is yes, it might put a NULL there; then, there
should be a BUG_ON(!entity->rq) after the invocation of
+cc Danilo
+cc myself
On Wed, 2025-01-08 at 09:19 +0100, Christian König wrote:
> Am 07.01.25 um 16:21 schrieb Philipp Reisner:
> > [...]
> > > > The OOPS happens because the rq member of entity is NULL in
> > > > drm_sched_job_arm() after the call to
> > > > drm_sched_entity_select_rq().
> > > >
Am 10.01.25 um 16:10 schrieb Alex Deucher:
On Fri, Jan 10, 2025 at 9:48 AM Christian König wrote:
Am 10.01.25 um 15:32 schrieb Philipp Reisner:
[...]
Take a look at those messages right before the crash:
Jän 10 07:58:14 ryzen9 kernel: [drm] scheduler comp_1.2.1 is not ready, skipping
Jän 10
On Fri, Jan 10, 2025 at 9:48 AM Christian König wrote:
>
> Am 10.01.25 um 15:32 schrieb Philipp Reisner:
> > [...]
> >> Take a look at those messages right before the crash:
> >>
> >> Jän 10 07:58:14 ryzen9 kernel: [drm] scheduler comp_1.2.1 is not ready, skipping
> >> Jän 10 07:58:14 ryzen9
Am 10.01.25 um 15:32 schrieb Philipp Reisner:
[...]
Take a look at those messages right before the crash:
Jän 10 07:58:14 ryzen9 kernel: [drm] scheduler comp_1.2.1 is not ready, skipping
Jän 10 07:58:14 ryzen9 kernel: [drm] scheduler comp_1.3.1 is not ready, skipping
That is basically a 100% c
[...]
> Take a look at those messages right before the crash:
>
> Jän 10 07:58:14 ryzen9 kernel: [drm] scheduler comp_1.2.1 is not ready, skipping
> Jän 10 07:58:14 ryzen9 kernel: [drm] scheduler comp_1.3.1 is not ready, skipping
>
> That is basically a 100% certain confirmation that an application
Am 10.01.25 um 08:37 schrieb Philipp Reisner:
[...]
Could this be due to amdgpu setting sched->ready when the rings are
finished initializing from long ago rather than when the scheduler has
been armed?
Yes and that is absolutely intentional.
Either the driver is not done with its resume yet,
[...]
> > Could this be due to amdgpu setting sched->ready when the rings are
> > finished initializing from long ago rather than when the scheduler has
> > been armed?
>
> Yes and that is absolutely intentional.
>
> Either the driver is not done with its resume yet, or it has already
> started it
Am 08.01.25 um 15:26 schrieb Alex Deucher:
On Tue, Jan 7, 2025 at 9:09 AM Christian König wrote:
Am 07.01.25 um 15:02 schrieb Philipp Reisner:
The following OOPS plagues me on about every 10th suspend and resume:
[160640.791304] BUG: kernel NULL pointer dereference, address: 0008
On Tue, Jan 7, 2025 at 9:09 AM Christian König wrote:
>
> Am 07.01.25 um 15:02 schrieb Philipp Reisner:
> > The following OOPS plagues me on about every 10th suspend and resume:
> >
> > [160640.791304] BUG: kernel NULL pointer dereference, address: 0008
> > [160640.791309] #PF: su
Am 07.01.25 um 16:21 schrieb Philipp Reisner:
[...]
The OOPS happens because the rq member of entity is NULL in
drm_sched_job_arm() after the call to drm_sched_entity_select_rq().
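
The failure pattern described here (a selection step that may store NULL, followed by an unconditional dereference) can be illustrated with a small user-space model. This is only a sketch: every model_* name below is hypothetical, none of it is the kernel's actual code, and returning -ENODEV is just one possible way to make the NULL case explicit, not a proposed patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical user-space model; these are NOT the kernel's structures. */
struct model_sched { int ready; };
struct model_rq { struct model_sched *sched; };
struct model_entity { struct model_rq *rq; };

/* Returns the first run queue whose scheduler is ready, or NULL if none is. */
static struct model_rq *model_pick_best(struct model_rq *rqs, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (rqs[i].sched && rqs[i].sched->ready)
            return &rqs[i];
    return NULL;
}

/* Mirrors the reported pattern: the result may be NULL and is stored as-is. */
static void model_select_rq(struct model_entity *e,
                            struct model_rq *rqs, size_t n)
{
    assert(e != NULL);
    e->rq = model_pick_best(rqs, n);
}

/* Checks rq before dereferencing it; reports -ENODEV instead of faulting. */
static int model_job_arm(const struct model_entity *e)
{
    assert(e != NULL);
    if (!e->rq)
        return -ENODEV;
    return e->rq->sched->ready ? 0 : -ENODEV;
}
```

In this model, calling model_select_rq() with no ready scheduler leaves e->rq as NULL, and model_job_arm() then returns an error where an unchecked e->rq->sched access would have faulted.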
In drm_sched_entity_select_rq(), the code considers that
drm_sched_pick_best() might return a NULL value. When NULL
The following OOPS plagues me on about every 10th suspend and resume:
[160640.791304] BUG: kernel NULL pointer dereference, address: 0008
[160640.791309] #PF: supervisor read access in kernel mode
[160640.791311] #PF: error_code(0x) - not-present page
[160640.791313] PGD 0 P4D 0
[1
[...]
> > The OOPS happens because the rq member of entity is NULL in
> > drm_sched_job_arm() after the call to drm_sched_entity_select_rq().
> >
> > In drm_sched_entity_select_rq(), the code considers that
> > drm_sched_pick_best() might return a NULL value. When NULL, it assigns
> > NULL to entit
Am 07.01.25 um 15:02 schrieb Philipp Reisner:
The following OOPS plagues me on about every 10th suspend and resume:
[160640.791304] BUG: kernel NULL pointer dereference, address: 0008
[160640.791309] #PF: supervisor read access in kernel mode
[160640.791311] #PF: error_code(0x) -