Hi Felix,
On Wed, 2023-05-03 at 11:08 -0400, Felix Kuehling wrote:
> That's the worst-case scenario where you're debugging HW or FW
> issues.
> Those should be pretty rare post-bringup. But are there hangs caused
> by
> user mode driver or application bugs that are easier to debug and
> probabl
On 03.05.23 at 21:14, André Almeida wrote:
On 03/05/2023 14:43, Timur Kristóf wrote:
Hi Felix,
On Wed, 2023-05-03 at 11:08 -0400, Felix Kuehling wrote:
That's the worst-case scenario where you're debugging HW or FW
issues.
Those should be pretty rare post-bringup. But are there hangs cause
On Wed, May 3, 2023, 14:53 André Almeida wrote:
> On 03/05/2023 14:08, Marek Olšák wrote:
> > GPU hangs are pretty common post-bringup. They are not common per user,
> > but if we gather all hangs from all users, we can have lots and lots of
> > them.
> >
> > GPU hangs are indeed not very debu
On 03/05/2023 14:43, Timur Kristóf wrote:
Hi Felix,
On Wed, 2023-05-03 at 11:08 -0400, Felix Kuehling wrote:
That's the worst-case scenario where you're debugging HW or FW
issues.
Those should be pretty rare post-bringup. But are there hangs caused
by
user mode driver or application bugs tha
On 03/05/2023 14:08, Marek Olšák wrote:
GPU hangs are pretty common post-bringup. They are not common per user,
but if we gather all hangs from all users, we can have lots and lots of
them.
GPU hangs are indeed not very debuggable. There are however some things
we can do:
- Identify the h
WRITE_DATA with ENGINE=PFP will execute the packet on the frontend engine,
while ENGINE=ME will execute the packet on the backend engine.
Marek
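
To illustrate the PFP/ME distinction described above, here is a minimal sketch of how a WRITE_DATA packet with a selectable engine could be assembled. The opcode and bit positions are assumptions based on my recollection of the amdgpu PM4 definitions, not something stated in this thread, and the helper names (packet3_header, write_data_packet) are hypothetical. Check the values against the kernel headers for the target ASIC before relying on them.

```python
# Sketch only: every constant below is an ASSUMPTION about the PM4 layout,
# recalled from the amdgpu kernel headers, and is not confirmed by the thread.
PACKET_TYPE3 = 3
PACKET3_WRITE_DATA = 0x37        # assumed WRITE_DATA opcode

ENGINE_ME = 0                    # assumed ENGINE_SEL value: backend engine (ME)
ENGINE_PFP = 1                   # assumed ENGINE_SEL value: frontend engine (PFP)

DST_SEL_MEMORY_ASYNC = 5         # assumed DST_SEL value for a direct memory write
WR_CONFIRM = 1 << 20             # assumed "wait for write confirmation" bit


def packet3_header(opcode, count):
    """Build a type-3 header; count is the number of body dwords minus one."""
    return (PACKET_TYPE3 << 30) | ((count & 0x3FFF) << 16) | ((opcode & 0xFF) << 8)


def write_data_packet(engine, va, value):
    """Return the dwords of a WRITE_DATA packet storing one 32-bit value at va."""
    control = (engine << 30) | (DST_SEL_MEMORY_ASYNC << 8) | WR_CONFIRM
    body = [
        control,
        va & 0xFFFFFFFC,          # low 32 bits of the address, dword aligned
        (va >> 32) & 0xFFFFFFFF,  # high 32 bits of the address
        value & 0xFFFFFFFF,       # the value to write
    ]
    return [packet3_header(PACKET3_WRITE_DATA, len(body) - 1)] + body


# The same write, executed on the frontend and on the backend engine.
pfp_packet = write_data_packet(ENGINE_PFP, 0x100001000, 0xCAFE)
me_packet = write_data_packet(ENGINE_ME, 0x100001000, 0xCAFE)
```

In this sketch the two packets differ only in the engine-select field of the control dword, which is what decides whether the frontend or the backend engine performs the write.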
On Wed, May 3, 2023 at 1:08 PM Marek Olšák wrote:
> GPU hangs are pretty common post-bringup. They are not common per user,
> but if we gather all hang
GPU hangs are pretty common post-bringup. They are not common per user, but
if we gather all hangs from all users, we can have lots and lots of them.
GPU hangs are indeed not very debuggable. There are however some things we
can do:
- Identify the hanging IB by its VA (the kernel should know it)
-
On 03.05.23 at 17:08, Felix Kuehling wrote:
On 2023-05-03 at 03:59, Christian König wrote:
On 02.05.23 at 20:41, Alex Deucher wrote:
On Tue, May 2, 2023 at 11:22 AM Timur Kristóf
wrote:
[SNIP]
In my opinion, the correct solution to those problems would be
if
the kernel could give userspac
On 2023-05-03 at 03:59, Christian König wrote:
On 02.05.23 at 20:41, Alex Deucher wrote:
On Tue, May 2, 2023 at 11:22 AM Timur Kristóf
wrote:
[SNIP]
In my opinion, the correct solution to those problems would be
if
the kernel could give userspace the necessary information about
a
GPU hang b
On 02.05.23 at 20:41, Alex Deucher wrote:
On Tue, May 2, 2023 at 11:22 AM Timur Kristóf wrote:
[SNIP]
In my opinion, the correct solution to those problems would be
if
the kernel could give userspace the necessary information about
a
GPU hang before a GPU reset.
The fundamental problem he
Hi,
On Tue, 2023-05-02 at 13:14 +0200, Christian König wrote:
> >
> > Christian König wrote (on Tue, May 2, 2023,
> > 9:59):
> >
> > > On 02.05.23 at 03:26, André Almeida wrote:
> > > > On 01/05/2023 16:24, Alex Deucher wrote:
> > > >> On Mon, May 1, 2023 at 2:58 PM André Almeid
On Tue, 2023-05-02 at 09:45 -0400, Alex Deucher wrote:
> On Tue, May 2, 2023 at 9:35 AM Timur Kristóf
> wrote:
> >
> > Hi,
> >
> > On Tue, 2023-05-02 at 13:14 +0200, Christian König wrote:
> > > >
> > > > Christian König wrote (on Tue, May 2, 2023,
> > > > 9:59):
> > > >
> >
Hi Christian,
Christian König wrote (on Tue, May 2, 2023,
9:59):
> On 02.05.23 at 03:26, André Almeida wrote:
> > On 01/05/2023 16:24, Alex Deucher wrote:
> >> On Mon, May 1, 2023 at 2:58 PM André Almeida
> >> wrote:
> >>>
> >>> I know that devcoredump is also used for this kind of
On Tue, May 2, 2023 at 11:22 AM Timur Kristóf wrote:
>
> On Tue, 2023-05-02 at 09:45 -0400, Alex Deucher wrote:
> > On Tue, May 2, 2023 at 9:35 AM Timur Kristóf
> > wrote:
> > >
> > > Hi,
> > >
> > > On Tue, 2023-05-02 at 13:14 +0200, Christian König wrote:
> > > > >
> > > > > Christian König ez
On Tue, May 2, 2023 at 9:35 AM Timur Kristóf wrote:
>
> Hi,
>
> On Tue, 2023-05-02 at 13:14 +0200, Christian König wrote:
> > >
> > > Christian König wrote (on Tue, May 2, 2023,
> > > 9:59):
> > >
> > > > On 02.05.23 at 03:26, André Almeida wrote:
> > > > > On 01/05/2023 16:24, Alex De
Hi Timur,
On 02.05.23 at 11:12, Timur Kristóf wrote:
Hi Christian,
Christian König wrote (on Tue, May 2, 2023,
9:59):
On 02.05.23 at 03:26, André Almeida wrote:
> On 01/05/2023 16:24, Alex Deucher wrote:
>> On Mon, May 1, 2023 at 2:58 PM André Almeida
>> w
On Tue, May 2, 2023 at 11:12 AM Timur Kristóf wrote:
>
> Hi Christian,
>
> Christian König wrote (on Tue, May 2, 2023,
> 9:59):
>>
>> On 02.05.23 at 03:26, André Almeida wrote:
>> > On 01/05/2023 16:24, Alex Deucher wrote:
>> >> On Mon, May 1, 2023 at 2:58 PM André Almeida
>> >> wr
On 02.05.23 at 03:26, André Almeida wrote:
On 01/05/2023 16:24, Alex Deucher wrote:
On Mon, May 1, 2023 at 2:58 PM André Almeida
wrote:
I know that devcoredump is also used for this kind of information,
but I believe
that using an IOCTL is better for interfacing Mesa + Linux rather
than
Well first of all don't expose the VMID to userspace.
The UMD doesn't know (and shouldn't know) which VMID is used for a
submission since this is dynamically assigned and can change at any time.
For debugging there is an interface to use a reserved VMID for your
debugged process which allows
On 01/05/2023 16:24, Alex Deucher wrote:
On Mon, May 1, 2023 at 2:58 PM André Almeida wrote:
I know that devcoredump is also used for this kind of information, but I believe
that using an IOCTL is better for interfacing Mesa + Linux rather than parsing
a file whose contents are subjected
On Mon, May 1, 2023 at 2:58 PM André Almeida wrote:
>
> Currently UMD hasn't much information on what went wrong during a GPU reset.
> To
> help with that, this patch proposes a new IOCTL that can be used to query
> information about the resources that caused the hang.
If we went with the IOCTL,
Currently UMD hasn't much information on what went wrong during a GPU reset. To
help with that, this patch proposes a new IOCTL that can be used to query
information about the resources that caused the hang.
The goal of this RFC is to gather feedback about this interface. The mesa part
can be foun