On Mon, Aug 5, 2024 at 11:05 PM Mikhail Gavrilov
wrote:
>
> Hi,
> After commit 1b04dcca4fb1, launching some RenPy games causes the computer to hang.
> After the hang, even Alt + sysrq + REISUB can't reboot the computer!
> And no trace in the kernel log!
> For demonstration, I&
On Sun, Aug 25, 2024 at 2:12 AM Mikhail Gavrilov
wrote:
>
> Hi,
> Is anyone trying to look into it?
> I continue to reproduce this issue on fresh kernel builds 6.11-rc4+.
> In addition to the RenPy engine, the problem also reproduces on games
> from Ubisoft, such as Far Cry 4.
On Wed, Sep 4, 2024 at 4:15 AM Leo Li wrote:
> Hi Mike,
>
> Super sorry for the ridiculous wait. Your first two emails slipped by my
> inbox, which is really silly, given I'm first in the to field...
>
> Thanks for bisecting and finding a free game to reproduce it on. I did not
> have luck r
tch was definitely not enough.
Tested-by: Mikhail Gavrilov
--
Best Regards,
Mike Gavrilov.
On Sat, Sep 7, 2024 at 12:47 AM Leo Li wrote:
>
>
> Hi Mikhail,
>
> I've tried to align my system with yours as best as I can, but so far, I've
> had no luck reproducing the hang. A video of what I'm doing:
> https://youtu.be/VeD-LPCnfWM?si=b2baF8MyDBuU4jRH
> (Under the hood, the W7900 and 7900
tely hangs without
any messages in kernel logs.
On Wed, Sep 11, 2024 at 2:11 AM Leo Li wrote:
>
> Hi Mikhail,
>
> Can you give this patch a try to see if it helps?
> https://gist.github.com/leeonadoh/3271e90ec95d768424c572c970ada743
>
Thanks, with this patch, the issue is not r
On Sun, May 26, 2024 at 7:06 PM Mikhail Gavrilov
wrote:
>
> Hi,
> The day before yesterday I replaced the 7900XTX with a 6900XT to find out in
> which kernel the following warning first appeared: "DMA-API: amdgpu
> :0f:00.0: cacheline tracking EEXIST, overlapping mappings aren
On Fri, May 17, 2024 at 8:59 PM Mikhail Gavrilov
wrote:
>
> Thanks, Arun.
> With the patch this error did not appear anymore.
> Tested-by: Mikhail Gavrilov on 7900XTX
> hardware.
>
I see that this patch does the same thing, but more correctly:
https://gitlab.freedesktop.org
On Fri, Jun 7, 2024 at 6:39 PM Alex Deucher wrote:
>
> --- a/drivers/gpu/drm/amd/display/dc/optc/dcn10/dcn10_optc.c
> +++ b/drivers/gpu/drm/amd/display/dc/optc/dcn10/dcn10_optc.c
> @@ -944,7 +944,7 @@ void optc1_set_drr(
> OTG_V_TOTAL_MAX_SEL, 1,
>
On Fri, Jun 7, 2024 at 5:29 PM Linux regression tracking (Thorsten
Leemhuis) wrote:
>
> [CCing the other amd drm maintainers]
>
> Mikhail: are those details in any way relevant? Then in the future best
> leave them out (or make things easier to follow), they make the bug
> report confusing and sou
On Fri, Jun 21, 2024 at 12:56 PM Linux regression tracking (Thorsten
Leemhuis) wrote:
> Hmmm, I might have missed something, but it looks like nothing happened
> here since then. What's the status? Is the issue still happening?
Yes. Tested on e5b3efbe1ab1.
I spotted that the problem disappears a
On Sat, Jun 29, 2024 at 9:46 PM Rodrigo Siqueira Jordao
wrote:
> Hi Mikhail,
>
> I'm trying to reproduce this issue, but until now, I've been unable to
> reproduce it. I tried some different scenarios with the following
> components:
>
> 1. Displays: I tried with one and two displays
> - 4k@120
On Tue, Jul 9, 2024 at 7:48 PM Rodrigo Siqueira Jordao
wrote:
> Hi,
>
> I also tried it with 6900XT. I got the same results on my side.
This is weird.
> Anyway, I could not reproduce the issue with the below components. I may
> be missing something that will trigger this bug; in this sense, coul
On Wed, Jul 10, 2024 at 12:01 PM Mikhail Gavrilov
wrote:
>
> On Tue, Jul 9, 2024 at 7:48 PM Rodrigo Siqueira Jordao
> wrote:
> > Hi,
> >
> > I also tried it with 6900XT. I got the same results on my side.
>
> This is weird.
>
> > Anyway, I could not rep
On Tue, Jul 16, 2024 at 10:10 PM Alex Deucher wrote:
>
> Does the attached partial revert fix it?
>
> Alex
>
Yes, thanks.
Tested-by: Mikhail Gavrilov
--
Best Regards,
Mike Gavrilov.
On Tue, Jul 23, 2024 at 2:34 AM Alex Deucher wrote:
> Do either of these patches help?
> https://patchwork.freedesktop.org/patch/605437/
Unfortunately, this patch didn't help. Please see the attached kernel log.
> https://patchwork.freedesktop.org/patch/605201/
For which kernel is this patch int
On Wed, Jul 24, 2024 at 10:16 PM Mikhail Gavrilov
wrote:
> > https://patchwork.freedesktop.org/patch/605201/
> For which kernel is this patch intended? The patch is not applied on
> top of d67978318827.
I am able to apply this patch on top of e4fc196f5ba3 and the issue is gone
Hi,
After commit 1b04dcca4fb1, launching some RenPy games causes the computer to hang.
After the hang, even Alt + sysrq + REISUB can't reboot the computer!
And no trace in the kernel log!
For demonstration, I'm going to use the game "Find the Orange Narwhal"
because it is free and has 100% reproducibility f
Thanks, Arun.
With the patch this error did not appear anymore.
Tested-by: Mikhail Gavrilov on 7900XTX hardware.
--
Best Regards,
Mike Gavrilov.
Hi,
The day before yesterday I replaced the 7900XTX with a 6900XT to find out in
which kernel the following warning first appeared: "DMA-API: amdgpu
:0f:00.0: cacheline tracking EEXIST, overlapping mappings aren't
supported".
Kernel 6.3 and older won't boot on a computer with a Radeon 7900XTX.
When I boo
Hi,
The 6.7-rc1 kernel came out yesterday.
And surprisingly, it turned out not to work with my LG C3.
I use this OLED TV as my primary monitor.
After logging in to GNOME I see a horizontal flashing bar with a picture of
the desktop background on a white screen.
Demonstration: https://youtu.be/7F76VfRkrVo
On Tue, Nov 14, 2023 at 11:03 PM Mikhail Gavrilov
wrote:
>
> On Tue, Nov 14, 2023 at 3:55 PM Mikhail Gavrilov
> wrote:
> >
> > Hi,
> > The 6.7-rc1 kernel came out yesterday.
> > And surprisingly, it turned out not to work with my LG C3.
> > I use this O
On Wed, Nov 15, 2023 at 11:14 PM Hamza Mahfooz wrote:
>
> What version of DMUB firmware are you on?
> The easiest way to find out would be using the following:
>
> # dmesg | grep DMUB
>
Sapphire AMD Radeon RX 7900 XTX PULSE OC:
❯ dmesg | grep DMUB
[ 14.341362] [drm] Loading DMUB firmware via PS
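The firmware check suggested above can be wrapped in a small helper so that only the version number is printed. This is a minimal sketch, not anything posted in the thread: the function name `dmub_version` is hypothetical, and it assumes the kernel log line ends in `version=0x...` as in the DMUB lines quoted in this thread.

```shell
# Hypothetical helper: print the DMUB firmware version(s) found in kernel
# log text supplied on stdin. Assumes lines of the form
#   [drm] Loading DMUB firmware via PSP: version=0x...
dmub_version() {
    grep 'DMUB' | sed -n 's/.*version=\(0x[0-9a-fA-F]*\).*/\1/p'
}

# Usage on a live system (reading the kernel log may require root):
#   dmesg | dmub_version
```

Reading from stdin keeps the helper usable on a saved log file as well as on live `dmesg` output.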
On Wed, Nov 15, 2023 at 11:39 PM Lee, Alvin wrote:
>
> This change has a DMCUB dependency - are you able to update your DMCUB
> version as well?
>
I can confirm the issue is gone after updating the firmware.
❯ dmesg | grep DMUB
[ 11.496679] [drm] Loading DMUB firmware via PSP: version=0x0700230
he first one patch is enough.
Tested-on: 7900XTX, 6900XT and 6800M.
Tested-by: Mikhail Gavrilov
--
Best Regards,
Mike Gavrilov.
On Tue, Feb 28, 2023 at 5:43 PM Christian König
wrote:
>
> The point is it doesn't need to talk to the amdgpu hardware. What it
> does is that it talks to the good old VGA/VESA emulation and that just
> happens to be still enabled by the BIOS/GRUB.
>
> And that VGA/VESA emulation doesn't need any
On Fri, Dec 15, 2023 at 9:14 PM Hamza Mahfooz wrote:
>
> Can you try the following patch with old fw (version 0x07002100 should
> be fine)?: https://patchwork.freedesktop.org/patch/572298/
>
Tested-by: Mikhail Gavrilov on 7900XTX hardware.
Can I ask?
What does SubVP actually d
On Fri, Dec 15, 2023 at 5:37 PM Christian König
wrote:
>
> I have no idea :)
>
> From the logs I can see that the AMDGPU now has the proper BARs assigned:
>
> [5.722015] pci :03:00.0: [1002:73df] type 00 class 0x038000
> [5.722051] pci :03:00.0: reg 0x10: [mem
> 0xf8-0xfbf
On Wed, Jan 24, 2024 at 7:19 AM Mikhail Gavrilov
wrote:
>
> Who could dig into it, please?
You decided to revert it?
https://lkml.org/lkml/2024/1/22/1866
Also I forgot to attach the kernel build .config in the previous
message. I'm going to fix it here.
It may be useful for reprodu
On Wed, 14 Apr 2021 at 11:48, Christian König <
ckoenig.leichtzumer...@gmail.com> wrote:
>
> That is expected behavior, the application is just buggy and causing a
> page fault on the GPU.
>
> The kernel should just not crash with a backtrace.
>
> Regards,
> Christian.
>
If after it GPU hangs wit
On Wed, 15 Sept 2021 at 14:55, Christian König wrote:
>
> Yes, absolutely. You should see GPU resets and recovery in the system log
> after that.
Unfortunately, no DE survives a GPU reset. All applications
terminate abnormally; in fact, this is equivalent to a reboot
(and denial
Hi!
During the 5.12 testing cycle I observed a repeatable bug when
launching graphics-heavy applications.
The kernel log is flooded with the message "Unexpected multihop in
swaput - likely driver bug.".
Trace:
[ 8707.814899] [ cut here ]
[ 8707.814920] Unexpected multihop
On Wed, 7 Apr 2021 at 15:46, Christian König
wrote:
>
> What hardware are you using
$ inxi -bM
System:Host: fedora Kernel: 5.12.0-0.rc6.184.fc35.x86_64+debug
x86_64 bits: 64 Desktop: GNOME 40.0
Distro: Fedora release 35 (Rawhide)
Machine: Type: Desktop Mobo: ASUSTeK model: ROG ST
Video demonstration: https://youtu.be/3nkvUeB0GSw
What the kernel traces look like:
1.
[ 7315.156460] amdgpu :0b:00.0: amdgpu: [mmhub] page fault
(src_id:0 ring:0 vmid:6 pasid:32779, for process obs pid 23963 thread
obs:cs0 pid 23977)
[ 7315.156490] amdgpu :0b:00.0: amdgpu: in page starting at
a
On Tue, 13 Apr 2021 at 12:29, Christian König wrote:
>
> Hi Mikhail,
>
> the crash is a known issue and should be fixed by:
>
> commit f63da9ae7584280582cbc834b20cc18bfb203b14
> Author: Philip Yang
> Date: Thu Apr 1 00:22:23 2021 -0400
>
> drm/amdgpu: reserve fence slot to update page tabl
On Tue, 13 Apr 2021 at 04:55, Leo Liu wrote:
>
> >It is curious that ffmpeg does not cause such issues.
> >For example, this command does not cause a kernel panic:
> >$ ffmpeg -f x11grab -framerate 60 -video_size 3840x2160 -i :0.0 -vf
> >'format=nv12,hwupload' -vaapi_device /dev/dri/renderD128 -vcodec
> >h264
On Wed, 14 Apr 2021 at 03:22, Leo Liu wrote:
>
> This is decode command line, are you seeing issue with encode or
> decode?
I meant that the kernel panic described above happens only when
OBS records or streams video with the VAAPI encoder.
Grabbing and encoding video with ffmpeg (given command exa
On Wed, 14 Apr 2021 at 11:48, Christian König
wrote:
>
> >> commit f63da9ae7584280582cbc834b20cc18bfb203b14
> >> Author: Philip Yang
> >> Date: Thu Apr 1 00:22:23 2021 -0400
> >>
> >> drm/amdgpu: reserve fence slot to update page table
> >>
>
> That is expected behavior, the application i
On Wed, 21 Apr 2021 at 11:42, Christian König wrote:
> I can try, but I'm not sure if we even have the full page fault handling
> for Navi in 5.12.
>
It would be great. For me this patch is working as expected, and for
several days I haven't seen the panic "kernel BUG at
drivers/dma-buf/dma-
Hi folks.
I have observed this issue since 5.3 and it still happens with 5.10 git.
This warning reproduces 100% reliably when I launch
"Wolfenstein: Youngblood"; the Mesa version doesn't matter.
[73690.883948] [ cut here ]
[73690.883953] DEBUG_LOCKS_WARN_ON(ww_ctx->contend
Hi Christian,
On Tue, 12 Jan 2021 at 01:45, Christian König wrote:
>
> Hi Mike,
>
> Unfortunately not, that's DC stuff. Easiest is to assign this as a bug
> tracker to our DC team.
Ok
> At least some progress. Any objections that I add your e-mail address as
> tested-by tag?
Yes, feel free add m
On Tue, 12 Jan 2021 at 01:45, Christian König wrote:
>
> But what you have in your logs so far are only unrelated symptoms, the
> root of the problem is that somebody is leaking memory.
>
> What you could do as well is to try to enable kmemleak
I captured some memleaks.
Do they contain any useful
On Thu, 14 Jan 2021 at 18:56, Christian König wrote:
> Unfortunately not offhand.
>
> I also don't see any bug reports from other people and can't reproduce
> the last backtrace you send out TTM here.
Because only the most desperate will install kernels with enabled
debug flags and then load the
On Fri, 15 Jan 2021 at 03:43, Mikhail Gavrilov
wrote:
>
In rc4, the number of warnings has dropped dramatically.
No more errors "kasan slab-out-of-bounds" and no "DMA-API device
driver failed to check map error".
But still not fixed "sleeping function called from inva
On Thu, 21 Jan 2021 at 18:27, Christian König wrote:
>
> I still have no idea what's going on here.
>
> The KASAN messages from the DC code are completely unrelated.
>
> Please add the full dmesg to your bug report.
>
I did it.
https://gitlab.freedesktop.org/drm/amd/-/issues/1439#note_776267
--
The 5.11-rc5 (git 76c057c84d28) brought a new issue.
Now the kernel log is flooded with the message "page allocation failure".
Trace:
msedge:cs0: page allocation failure: order:10,
mode:0x190cc2(GFP_HIGHUSER|__GFP_NORETRY|__GFP_NOMEMALLOC),
nodemask=(null),cpuset=/,mems_allowed=0
CPU: 18 PID: 4540
On Sun, 31 Jan 2021 at 22:22, Christian König
wrote:
>
>
> Yeah, known issue. I already pushed Michel's fix to drm-misc-fixes.
> Should land in the next -rc by the weekend.
>
> Regards,
> Christian.
I checked this patch [1] for several days.
And I can confirm that the reported issue was gone.
[1
On Mon, 8 Feb 2021 at 14:18, Christian König
wrote:
>
> Are the other problems gone as well?
>
Yes and no.
The issue with the monitor turning off was gone after rc6 (git 3aaf0a27ffc2)
But both traces
1) BUG: sleeping function called from invalid context at
include/linux/sched/mm.h:196 (kernel 5.11 s
Hi folks.
I observed a hard-to-reproduce set of bugs.
It always starts with:
1) kworker/u64:2: page allocation failure: order:5,
mode:0x40dc0(GFP_KERNEL|__GFP_COMP|__GFP_ZERO),
nodemask=(null),cpuset=/,mems_allowed=0
It continues with:
2) WARNING: CPU: 21 PID: 806649 at
drivers/gpu/drm/amd/amdgpu/../d
On Sun, 27 Dec 2020 at 21:39, Mikhail Gavrilov
wrote:
> I suppose the root cause of my problem is here:
>
> [3.961326] amdgpu :0b:00.0: Direct firmware load for
> amdgpu/sienna_cichlid_sos.bin failed with error -2
> [3.961359] amdgpu :0b:00.0: amdgpu: failed to in
On Tue, 29 Dec 2020 at 20:15, Deucher, Alexander
wrote:
>
> It looks like the driver is not able to access the firmware for some reason.
> Please make sure it is available in your initrd or compiled into the kernel
> depending on your config.
Exactly! Thanks!
# lsinitrd
/boot/initramfs-5.10.
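Checking whether a given firmware file made it into the initramfs, as advised above, can be sketched like this. The helper name `initrd_has_firmware` is hypothetical; it only searches an initramfs listing supplied on stdin, and the usage line assumes that `lsinitrd` without an image argument lists the initramfs of the running kernel.

```shell
# Hypothetical helper: succeed if the initramfs listing on stdin mentions
# the firmware file given as $1 (fixed-string match, so a compressed
# variant such as "<name>.xz" also counts).
initrd_has_firmware() {
    grep -q -F -- "$1"
}

# Usage (as root), with the firmware file named in the error message above:
#   lsinitrd | initrd_has_firmware amdgpu/sienna_cichlid_sos.bin \
#       && echo "firmware present" || echo "firmware missing"
```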
Hi folks!
I started to see this message on every boot after replacing the Radeon VII with a 6900XT.
$ journalctl | grep "BUG: key"
Dec 31 05:19:42 localhost.localdomain kernel: BUG: key
98b59ab01148 has not been registered!
Dec 31 05:25:44 localhost.localdomain kernel: BUG: key
8d425ba01148 has not b
Hi folks,
today I joined the testing of kernel 5.11 and saw that the kernel log was
flooded with BUG messages:
BUG: sleeping function called from invalid context at mm/vmalloc.c:1756
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 266, name: kswapd0
INFO: lockdep is turned off.
CPU: 15 PID: 266
On Mon, 11 Jan 2021 at 19:01, Christian König wrote:
> Changing the page table attributes while releasing memory might sleep.
> So we can't use a spinlock here.
>
> Thanks for the report, a patch to fix this is on the mailing list now.
Can you also look at the first trace?
Here a same error message
On Mon, Nov 6, 2023 at 8:29 PM Alex Deucher wrote:
>
> Already fixed in this commit:
> https://gitlab.freedesktop.org/agd5f/linux/-/commit/d1d4c0b7b65b7fab2bc6f97af9e823b1c42ccdb0
> Which is included in last week's PR.
>
Thanks, it fixed the issue above.
But, unfortunately this is not the only
On Wed, Nov 8, 2023 at 12:12 AM Alex Deucher wrote:
>
> The attached patch should fix it. Not sure why your GPU shows up as
> busy. The AGP aperture was just disabled.
Tested-by: Mikhail Gavrilov
Thanks, after applying the patch GPU loading meets expectations.
Games are working so ov
Hi!
Unfortunately the use-after-free issue still happens on the 6.0-rc5 kernel.
The issue became hard to reproduce. I had spent the whole day at the computer
when the use-after-free happened again; I was playing the game Tiny Tina's
Wonderlands.
Therefore, forget about repeatability. It remains only to hope f
Hi!
The hangs occur randomly, but I found a good reproduction scenario
(running the campaign in the game Halo Infinite).
The backtrace looks like this:
[ 147.260971] BUG: kernel NULL pointer dereference, address: 0088
[ 147.260987] [ cut here ]
[ 147.
Hi!
I bisected an issue of the 6.0 kernel which started happening after
6.0-rc7 on all my machines.
The backtrace of this issue looks like this:
[ 2807.339439] [ cut here ]
[ 2807.339445] WARNING: CPU: 11 PID: 2061 at
drivers/gpu/drm/drm_modeset_lock.c:276
drm_modeset_drop_locks
On Wed, May 11, 2022 at 5:01 PM Christian König
wrote:
>
>
> We have implemented a workaround, but still don't know the exact root cause.
>
> If anybody wants to look into this it would be rather helpful to be able
> to reproduce the issue.
>
> Regards,
> Christian.
I see that issue was returned
Hi!
I found that some games (Cyberpunk 2077, Forza Horizon 4/5) hang at
start after commit dd80d9c8eecac8c516da5b240d01a35660ba6cb6.
dd80d9c8eecac8c516da5b240d01a35660ba6cb6 is the first bad commit
commit dd80d9c8eecac8c516da5b240d01a35660ba6cb6
Author: Christian König
Date: Thu Jul 14 10:23:38
On Fri, Oct 21, 2022 at 1:33 PM Christian König
wrote:
>
> Hi,
>
> yes Bas already reported this issue, but I couldn't reproduce it. Need
> to come up with a patch to narrow this down further.
>
> Can I send you something to test?
I would be glad to test any patches and ideas.
--
Best Regard
On Wed, Oct 26, 2022 at 12:29 PM Christian König
wrote:
>
> Attached is the original test patch rebased on current amd-staging-drm-next.
>
> Can you test if this is enough to make sure that the games start without
> crashing by fetching the userptrs?
1. Over the past week the list of games affect
On Tue, Nov 1, 2022 at 10:52 PM Christian König
wrote:
>
> Let's focus on one problem at a time.
>
> The issue here is that somehow userptr handling became racy after we
> removed the lock, but I don't see why.
>
> We need to fix this ASAP since it is probably a much wider problem and
> the additi
Hi,
Between commits ed4643521e6a and 34af78c4e616 something was broken.
I noticed that the kernel log is flooded with the warning message "WARNING: CPU: 31
PID: 51848 at drivers/dma-buf/dma-fence-array.c:191
dma_fence_array_create+0x101/0x120" when some games are running:
"Resident Evil Village", "Marvel's Aven
Hi Christian
> those are two independent and already known problems.
>
> The warning triggered from the sync_file is already fixed in
> drm-misc-next-fixes, but so far I couldn't figure out why the games
> suddenly doesn't work any more.
I thought that these warnings are related to the stuck of t
On Fri, 8 Apr 2022 at 16:13, Christian König wrote:
> I owe you a beer.
>
> I still don't know what happens here, but that makes at least a bit more
> sense than a patch which only changes comments :)
>
> Looks like we are missing something here. Can I send you a patch to try
> something later to
ers/dma-buf/dma-fence-array.c:191
dma_fence_array_create+0x101/0x120" has gone.
Thanks.
Tested-by: Mikhail Gavrilov
--
Best Regards,
Mike Gavrilov.
On Sat, Apr 9, 2022 at 7:27 PM Christian König
wrote:
>
> That's unfortunately not the end of the story.
>
> This is fixing your problem, but reintroducing the original problem that
> we call the syncobj with a lock held which can crash badly as well.
>
> Going to take a closer look on Monday. I h
Hi guys.
Between commits fdaf9a5840ac and babf0bb978e3 the GPU stopped entering
graphics mode; instead I see a black screen with a constantly glowing
cursor. Demonstration: https://youtu.be/rGL4LsHMae4
In the kernel logs there are references to hung processes:
[ 149.363465] rfkill: input handler disabled
On Tue, Jun 28, 2022 at 2:21 PM Mikhail Gavrilov
wrote:
>
Christian, can you look into why
drm_aperture_remove_conflicting_pci_framebuffers causes this kernel bug
on my machine?
[6.822385] amdgpu: Ignoring ACPI CRAT on non-APU system
[6.822462] amdgpu: Virtual CRAT table created for
On Thu, Jul 7, 2022 at 2:50 PM Christian König
wrote:
>
> Am 07.07.22 um 02:20 schrieb Mikhail Gavrilov:
> > On Tue, Jun 28, 2022 at 2:21 PM Mikhail Gavrilov
> > wrote:
> > Christian can you look why
> > drm_aperture_remove_conflicting_pci_framebuffers cause thi
On Sat, Jul 9, 2022 at 5:10 PM Mikhail Gavrilov
wrote:
> Hi Christian,
> if you read my initial post, you will see that I tried to bisect the issue.
> But that is very problematic because at each step I see different symptoms.
> And if I mark the steps with different symptoms as skip, we get a
Hi guys, I continue testing 5.19-rc7 and found a bug.
The command "clinfo" causes "BUG: kernel NULL pointer dereference, address:
0008" in the amdgpu driver.
Here is the trace:
[ 1320.203332] BUG: kernel NULL pointer dereference, address: 0008
[ 1320.203338] #PF: supervisor read access
On Wed, Jul 13, 2022 at 5:38 PM Mikhail Gavrilov
wrote:
> # first bad commit: [9cbbd694a58bdf24def2462276514c90cab7cf80] Merge
> drm/drm-next into drm-misc-next
>
Don't know who to thank but the issue disappeared in 5.19 rc7.
--
Best Regards,
Mike Gavrilov.
On Tue, Jul 19, 2022 at 1:40 PM Mike Lothian wrote:
>
> I was told that this patch replaces the patch you mentioned
> https://patchwork.freedesktop.org/series/106078/ and it's the one
> that'll hopefully land in Linus's tree
>
Great, I confirm that both patches solve the issue.
As I understand the
On Tue, Jul 19, 2022 at 4:26 PM Mikhail Gavrilov
wrote:
> In the kernel log there is no error, so it is most likely a user space issue,
> but I am not sure about it.
But I am confused by the message in the kernel log:
[ 1962.000909] amdgpu: HIQ MQD's queue_doorbell_id0 i
Hi folks.
Joined testing 5.20 today (7ebfc85e2cd7).
I encountered frequent GPU freezes, after which a message appears
in the kernel logs:
[ 220.280990] [ cut here ]
[ 220.281000] refcount_t: underflow; use-after-free.
[ 220.281019] WARNING: CPU: 1 PID: 3746 at lib/refcoun
On Mon, Aug 15, 2022 at 5:20 AM Maíra Canal wrote:
>
> Hi Mikhail
>
> Looks like this use-after-free problem was introduced on
> 90af0ca047f3049c4b46e902f432ad6ef1e2ded6. Checking this patch it seems
> like: if amdgpu_cs_vm_handling return r != 0, then it will unlock
> bo_list_mutex inside the fun
On Mon, Aug 15, 2022 at 3:37 PM Mikhail Gavrilov
wrote:
>
> Thanks, I tested this patch.
> But with this patch the use-after-free problem happens in another place:
Does anyone have an idea why the second use-after-free happened?
From the trace I don't understand which code is
On Wed, Aug 17, 2022 at 9:08 PM Melissa Wen wrote:
>
> Hi Mikhail,
>
> IIUC, you got this second use-after-free by applying the first version
> of Maíra's patch, right? So, that version was adding another unbalanced
> unlock to the cs ioctl flow, but it was solved in the latest version,
> that yo
On Wed, Aug 17, 2022 at 11:43 PM Maíra Canal wrote:
>
> Hi Mikhail,
>
> Looks like 45ecaea738830b9d521c93520c8f201359dcbd95 ("drm/sched: Partial
> revert of 'drm/sched: Keep s_fence->parent pointer'") introduced the
> error. Try reverting it and check if the use-after-free still happens.
Thanks,
On Fri, Aug 19, 2022 at 5:13 PM Maíra Canal wrote:
>
> Hi Mikhail,
>
> Could you please specify the steps to reproduce this use-after-free? I
> will try to reproduce it on the RX5700 XT and bisect the issue.
>
Hi Maíra, thanks for help.
I'm afraid that it will be unrealistic to reproduce, becaus
9be6ebc61e will stop these warnings. I
also attached fresh logs from 6.2.0-0.rc6.
6.2-rc7 I started to build without commit
b261509952bc19d1012cf732f853659be6ebc61e to avoid these warnings.
On Thu, Oct 13, 2022 at 6:36 PM Mikhail Gavrilov
>
> Hi!
> I bisected an issue of the 6.0 kernel whic
drop_locks no longer appears anymore.
I hope this patch will have time to be merged in 6.2 before release.
Tested-by: Mikhail Gavrilov
--
Best Regards,
Mike Gavrilov.
On Fri, Dec 9, 2022 at 7:37 PM Leo Liu wrote:
>
> Please try the latest AMDGPU driver:
>
> https://gitlab.freedesktop.org/agd5f/linux/-/commits/amd-staging-drm-next/
>
Sorry Leo, I missed your message.
This issue is still relevant for 6.2-rc8.
In my first message I was mistaken.
> Before kernel 5.1
On Fri, Feb 17, 2023 at 8:30 PM Alex Deucher wrote:
>
> On Fri, Feb 17, 2023 at 1:10 AM Mikhail Gavrilov
> wrote:
> >
> > On Fri, Dec 9, 2022 at 7:37 PM Leo Liu wrote:
> > >
> > > Please try the latest AMDGPU driver:
> > >
> > > https:/
Hi,
I have an ASUS ROG Strix G15 Advantage Edition G513QY-HQ007 laptop, but
it is impossible to use without AC power because the system loses the NVMe
drive when I disconnect the power adapter.
Messages from kernel log when it happens:
nvme nvme0: controller is down; will reset: CSTS=0x, PCI_STATUS=0
On Fri, Feb 24, 2023 at 12:13 PM Christian König
wrote:
>
> Hi Mikhail,
>
> this is pretty clearly a problem with the system and/or its BIOS and
> not the GPU hw or the driver.
>
> The option pci=nocrs makes the kernel ignore additional resource windows
> the BIOS reports through ACPI. This then
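To see whether a machine was actually booted with the option discussed above, the kernel command line can be inspected. This is a minimal sketch with a hypothetical helper name, `cmdline_has_option`; it only matches a standalone, space-separated token, so a comma-combined form such as `pci=nocrs,...` would need extra handling.

```shell
# Hypothetical helper: succeed if the kernel command line on stdin contains
# the exact space-separated option given as $1.
cmdline_has_option() {
    tr ' ' '\n' | grep -q -x -F -- "$1"
}

# Usage on a running system:
#   cmdline_has_option pci=nocrs < /proc/cmdline && echo "pci=nocrs is set"
```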
On Fri, Feb 24, 2023 at 8:31 PM Christian König
wrote:
>
> Sorry I totally missed that you attached the full dmesg to your original
> mail.
>
> Yeah, the driver did fail gracefully. But then X doesn't come up and
> then gdm just dies.
Are you sure that these messages should be present when the dr
On Mon, Feb 27, 2023 at 3:22 PM Christian König
>
> Unfortunately yes. We could clean that up a bit more so that you don't
> run into a BUG() assertion, but what essentially happens here is that we
> completely fail to talk to the hardware.
>
> In this situation we can't even re-enable vesa or text
Hi,
I haven't run into the drm_bridge_hpd_enable+0x94/0x9c [drm] issue myself, but
the fix for it leads to warning messages on my laptop, an ASUS ROG
Strix G15 Advantage Edition G513QY-HQ007, which has two AMD GPUs:
a discrete Radeon 6800M and a Vega 8 integrated into the Cezanne CPU.
I found the bad commit by bisecting:
❯ git
On Tue, Mar 21, 2023 at 11:47 PM Christian König
wrote:
>
> Hi Mikhail,
>
> That looks like a reference counting issue to me.
>
> I'm going to take a look, but we have already fixed one of those recently.
>
> Probably best that you try this on drm-fixes, just to double check that
> this isn't the
On Fri, Mar 24, 2023 at 7:37 PM Christian König
wrote:
>
> Yeah, that one
>
> Thanks for the info, looks like this isn't fixed.
>
> Christian.
>
Hi,
glad to see that "BUG: KASAN: slab-use-after-free in
drm_sched_get_cleanup_job+0x47b/0x5c0" was fixed in 6.3-rc5.
For history it would be good to kn
On Tue, Apr 11, 2023 at 10:40 PM Mikhail Gavrilov
wrote:
>
> Hi,
> KASAN continues to find problems in the drm_sched_job_cleanup code at 6.3rc6.
> I haven't got any feedback in the thread
> https://lore.kernel.org/lkml/cabxgcsmvub2ra4d+k5cna0_2521tox++d4nmoukki4x2-q_...@mail.gmail.com/
Christian?
❯ /usr/src/kernels/6.3.0-0.rc7.56.fc39.x86_64/scripts/faddr2line
/lib/debug/lib/modules/6.3.0-0.rc7.56.fc39.x86_64/kernel/drivers/gpu/drm/scheduler/gpu-sched.ko.debug
drm_sched_job_cleanup+0x9a
drm_sched_job_cleanup+0x9a/0x130:
drm_sched_job_cleanup at
/usr/src/debug/kernel-6.3-rc7/linu
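When a trace contains many frames, the `func+offset/size` tokens that `faddr2line` takes as arguments can be pulled out of the log first. This is a rough sketch: `trace_symbols` is a hypothetical helper name, and the pattern assumes frames are printed in the `name+0x.../0x...` form seen in the traces in this thread.

```shell
# Hypothetical helper: extract unique "func+0xoff/0xsize" tokens from
# kernel trace text on stdin, one per line.
trace_symbols() {
    grep -o '[A-Za-z_][A-Za-z0-9_.]*+0x[0-9a-f]*/0x[0-9a-f]*' | sort -u
}

# Usage (the object path is a placeholder for the module's debug file):
#   dmesg | trace_symbols | while read -r sym; do
#       scripts/faddr2line path/to/gpu-sched.ko.debug "$sym"
#   done
```

Symbols from other modules will not resolve against a single object file, so in practice the list usually needs filtering by module first.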
On Wed, Apr 19, 2023 at 1:12 PM Christian König
wrote:
>
> I'm already looking into this, but can't figure out why we run into
> problems here.
>
> What happens is that a CS is aborted without sending the job to the
> scheduler and in this case the cleanup function doesn't seem to work.
>
> Christ
On Thu, Apr 20, 2023 at 2:59 PM Christian König
wrote:
>
> Could you try drm-misc-next as well?
>
> Going to give drm-fixes another round of testing.
>
> Thanks,
> Christian.
It is important not to give up.
https://youtu.be/25zhHBGIHJ8 [40 min]
https://youtu.be/utnDR26eYBY [50 min]
https://youtu.be/DJQ_
On Thu, Apr 20, 2023 at 2:59 PM Christian König
wrote:
> Could you try drm-misc-next as well?
If, as I assume, I cloned the right repo
$ git clone -b drm-misc-next
git://anongit.freedesktop.org/drm/drm-misc linux-drm-misc-next
then for my hardware the last commit on this branch turned out completely unworkin
On Thu, Apr 20, 2023 at 3:32 PM Mikhail Gavrilov
wrote:
>
> It is important not to give up.
> https://youtu.be/25zhHBGIHJ8 [40 min]
> https://youtu.be/utnDR26eYBY [50 min]
> https://youtu.be/DJQ_tiimW6g [12 min]
> https://youtu.be/Y6AH1oJKivA [6 min]
> Yes the issue is everyth