On 4/22/25 17:59, Alexey Klimov wrote:
> On Tue Apr 22, 2025 at 2:00 PM BST, Alex Deucher wrote:
>> On Mon, Apr 21, 2025 at 10:21 PM Alexey Klimov <alexey.kli...@linaro.org>
>> wrote:
>>>
>>> On Thu Apr 17, 2025 at 2:08 PM BST, Alex Deucher wrote:
>>>> On Wed, Apr 16, 2025 at 8:43 PM Fugang Duan <fugang.d...@cixtech.com>
>>>> wrote:
>>>>>
>>>>> From: Alex Deucher <alexdeuc...@gmail.com>
>>>>> Sent: April 16, 2025 22:49
>>>>>> To: Alexey Klimov <alexey.kli...@linaro.org>
>>>>>> On Wed, Apr 16, 2025 at 9:48 AM Alexey Klimov <alexey.kli...@linaro.org>
>>>>>> wrote:
>>>>>>>
>>>>>>> On Wed Apr 16, 2025 at 4:12 AM BST, Fugang Duan wrote:
>>>>>>>> From: Alexey Klimov <alexey.kli...@linaro.org>
>>>>>>>> Sent: April 16, 2025 2:28
>>>>>>>>> #regzbot introduced: v6.12..v6.13
>>>>>>>>> The only change related to hdp_v5_0_flush_hdp() was
>>>>>>>>> cf424020e040 drm/amdgpu/hdp5.0: do a posting read when flushing HDP
>>>>>>>>>
>>>>>>>>> Reverting that commit ^^ did help and resolved that problem. Before
>
> [..]
>
>>>> OK. that patch won't change anything then. Can you try this patch
>>>> instead?
>>>
>>> Config I am using is basically defconfig wrt memory parameters, yeah, i use
>>> 4k.
>>>
>>> So I tested that patch, thank you, and some other different configurations
>>> --
>>> nothing helped. Exactly the same behaviour with the same backtrace.
>>
>> Did you test the first (4k check) or the second (don't remap on ARM) patch?
>
> The second one. I think you mentioned that first one won't help for 4k pages.
>
>
>>> So it seems that it is firmware problem after all?
>>
>> There is no GPU firmware involved in this operation. It's just a
>> posted write. E.g., we write to a register to flush the HDP write
>> queue and then read the register back to make sure the write posted.
>> If the second patch didn't help, then perhaps there is some issue with
>> MMIO access on your platform?
>
> I didn't mean GPU firmware at all. I only had uefi/EL3 firmwares in mind.
>
> Completely out of the blue, based on nothing, do you think that
> adding delay/some mem barrier between write and read might help?

That would still be quite some platform bug.

> I wonder if host data path code should be executed during common desktop
> usage as a common user then why it doesn't break later.

Maybe it's some kind of write/read re-ordering issue. But yeah, I also
think this is this motherboard problem. Thank you.

You should probably ping some ARM guys to figure out what the fault code
actually means.

Regards,
Christian.

>
> Thanks,
> Alexey
>
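
For reference, the write-then-read-back sequence discussed in this thread, with the optional barrier Alexey asked about, can be sketched in plain userspace C. This is not the amdgpu code. The names fake_reg and flush_with_posting_read are hypothetical, and the C11 fence is only an assumed stand-in for whatever barrier the kernel would use on real MMIO.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for a memory-mapped register. Real driver code
 * would go through an ioremap()ed address and the kernel MMIO accessors. */
static volatile uint32_t fake_reg;

/* Write a value, optionally fence, then read the same location back.
 * On real hardware the read-back is what forces a posted write to reach
 * the device; here it only illustrates the order of the three steps. */
static uint32_t flush_with_posting_read(volatile uint32_t *reg, uint32_t val,
                                        int use_barrier)
{
    *reg = val;                                     /* posted write */
    if (use_barrier)
        atomic_thread_fence(memory_order_seq_cst);  /* assumed barrier stand-in */
    return *reg;                                    /* posting read */
}
```

Calling flush_with_posting_read(&fake_reg, 0, 1) runs the variant with the fence, and passing 0 as the last argument runs the one without it. On ordinary memory both return the value just written, so this says nothing about how the platform in question behaves.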