It looks like the missing patch made its way into 5.15.0-0.bpo.2-powerpc64le, as f4d3da72a76a9ce... I think. As a result, I think this bug is overcome by events and can be closed.
On Tue, Nov 2, 2021 at 12:37 PM Christian König <christian.koe...@amd.com> wrote: > > > > Am 30.10.21 um 16:32 schrieb Salvatore Bonaccorso: > > On Wed, Oct 13, 2021 at 10:08:21PM +0200, Salvatore Bonaccorso wrote: > >> Hi, > >> > >> On Mon, Oct 11, 2021 at 10:30:21AM +0200, Christian König wrote: > >>> Am 10.10.21 um 16:14 schrieb Xi Ruoyao: > >>>> On Sun, 2021-10-10 at 14:46 +0100, Nathaniel Filardo wrote: > >>>>> It occurs to me, quite belatedly, that it may be worth asking the > >>>>> author, reviewers, and signers of the change in question their > >>>>> thoughts on this bug report: > >>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbugs.debian.org%2Fcgi-bin%2Fbugreport.cgi%3Fbug%3D990279&data=04%7C01%7Cchristian.koenig%40amd.com%7C3eab702e82bc4ab81fbd08d99bb21419%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637712012456263348%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=KFxF9He7545uxYylgQ%2F4HdljVuGoGUbvHh9xohbxu4o%3D&reserved=0 > >>>>> > >>>>> In particular, on ppc64 systems, Linux typically is configured to use > >>>>> a 64KiB page (i.e., shift 16) rather than 4KiB (shift 12) page. It > >>>>> looks, however, that AMDGPU_GPU_PAGE_SIZE is always 4096, and so > >>>>> something (perhaps in userspace, even, eek?) is requesting > >>>>> 4KiB-but-not-64KiB alignment of this buffer. > >>>> Christian told me the buffer should be aligned to *CPU* page boundary, > >>>> or the page table in AMDGPU driver will be corrupted: > >>> Yeah, that's indeed correct. And that intentionally breaks because > >>> otherwise > >>> we can corrupt the page tables and potentially cause much worse trouble. > >>> > >>> Question is more why userspace isn't told the correct value in your > >>> branch. > >>> > >>>>> the value of num_entries must always be a multiple of > >>>>> AMDGPU_GPU_PAGES_IN_CPU_PAGE or otherwise we corrupt the page tables. > >>>>> You need to identify the root cause of this, most likely start or last > >>>>> are not a multiple of AMDGPU_GPU_PAGES_IN_CPU_PAGE. > >>>> IMO f4d3da72a76a9ce5f57bba64788931686a9dc333 "drm/amdgpu: Set a suitable > >>>> dev_info.gart_page_size" should be backported along with this, which > >>>> makes the kernel to provide the CPU page size to libdrm and mesa and > >>>> correct userspace behavior. I'm not sure why only one is backported. > >>> > >>> Yes, exactly that sounds like the correct fix to me as well. > >> So, the 9a89a721b41b (" drm/amdgpu: check alignment on CPU page for bo > >> map") was backported to several stable series 4.14.229, 4.19.185, > >> 5.4.110, 5.10.28 and 5.11.12 but not the > >> f4d3da72a76a9ce5f57bba64788931686a9dc333 "drm/amdgpu: Set a suitable > >> dev_info.gart_page_size". > >> > >> What is confusely is that all of those backports reference as upstream > >> commit e3512fb67093fabdf27af303066627b921ee9bd8 and not > >> 9a89a721b41b23c6da8f8a6dd0e382966a850dcf which might be in part source > >> of the confusion? > >> > >> Can any of you request to backport > >> f4d3da72a76a9ce5f57bba64788931686a9dc333 as well for those stable > >> series where relevant? > > Here is the proposed change. Should this be submitted to stable for > > 5.10.y? > > You can drop the max_t(), just using PAGE_SIZE should work since there > can't be any smaller page size than 4k or the driver won't work at all. > > Regards, > Christian. > > > > > Regards, > > Salvatore > > > > From 02c987eb2ab0cdfd536d08bf812f4e37d3cc150a Mon Sep 17 00:00:00 2001 > > From: Huacai Chen <che...@lemote.com> > > Date: Tue, 30 Mar 2021 23:33:33 +0800 > > Subject: [PATCH] drm/amdgpu: Set a suitable dev_info.gart_page_size > > MIME-Version: 1.0 > > Content-Type: text/plain; charset=UTF-8 > > Content-Transfer-Encoding: 8bit > > > > commit f4d3da72a76a9ce5f57bba64788931686a9dc333 upstream. > > > > In Mesa, dev_info.gart_page_size is used for alignment and it was > > set to AMDGPU_GPU_PAGE_SIZE(4KB). However, the page table of AMDGPU > > driver requires an alignment on CPU pages. So, for non-4KB page system, > > gart_page_size should be max_t(u32, PAGE_SIZE, AMDGPU_GPU_PAGE_SIZE). > > > > Signed-off-by: Rui Wang <wa...@lemote.com> > > Signed-off-by: Huacai Chen <che...@lemote.com> > > Link: > > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Floongson-community%2Flinux-stable%2Fcommit%2Fcaa9c0a1&data=04%7C01%7Cchristian.koenig%40amd.com%7C3eab702e82bc4ab81fbd08d99bb21419%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637712012456273340%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=C8Dz22MXtb6u012FMN%2BRm0i9vDNkn5LJrkqX2qACzqY%3D&reserved=0 > > [Xi: rebased for drm-next, use max_t for checkpatch, > > and reworded commit message.] > > Signed-off-by: Xi Ruoyao <xry...@mengyan1223.wang> > > BugLink: > > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgitlab.freedesktop.org%2Fdrm%2Famd%2F-%2Fissues%2F1549&data=04%7C01%7Cchristian.koenig%40amd.com%7C3eab702e82bc4ab81fbd08d99bb21419%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637712012456273340%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=Azbz96fafUgilLJIFAOzXadKygHFVqYBxlh%2FYvhNtdE%3D&reserved=0 > > Tested-by: Dan Horák <d...@danny.cz> > > Reviewed-by: Christian König <christian.koe...@amd.com> > > Signed-off-by: Alex Deucher <alexander.deuc...@amd.com> > > [Salvatore Bonaccorso: Backport to 5.10.y which does not contain > > a5a52a43eac0 ("drm/amd/amdgpu/amdgpu_kms: Remove 'struct > > drm_amdgpu_info_device dev_info' from the stack") which removes dev_info > > from the stack and places it on the heap.] > > Signed-off-by: Salvatore Bonaccorso <car...@debian.org> > > --- > > drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 4 ++-- > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c > > index efda38349a03..917b94002f4b 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c > > @@ -766,9 +766,9 @@ static int amdgpu_info_ioctl(struct drm_device *dev, > > void *data, struct drm_file > > dev_info.high_va_offset = AMDGPU_GMC_HOLE_END; > > dev_info.high_va_max = AMDGPU_GMC_HOLE_END | vm_size; > > } > > - dev_info.virtual_address_alignment = max((int)PAGE_SIZE, > > AMDGPU_GPU_PAGE_SIZE); > > + dev_info.virtual_address_alignment = max_t(u32, PAGE_SIZE, > > AMDGPU_GPU_PAGE_SIZE); > > dev_info.pte_fragment_size = (1 << > > adev->vm_manager.fragment_size) * AMDGPU_GPU_PAGE_SIZE; > > - dev_info.gart_page_size = AMDGPU_GPU_PAGE_SIZE; > > + dev_info.gart_page_size = max_t(u32, PAGE_SIZE, > > AMDGPU_GPU_PAGE_SIZE); > > dev_info.cu_active_number = adev->gfx.cu_info.number; > > dev_info.cu_ao_mask = adev->gfx.cu_info.ao_cu_mask; > > dev_info.ce_ram_size = adev->gfx.ce_ram_size; >