Re: [PATCH v4 2/2] drm/amdgpu: unref pt bo after job submit

2020-03-16 Thread Christian König
ing, Felix ; Pan, Xinhui ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH v4 2/2] drm/amdgpu: unref pt bo after job submit Hi all, I think I found the root cause. Here is what happened. user: alloc/mapping memory kernel: validate memory and update the bo

RE: [PATCH v4 2/2] drm/amdgpu: unref pt bo after job submit

2020-03-16 Thread Tao, Yintian
Subject: Re: [PATCH v4 2/2] drm/amdgpu: unref pt bo after job submit [AMD Official Use Only - Internal Distribution Only] I still hit a page fault with option 1 while running the oclperf test. It looks like we need to sync on the fence after commit. From: Tao, Yintian mailto:yintian

Re: [PATCH v4 2/2] drm/amdgpu: unref pt bo after job submit

2020-03-16 Thread Pan, Xinhui
: Deucher, Alexander ; Kuehling, Felix ; Pan, Xinhui ; amd-gfx@lists.freedesktop.org Subject: RE: [PATCH v4 2/2] drm/amdgpu: unref pt bo after job submit Hi Xinhui, I encountered the same problem (page fault) when testing the vk_example benchmark. I used your first option, which fixes the problem. Can you

RE: [PATCH v4 2/2] drm/amdgpu: unref pt bo after job submit

2020-03-16 Thread Tao, Yintian
inhui Sent: March 14, 2020 21:07 To: Koenig, Christian Cc: Deucher, Alexander ; Kuehling, Felix ; Pan, Xinhui ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH v4 2/2] drm/amdgpu: unref pt bo after job submit Hi all, I think I found the root cause. Here is what happened. user: alloc/mappin

Re: [PATCH v4 2/2] drm/amdgpu: unref pt bo after job submit

2020-03-14 Thread Pan, Xinhui
Hi all, I think I found the root cause. Here is what happened. user: alloc/mapping memory kernel: validate memory and update the bo mapping, and update the page table -> amdgpu_vm_bo_update_mapping -> amdgpu_vm_update_ptes

Re: [PATCH v4 2/2] drm/amdgpu: unref pt bo after job submit

2020-03-13 Thread Christian König
The page table is not updated and then freed. A higher-level PDE is updated, and because of this the lower-level page tables are freed. Without this it could be that the memory backing the freed page table is reused while the PDE is still pointing to it. Rather unlikely that this causes problem

Re: [PATCH v4 2/2] drm/amdgpu: unref pt bo after job submit

2020-03-13 Thread Felix Kuehling
This seems weird. This means that we update a page table, and then free it in the same amdgpu_vm_update_ptes call? That means the update is redundant. Can we eliminate the redundant PTE update if the page table is about to be freed anyway? Regards, Felix On 2020-03-13 12:09, xinhui pan wrot

[PATCH v4 2/2] drm/amdgpu: unref pt bo after job submit

2020-03-13 Thread xinhui pan
Freeing a page table bo before job submit is insane. We might touch invalid memory while the job is running. We now have individualized bo resv during bo releasing, so any fences added to the root PT bo are actually untested when a normal PT bo is releasing. We might hit a gmc page fault or memory just got overwr
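
The ordering the patch subject describes (drop the page-table reference only after the job has been submitted) can be illustrated with a small toy model. This is a sketch only and is not the amdgpu code: `toy_pt`, `toy_update`, `toy_defer_free` and `toy_submit` are hypothetical names invented for illustration, and the model ignores fences and reservation objects entirely. The replies in this thread suggest that an additional fence sync after commit may still be required.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_DEFERRED 8

/* Hypothetical stand-in for a page-table buffer object. */
struct toy_pt {
    int refcount;
    bool freed;
};

/* Hypothetical update context: page tables dropped during an update
 * are parked here instead of being released immediately. */
struct toy_update {
    struct toy_pt *deferred[MAX_DEFERRED];
    size_t n_deferred;
    bool submitted;
};

/* Drop one reference; mark the object freed when the count hits zero. */
static void toy_pt_unref(struct toy_pt *pt)
{
    assert(pt->refcount > 0);
    if (--pt->refcount == 0)
        pt->freed = true; /* real code would release the memory here */
}

/* Remember a page table for later release; returns false if the list is full. */
static bool toy_defer_free(struct toy_update *u, struct toy_pt *pt)
{
    if (u->n_deferred == MAX_DEFERRED)
        return false;
    u->deferred[u->n_deferred++] = pt;
    return true;
}

/* Submit the job first, then drop the parked references. */
static void toy_submit(struct toy_update *u)
{
    u->submitted = true; /* stand-in for the actual job submission */
    for (size_t i = 0; i < u->n_deferred; i++)
        toy_pt_unref(u->deferred[i]);
    u->n_deferred = 0;
}
```

In this model a page table passed to `toy_defer_free` stays alive until `toy_submit` has run, so nothing can reuse its memory while the update is still being prepared.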