Right now, every time fdinfo is read, we walk the VM lists and lock every BO to calculate the statistics. This causes a lot of lock contention when the VM is actively used, and it gets worse when there are many shared BOs or many submissions. We have seen submissions lock up for seconds due to fdinfo on some workloads. Therefore, rework the implementation to update the statistics as the BOs get moved around, instead of recomputing them on every read.
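
To illustrate the idea, here is a minimal userspace sketch of runtime accounting. It is not the amdgpu code: every name in it (struct vm, struct bo, bo_attach, bo_move, bo_detach, vm_read_stats, the PL_* placements) is made up for this example, it tracks only resident bytes per placement, and it leaves out the locking that the real counters need.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical placements for illustration; not the TTM/amdgpu domains. */
enum placement { PL_SYSTEM, PL_GTT, PL_VRAM, PL_COUNT };

/* Per-VM counters, kept up to date on every BO state change. */
struct mem_stats {
	uint64_t resident[PL_COUNT];
};

struct vm {
	struct mem_stats stats;	/* a real driver would guard this with a lock */
};

struct bo {
	struct vm *vm;
	uint64_t size;
	enum placement where;
};

/* Add (sign > 0) or subtract (otherwise) one BO from its VM's counters. */
void vm_account(const struct bo *bo, int sign)
{
	assert(bo->where < PL_COUNT);
	if (sign > 0)
		bo->vm->stats.resident[bo->where] += bo->size;
	else
		bo->vm->stats.resident[bo->where] -= bo->size;
}

/* Start tracking a BO in a VM at its initial placement. */
void bo_attach(struct vm *vm, struct bo *bo, uint64_t size,
	       enum placement where)
{
	bo->vm = vm;
	bo->size = size;
	bo->where = where;
	vm_account(bo, +1);
}

/* Move a BO: remove it from the old bucket, add it to the new one. */
void bo_move(struct bo *bo, enum placement new_place)
{
	if (bo->where == new_place)
		return;
	vm_account(bo, -1);
	bo->where = new_place;
	vm_account(bo, +1);
}

/* Stop tracking a BO, e.g. when it is freed or leaves the VM. */
void bo_detach(struct bo *bo)
{
	vm_account(bo, -1);
	bo->vm = 0;
}

/* The read side is a plain copy: no list walk, no per-BO locking. */
void vm_read_stats(const struct vm *vm, struct mem_stats *out)
{
	*out = vm->stats;
}
```

The cost moves from the reader to the (already serialized) move path: each state change does a constant amount of extra work, and a read no longer depends on how many BOs the VM holds.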

The AMD-only visible memory stat is removed to simplify the implementation. It is unclear how useful this stat is, since the kernel maps and unmaps BOs whenever it wants to, and on a modern system all of VRAM can be mapped if needed.

v5: rebase on top of the drm_print_memory_stats refactor
v6: split the drm changes into a separate patch for dri-devel review,
    fix handling of drm-total- vs drm-resident- and handle
    drm-purgeable-
v7: make drm-active- optional
v8: clarify documentation, minor tweaks, and some bug fixes found during
    testing

Yunxiang Li (5):
  drm: add drm_memory_stats_is_zero
  drm: make drm-active- stats optional
  Documentation/gpu: Clarify drm memory stats definition
  drm/amdgpu: remove unused function parameter
  drm/amdgpu: track bo memory stats at runtime

 Documentation/gpu/drm-usage-stats.rst       |  36 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c      |   4 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c |  17 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c  |  17 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c     |   3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c  | 111 +++++-------
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.h  |   4 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h     |   4 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c      | 179 ++++++++++++++------
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h      |  20 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c   |   1 +
 drivers/gpu/drm/drm_file.c                  |  23 ++-
 drivers/gpu/drm/i915/i915_drm_client.c      |   1 +
 drivers/gpu/drm/xe/xe_drm_client.c          |   1 +
 include/drm/drm_file.h                      |   1 +
 include/drm/drm_gem.h                       |  14 +-
 16 files changed, 254 insertions(+), 182 deletions(-)

-- 
2.34.1