On 12/08/20 2:02 pm, Christian König wrote: > Am 12.08.20 um 10:15 schrieb Shashank Sharma: >> Hello Christian, >> >> On 12/08/20 12:15 pm, Christian König wrote: >>> Am 12.08.20 um 06:33 schrieb Shashank Sharma: >>>> This patch adds a new trace event to track the PTE update >>>> events. This specific event will provide information like: >>>> - start and end of virtual memory mapping >>>> - HW engine flags for the map >>>> - physical address for mapping >>>> >>>> This will be particularly useful for memory profiling tools >>>> (like RMV) which are monitoring the page table update events. >>>> >>>> V2: Added physical address lookup logic in trace point >>>> V3: switch to use __dynamic_array >>>> added nptes int the TPprint arguments list >>>> added page size in the arg list >>>> >>>> Cc: Christian König <christian.koe...@amd.com> >>>> Cc: Alex Deucher <alexander.deuc...@amd.com> >>>> Signed-off-by: Christian König <christian.koe...@amd.com> >>>> Signed-off-by: Shashank Sharma <shashank.sha...@amd.com> >>>> --- >>>> drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h | 38 +++++++++++++++++++++++ >>>> drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 9 ++++-- >>>> 2 files changed, 45 insertions(+), 2 deletions(-) >>>> >>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h >>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h >>>> index 63e734a125fb..b9aae7983b4b 100644 >>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h >>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h >>>> @@ -321,6 +321,44 @@ DEFINE_EVENT(amdgpu_vm_mapping, amdgpu_vm_bo_cs, >>>> TP_ARGS(mapping) >>>> ); >>>> >>>> +TRACE_EVENT(amdgpu_vm_update_ptes, >>>> + TP_PROTO(struct amdgpu_vm_update_params *p, >>>> + uint64_t start, uint64_t end, >>>> + unsigned int nptes, uint64_t dst, >>>> + uint64_t incr, uint64_t flags), >>>> + TP_ARGS(p, start, end, nptes, dst, incr, flags), >>>> + TP_STRUCT__entry( >>>> + __field(u64, start) >>>> + __field(u64, end) >>>> + __field(u64, flags) >>>> + __field(u64, incr) >>>> + __field(unsigned int, nptes) >>>> + __dynamic_array(u64, dst, nptes) >>>> + ), >>>> + >>>> + TP_fast_assign( >>>> + unsigned int i; >>>> + >>>> + __entry->start = start; >>>> + __entry->end = end; >>>> + __entry->flags = flags; >>>> + __entry->incr = incr; >>>> + __entry->nptes = nptes; >>>> + >>>> + for (i = 0; i < nptes; ++i) { >>>> + u64 addr = p->pages_addr ? amdgpu_vm_map_gart( >>>> + p->pages_addr, dst) : dst; >>>> + >>>> + ((u64 *)__get_dynamic_array(dst))[i] = addr; >>>> + dst += incr; >>>> + } >>>> + ), >>>> + TP_printk("seg:0x%010llx-0x%010llx, flags:0x%llx, nptr=%u, pgsz:%llu," >>>> + " dst:\n%s", __entry->start, __entry->end, __entry->flags, >>>> + __entry->nptes, __entry->incr, >>> This is not correct. The increment is NOT the page size, but rather the >>> page size rounded down to a power of 512+4K. >>> >>> In other words page size can be 4K, 8K, 16K, 32K, 64K....1M, 2M, >>> 4M....512M, 1G, 2G.... >>> >>> But the increment can only be 4K, 2M, 1G.... >> Understood. But I think the requirement here is for increment. My >> understanding is that the tool needs to save the page entries, and for that, >> it will need start of virtual mem, start of physical mem, mapping size and >> step to increment the entries. If that's so, we can re-label this entry as >> "step" instead of "page size". Please let me know if you think it's the >> right thing to do. > We could stick with the naming increment if that helps, but this can > also be derived from the number of destination addresses we have. sure, i will make it increment. > > On the other hand explicitly mentioning it probably won't hurt us either. > > And by the way what does "seg" mean?
Ah, to get into 80 char limit, I made 'segment' as 'seg' and later just realized I have to still break the print into two lines :) . I will make it back to segment or start/end - Shashank > > Christian. > >>> And do we need the nptes here? We just need it to print the correct >>> number of destination addresses. >> Agree, we don't really need nptes here, I will remove that and send V4. >> >> - Shashank >> >>> Regards, >>> Christian. >>> >>>> + __print_array(__get_dynamic_array(dst), __entry->nptes, 8)) >>>> +); >>>> + >>>> TRACE_EVENT(amdgpu_vm_set_ptes, >>>> TP_PROTO(uint64_t pe, uint64_t addr, unsigned count, >>>> uint32_t incr, uint64_t flags, bool direct), >>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c >>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c >>>> index 71e005cf2952..b5dbb5e8bc61 100644 >>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c >>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c >>>> @@ -1513,17 +1513,22 @@ static int amdgpu_vm_update_ptes(struct >>>> amdgpu_vm_update_params *params, >>>> do { >>>> uint64_t upd_end = min(entry_end, frag_end); >>>> unsigned nptes = (upd_end - frag_start) >> >>>> shift; >>>> + uint64_t upd_flags = flags | AMDGPU_PTE_FRAG(frag); >>>> >>>> /* This can happen when we set higher level PDs >>>> to >>>> * silent to stop fault floods. >>>> */ >>>> nptes = max(nptes, 1u); >>>> + >>>> + trace_amdgpu_vm_update_ptes(params, frag_start, upd_end, >>>> + nptes, dst, incr, >>>> + upd_flags); >>>> amdgpu_vm_update_flags(params, pt, cursor.level, >>>> pe_start, dst, nptes, >>>> incr, >>>> - flags | AMDGPU_PTE_FRAG(frag)); >>>> + upd_flags); >>>> >>>> pe_start += nptes * 8; >>>> - dst += (uint64_t)nptes * AMDGPU_GPU_PAGE_SIZE << shift; >>>> + dst += nptes * incr; >>>> >>>> frag_start = upd_end; >>>> if (frag_start >= frag_end) { _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx