[gem5-users] GPU virtual memory system

Imad Al Assir via gem5-users Fri, 29 Oct 2021 09:03:48 -0700

Hello,

I have been looking at the source code of the GPU model for the past few
weeks, and I have some questions about the virtual memory system for discrete
GPUs (and APUs, if there are any differences). I include my questions and
partial answers below, and I hope you can correct me where I am wrong. It
would also be great if you could point me to the documentation or source code
where each answer can be found.

1- Where exactly are the page tables located, and who manages them?
I saw that the page tables are emulated (i.e. with the EmulationPageTable
structure) and that the GPU uses the host x86 page tables. But since there is
no OS, who manages them, and where exactly do they reside? Are they in the
CPU's Ruby memory?

2- How do page walks happen?
I saw a comment saying that they are not real page walks, and that the CPU's
x86 page table walkers (PTWs) are used. But how is the translation actually
fetched from the page table if the walk is not real? Don't the page walkers
still have to walk the tables in memory?

3- How are page faults handled if there is no OS?

4- Which components of the VM hierarchy are already present: IOMMU, TLBs,
page walk caches (PWCs), PTWs?
What I am sure of is that there is a customizable TLB hierarchy and TLB
coalescers. As for the IOMMU, I was not able to figure out what it consists
of. I know that there is a PTW and that the model uses the CPU's x86 page
tables to do the translations. But how many PTWs are there? GPUs usually
require several, so is this number configurable? Also, I did not see any page
walk caches or IOMMU TLBs. Are these absent from the current model? If I am
wrong, please point me to the source code of each component (and where it is
instantiated).

I also saw that a paper published by AMD at the latest MICRO
(https://dl.acm.org/doi/10.1145/3466752.3480105) used the GPU model and had
all of the components mentioned in question 4. Are these components publicly
available, or would I need to implement them myself?

5- I saw a comment in gpu_compute_driver.cc saying: "TODO: IOMMU and GPUTLBs do
not seem to correctly support shootdown". Does this mean that TLB shootdown
does not work at all? And what exactly does "IOMMU" refer to here, given that
I could not find a concrete IOMMU component? In other words, what does it
consist of?

Sorry for the long e-mail, and thank you in advance for your help,
Imad Al Assir
