On Wed, Aug 02, 2017 at 08:18:02PM +0200, Christian König wrote:
> On 02.08.2017 at 19:33, Jerome Glisse wrote:
> > On Wed, Aug 02, 2017 at 07:23:58PM +0200, Christian König wrote:
> > > On 02.08.2017 at 19:13, Jerome Glisse wrote:
> > > > On Wed, Aug 02, 2017 at 07:05:11PM +0200, Christian König wrote:
> > > > > On 02.08.2017 at 18:43, Jerome Glisse wrote:
> > > > > > On Wed, Aug 02, 2017 at 10:26:40AM +0200, Christian König wrote:
> > > > > > > [SNIP]
> > > > > > > So to summarize you are saying you do not trust the value you get from
> > > > > > > pci_map_page()?
> > > > > > Well, what we don't trust is that we actually get this value correctly into
> > > > > > our page tables.
> > > > > 
> > > > > > > If not, then I stress again that you have all the information you need
> > > > > > > inside the amdgpu driver. You can take the same scheme I proposed to
> > > > > > > dump ttm.dma_address[] and compare it against the contents of the GPU
> > > > > > > page table.
> > > > > > Yes, exactly. But then again we have the mapping page to dma-address
> > > > > > (because that is what drivers usually need), but what we need for debugging
> > > > > > is a map with the info dma-address to page.
> > > > > Why would you need the reverse? You have a GPU virtual address; do the
> > > > > following:
> > > > > GPU vaddr -> GPU page table entry -> bus address
> > > > > GPU vaddr -> bo_va_mapping -> bo_va -> bo -> page -> dma/bus address
> > > > First of all, the VM housekeeping structures keep the state as it is supposed
> > > > to be on the next command submission, and so the top of the pipeline, but the
> > > > state of the page tables represents the bottom of the pipeline.
> > > 
> > > > Second, you can't access the BO from its virtual address; the BO mapping is
> > > > protected by the BO reservation lock. So when I want to look up the BO I
> > > > would need to lock the BO first - chicken and egg problem. That is of course
> > > > solvable, but not something I would really like to do for a debugging
> > > > feature.
> > Tom said that the GPU is stopped and thus nothing is happening to the vm
> > or any of the bo, right? So there is no need to protect anything. If you
> > still allow things to change the vm (map/unmap bo ...), how do you expect to
> > debug?
> 
> Well the GPU is stuck (or manually stopped), but keep in mind that this is a
> very deep pipeline we are talking about here.
> 
> So even if there isn't any processing happening in the hardware any more,
> there can still be state changes queued up waiting for processing (or even new
> ones added).
> 

I am familiar with how it all works. VMs are protected by fences, so a
VM that is in use is protected from bind/unbind while the GPU is stopped,
and updates to the virtual address space of that VM might also be blocked
if they were to happen through some GPU ring and not from the CPU.

To me it looks like you want to know whether the GPU is accessing what it
was meant to access, and I believe what I have outlined allows that:
compare the current GPU page table entries with the valid mappings.
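
As a rough illustration of that comparison (a sketch only, not amdgpu
code): the flat page table array, the 12-bit flag mask, and the
mock_mapping struct below are all assumptions made up for this example.
Real GPU page tables are multi-level and the PTE layout is hardware
specific.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed PTE layout: low 12 bits are flags, the rest is the bus address. */
#define PTE_ADDR_MASK (~(uint64_t)0xfff)

/* Hypothetical stand-in for one mapped range of GPU pages. */
struct mock_mapping {
    uint64_t first_pfn;          /* first GPU page frame covered */
    size_t npages;               /* number of pages in the range */
    const uint64_t *dma_address; /* plays the role of ttm.dma_address[] */
};

/*
 * Compare what the page table holds against what the mapping says it
 * should hold. Returns the number of mismatching entries and, if
 * first_bad is non-NULL, stores the first mismatching GPU pfn there.
 */
static size_t compare_mapping(const uint64_t *page_table, size_t pt_entries,
                              const struct mock_mapping *m,
                              uint64_t *first_bad)
{
    size_t mismatches = 0;

    /* The mapped range must lie inside the (flattened) page table. */
    assert(m->first_pfn + m->npages <= pt_entries);

    for (size_t i = 0; i < m->npages; i++) {
        uint64_t pte_addr = page_table[m->first_pfn + i] & PTE_ADDR_MASK;
        uint64_t expected = m->dma_address[i] & PTE_ADDR_MASK;

        if (pte_addr != expected) {
            if (mismatches == 0 && first_bad)
                *first_bad = m->first_pfn + i;
            mismatches++;
        }
    }
    return mismatches;
}
```

Only the forward direction (mapping -> dma address) is needed here; a
nonzero return value points at the GPU pages whose entries do not match
what the driver believes it mapped.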

Thus I do not see any value in trying to get bus -> page, especially
since once you get the page you can't know which bo it belongs to without
going over all ttm tt objects, and even then it might be an already
"freed" page from ttm's point of view.
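
To make the cost of the reverse direction concrete, here is a minimal
sketch under assumptions: mock_tt is a hypothetical stand-in for a ttm tt
object and find_owner() is a made-up helper, not an existing kernel
function. Without a dedicated reverse index, the lookup has to walk every
object's dma address array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a ttm tt object and its dma address array. */
struct mock_tt {
    const uint64_t *dma_address;
    size_t npages;
};

/*
 * Reverse lookup by brute force: scan every tt object for bus_addr.
 * Returns the index of the owning object, or -1 if no live object
 * holds that address. On a hit, *page_idx (if non-NULL) receives the
 * page index inside that object.
 */
static long find_owner(const struct mock_tt *tts, size_t ntt,
                       uint64_t bus_addr, size_t *page_idx)
{
    assert(tts != NULL || ntt == 0);

    for (size_t t = 0; t < ntt; t++) {
        for (size_t i = 0; i < tts[t].npages; i++) {
            if (tts[t].dma_address[i] == bus_addr) {
                if (page_idx)
                    *page_idx = i;
                return (long)t;
            }
        }
    }
    return -1; /* not owned by any live object, e.g. already freed */
}
```

The scan is linear in the total number of pages across all objects, and
a -1 result corresponds to the case where the page is no longer tracked
by any object.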

Jérôme
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
