> Out of curiosity, where do you generate so many foreign dma-buf backed
> objects?

That was just an experiment in sharing buffers between two AMD devices. A
stupid reference count bug meant I leaked 60 full HD frames a second.

It took a bit over two minutes to eat up 1GB of GTT space, and the box died
rather spectacularly.

Regards,
Christian.
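The failure mode Christian describes boils down to taking a dma-buf
reference per frame and never dropping it. A minimal sketch of that bug
pattern, assuming a hypothetical driver (my_device and
my_device_bind_frame are made up; dma_buf_get()/dma_buf_put() are the real
dma-buf refcounting calls):

#include <linux/dma-buf.h>
#include <linux/err.h>

struct my_device;                       /* hypothetical driver state */
void my_device_bind_frame(struct my_device *dev, struct dma_buf *buf);

static int import_frame(struct my_device *dev, int dmabuf_fd)
{
	struct dma_buf *buf = dma_buf_get(dmabuf_fd);   /* takes a reference */

	if (IS_ERR(buf))
		return PTR_ERR(buf);

	my_device_bind_frame(dev, buf);   /* e.g. maps the buffer into GTT */

	/*
	 * BUG: the reference from dma_buf_get() is never dropped with
	 * dma_buf_put(), so each imported frame keeps its backing pages
	 * pinned forever and GTT space fills up until the box dies.
	 */
	return 0;
}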
On 11.01.2016 at 15:55, Daniel Vetter wrote:
> On Mon, Jan 11, 2016 at 03:38:23PM +0100, Christian König wrote:
>>> While this would be one way to work around the GTT space DOS, it
>>> wouldn't work around the DOS problem of the exporter pinning system
>>> pages that probably never get accounted. That's a limitation of the
>>> export-import mechanism (dma-buf).
>> Yeah, completely agree. That's a problem we sooner or later need to
>> address as well.
>>
>> But currently I worry mostly about the specific problem at hand and how
>> to fix it; the larger design problems need to wait.
>>
>> I just sent five patches out to the list, CCing you as well. You can
>> ignore the last two, which are amdgpu specific, but it would be nice if
>> you could take a look at the first three.
> Out of curiosity, where do you generate so many foreign dma-buf backed
> objects? With prime I expect at most a few framebuffers on each side to
> be shared, and that shouldn't really cause much trouble. Just sharing
> dma-bufs with your own device should result in the underlying native
> objects being referenced, so it works like flink.
>
> Wrt fixing this for real: I think step 1 would be to stop pinning
> dma-bufs on import and instead implement all the map/unmap vfunc hooks
> ttm already implements (it's fully intentional that they match really
> closely), essentially by virtualizing ttm_bo_api.h.
>
> After that we need to allow exporters to evict mappings importers have
> created (using the ww_mutex locking and all that, like ttm does for
> purely native eviction). That likely needs a callback in
> dma_buf_attachment. With that we should be 90% there, since most likely
> when the importer tries to bind more stuff exported from somewhere
> we'll run into the limits of the exporter. It's not perfect since it's
> not (yet) a truly global evict, but it should get us a large step
> towards that goal.
>
> If we really need global eviction across all drivers in a system (e.g.
> for system memory; nothing else should really be shared like that, I
> think) then I guess we could look into either extending ttm_global or
> adding something similar at the dma-buf level.
>
> Cheers, Daniel
>
>> Thanks in advance,
>> Christian.
>>
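To make Daniel's proposed eviction callback concrete: a very rough sketch
of what an importer-side hook on dma_buf_attachment could look like. No
such callback exists in dma-buf at this point; the hook itself and how it
gets wired up are pure assumptions, while attach->priv and
dma_buf_unmap_attachment() are the existing API:

#include <linux/dma-buf.h>

struct my_import {
	struct dma_buf_attachment *attach;
	struct sg_table *sgt;     /* current mapping, NULL when unmapped */
};

/*
 * Hypothetical hook, called by the exporter with the shared
 * reservation/ww_mutex held when it wants the backing storage back.
 * The importer tears down its GPU mapping and re-maps lazily on the
 * next command submission, mirroring what TTM does for native eviction.
 */
static void my_import_evict(struct dma_buf_attachment *attach)
{
	struct my_import *imp = attach->priv;   /* importer's cookie */

	if (imp->sgt) {
		dma_buf_unmap_attachment(attach, imp->sgt, DMA_BIDIRECTIONAL);
		imp->sgt = NULL;
	}
}

A real version would also have to wait for pending GPU work and unbind the
GTT mapping before unmapping the attachment, but the shape of the callback
is the interesting part here.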
>> On 11.01.2016 at 14:00, Thomas Hellstrom wrote:
>>> On 01/11/2016 01:39 PM, Christian König wrote:
>>>> On 10.01.2016 at 16:10, Thomas Hellstrom wrote:
>>>>> On 01/09/2016 11:46 AM, Christian König wrote:
>>>>>> It's correct that exported buffers can't be moved to another
>>>>>> domain or swapped to disk while another device is using them.
>>>>>>
>>>>>> But for imports that's a different story:
>>>>>>> an imported object should never end up on a LRU list in TTM
>>>>>>> because TTM wouldn't know how to evict it.
>>>>>> Even currently the imported objects are actually added to the LRU.
>>>>>> The problem is just that they are not added immediately during
>>>>>> creation, but only with the first command submission.
>>>>> Hmm. The fact that they are added to the LRU at all sounds weird.
>>>>> What happens when TTM tries to evict or even swap out an imported
>>>>> buffer?
>>>> Adding them to the domain LRU is perfectly normal for imported
>>>> buffers and also works fine as far as I know.
>>>>
>>>> When TTM evicts them they just get unmapped from GTT to make room
>>>> in the address space, which at least for Radeon and Amdgpu is
>>>> exactly what we want.
>>>>
>>>> That imported buffers get added to the swap LRU is indeed nonsense,
>>>> but not harmful as far as I can see. Going to fix that in a
>>>> follow-up patch.
>>>>
>>> So the way TTM was designed when it was written was to be used to
>>> place data in memory types where it could later be mapped by the CPU
>>> and GPU in a coherent fashion.
>>>
>>> The actual GPU mapping was primarily thought to be done by the
>>> driver as part of the validation process *after* that placement, if
>>> needed.
>>>
>>> Now many (most, all) drivers don't need that second step since the
>>> memory type is directly mappable by the GPU, but in a situation
>>> where the core TTM functionality (placement and swapping) is
>>> completely replaced by another mechanism (such as imported buffers),
>>> not implementing the GPU binding and eviction in the driver as a
>>> separate step naturally creates a conflict.
>>>
>>> Now I'm not at all against TTM being used for GPU mapping only, if
>>> that simplifies things for imported buffers, but in that case IMO
>>> one should be aware of the limitations, and we should find a way to
>>> mark those buffers GPU_MAP_ONLY to avoid having TTM compete with the
>>> exporter over the placement and to avoid putting them on the swap
>>> LRU.
>>>
>>> While this would be one way to work around the GTT space DOS, it
>>> wouldn't work around the DOS problem of the exporter pinning system
>>> pages that probably never get accounted. That's a limitation of the
>>> export-import mechanism (dma-buf).
>>>
>>> /Thomas
>>>
>>>> Thanks for the comment,
>>>> Christian.
>>>>
>>>>> /Thomas
>>>>>
>>>>>> Regards,
>>>>>> Christian.
>>>>>>
>>>>>> On 09.01.2016 at 08:05, Thomas Hellstrom wrote:
>>>>>>> Hi!
>>>>>>>
>>>>>>> I might be misunderstanding the use case here, but IIRC the
>>>>>>> conclusion of the discussion about TTM vs. imported/exported
>>>>>>> buffer objects was that a buffer object needs to be marked
>>>>>>> NO_EVICT in TTM before it's exported, and an imported object
>>>>>>> should never end up on an LRU list in TTM because TTM wouldn't
>>>>>>> know how to evict it.
>>>>>>>
>>>>>>> /Thomas
>>>>>>>
>>>>>>> On 01/08/2016 02:41 PM, Christian König wrote:
>>>>>>>> From: Christian König <christian.koenig at amd.com>
>>>>>>>>
>>>>>>>> If we import a BO with an external reservation object we don't
>>>>>>>> reserve/unreserve it, so we never add it to the LRU, allowing a
>>>>>>>> possible denial of service.
>>>>>>>>
>>>>>>>> Signed-off-by: Christian König <christian.koenig at amd.com>
>>>>>>>> ---
>>>>>>>>  drivers/gpu/drm/ttm/ttm_bo.c | 8 +++++++-
>>>>>>>>  1 file changed, 7 insertions(+), 1 deletion(-)
>>>>>>>>
>>>>>>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
>>>>>>>> index 745e996..a98a5d5 100644
>>>>>>>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>>>>>>>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>>>>>>>> @@ -1170,9 +1170,15 @@ int ttm_bo_init(struct ttm_bo_device *bdev,
>>>>>>>>  	if (likely(!ret))
>>>>>>>>  		ret = ttm_bo_validate(bo, placement, interruptible, false);
>>>>>>>>
>>>>>>>> -	if (!resv)
>>>>>>>> +	if (!resv) {
>>>>>>>>  		ttm_bo_unreserve(bo);
>>>>>>>> +	} else if (!(bo->mem.placement & TTM_PL_FLAG_NO_EVICT)) {
>>>>>>>> +		spin_lock(&bo->glob->lru_lock);
>>>>>>>> +		ttm_bo_add_to_lru(bo);
>>>>>>>> +		spin_unlock(&bo->glob->lru_lock);
>>>>>>>> +	}
>>>>>>>> +
>>>>>>>>  	if (unlikely(ret))
>>>>>>>>  		ttm_bo_unref(&bo);
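For context on the hunk above: the resv != NULL branch is what an importer
hits when it builds a TTM BO on top of a foreign dma-buf and hands the
exporter's reservation object to ttm_bo_init(). A sketch of that call
path, assuming the ttm_bo_init() signature of this era (the wrapper
itself is hypothetical, and the caller is assumed to hold dma_buf->resv
locked):

#include <linux/dma-buf.h>
#include <drm/ttm/ttm_bo_api.h>
#include <drm/ttm/ttm_placement.h>

/*
 * Hypothetical import helper. Because the BO shares the exporter's
 * reservation object, ttm_bo_init() must not unreserve it (it never
 * reserved it), and without the fix above the BO would only reach the
 * LRU on its first command submission, or never if it is merely
 * imported.
 */
static int my_import_bo(struct ttm_bo_device *bdev,
			struct ttm_buffer_object *bo,
			struct dma_buf *dma_buf, struct sg_table *sg,
			struct ttm_placement *gtt_placement,
			size_t acc_size)
{
	/* Caller must hold dma_buf->resv locked across this call. */
	return ttm_bo_init(bdev, bo, dma_buf->size, ttm_bo_type_sg,
			   gtt_placement, 0 /* page alignment */,
			   true /* interruptible */,
			   NULL /* persistent swap storage */,
			   acc_size, sg,
			   dma_buf->resv,   /* shared with the exporter */
			   NULL /* default destroy */);
}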