On 11/29/2012 10:58 PM, Marek Olšák wrote:
> On Thu, Nov 29, 2012 at 9:33 PM, Thomas Hellstrom <thomas at shipmail.org>
> wrote:
>> On 11/29/2012 01:52 PM, Marek Olšák wrote:
>>> On Thu, Nov 29, 2012 at 9:04 AM, Thomas Hellstrom <thomas at shipmail.org>
>>> wrote:
>>>> On 11/29/2012 03:15 AM, Marek Olšák wrote:
>>>>> On Thu, Nov 29, 2012 at 12:44 AM, Alan Swanson <swanson at ukfsn.org>
>>>>> wrote:
>>>>>> On Wed, 2012-11-28 at 18:24 -0500, Jerome Glisse wrote:
>>>>>>> On Wed, Nov 28, 2012 at 6:18 PM, Thomas Hellstrom
>>>>>>> <thomas at shipmail.org> wrote:
>>>>>>>> On 11/28/2012 04:58 PM, j.glisse at gmail.com wrote:
>>>>>>>>> From: Jerome Glisse <jglisse at redhat.com>
>>>>>>>>>
>>>>>>>>> This patch adds a minimum residency time configurable for each memory
>>>>>>>>> pool (VRAM, GTT, ...). Intention is to avoid having a lot of memory
>>>>>>>>> eviction from VRAM up to a point where the GPU pretty much spends all
>>>>>>>>> its time moving things in and out.
>>>>>>>>
>>>>>>>> This patch seems odd to me.
>>>>>>>>
>>>>>>>> It seems the net effect is to refuse evictions from VRAM and make
>>>>>>>> buffers go somewhere else, and that makes things faster?
>>>>>>>>
>>>>>>>> Why don't they go there in the first place instead of trying to force
>>>>>>>> them into VRAM, when VRAM is full?
>>>>>>>>
>>>>>>>> /Thomas
>>>>>>> It's mostly a side effect of cs and validating with each cs, if boA is
>>>>>>> in cs1 and not in cs2 and boB is in cs1 but not in cs2 then boA could
>>>>>>> be evicted by cs2 and boB moved in, if next cs ie cs3 is like cs1 then
>>>>>>> boA moves back again and boB is evicted, then you get cs4 which
>>>>>>> references boB but not boA, boA gets evicted and boB moves in ... So ttm
>>>>>>> just spends its time doing eviction, but it is doing so because it's
>>>>>>> asked by the driver to do so. Note that what is costly there is not the
>>>>>>> bo move in itself but the page allocation.
>>>>>>>
>>>>>>> I propose this patch to put a boundary on bo eviction frequency, I
>>>>>>> thought it might help other drivers, if you set the residency time to 0
>>>>>>> you get the current behavior, if you don't you enforce a minimum
>>>>>>> residency time which helps drivers like radeon. Of course a proper fix
>>>>>>> to the bo eviction for radeon has to be in radeon code and is mostly
>>>>>>> an overhaul of how we validate bo.
>>>>>>>
>>>>>>> But I still believe that this patch has value in itself by allowing
>>>>>>> drivers to put a boundary on buffer movement frequency.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Jerome
>>>>>> So, a variation on John Carmack's recommendation from 2000 to use MRU,
>>>>>> not LRU, to avoid texture thrashing.
>>>>>>
>>>>>> Mar 07, 2000 - Virtualized video card local memory is The Right
>>>>>> Thing.
>>>>>> http://floodyberry.com/carmack/johnc_plan_2000.html
>>>>>>
>>>>>> In fact, this was last discussed in 2005 with a patch for a 1 second
>>>>>> stale texture eviction and I (still) wondered why such a method was never
>>>>>> implemented, since it was a clear problem.
>>>>> BTW we can send end-of-frame markers to the kernel, which could be
>>>>> used to implement Carmack's algorithm.
>>>>>
>>>>> Marek
>>>>
>>>> It seems to me like Carmack's algorithm is quite specific to the case
>>>> where only a single GL client is running?
>>> In theory, we could send context IDs to the kernel as well and modify
>>> the conditional to "If the LRU texture was not needed in the previous
>>> frame of any context".
>>>
>>>
>>>> It also seems like it's designed around the fact that when eviction takes
>>>> place, all buffer objects will be idle. With a reasonably filled
>>>> graphics fifo / ring, blindly using MRU will cause the GPU to run
>>>> synchronized.
>>> I don't see why you would need to synchronize. If the GPU takes care
>>> of moving buffers in and out of VRAM and there's only one ring buffer
>>> ==> no synchronization is required.
>> The LRU bo has a much higher probability of being idle than the MRU bo, and
>> waiting for it to become idle will in
>> principle synchronize the GPU and unnecessarily drain the ring.
> What I tried to point out was that the synchronization shouldn't be
> needed, because the CPU shouldn't do anything with the contents of
> evicted buffers. The GPU moves the buffers, not the CPU. What does the
> CPU do besides updating some kernel structures?
>
> Also, buffer deletion is something where you don't need to wait for
> the buffer to become idle if you know the memory area won't be
> mapped by the CPU, ever. The memory can be reclaimed right away. It
> would be the GPU to move new data in and once that happens, the old
> buffer will be trivially idle, because single-ring GPUs execute
> commands in order.

Yes, you're right. Sorry about that.

/Thomas
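
For reference, the LRU-versus-MRU point raised earlier in the thread can be
illustrated with a small simulation. This is only a sketch in Python, not TTM
code: the function name count_evictions, the pool size of 4 and the
five-buffer cyclic workload are all assumptions made for illustration, and it
models plain MRU rather than Carmack's conditional variant.

```python
from collections import OrderedDict


def count_evictions(accesses, capacity, policy):
    """Count evictions from a fixed-size pool under 'lru' or 'mru' replacement.

    Hypothetical helper for illustration only; it ignores buffer idleness,
    fences and move cost, which a real memory manager has to consider.
    """
    resident = OrderedDict()  # ordered from least to most recently used
    evictions = 0
    for buf in accesses:
        if buf in resident:
            resident.move_to_end(buf)  # mark as most recently used
            continue
        if len(resident) >= capacity:
            # last=False drops the least recently used entry,
            # last=True drops the most recently used one.
            resident.popitem(last=(policy == "mru"))
            evictions += 1
        resident[buf] = True
    return evictions


# Assumed workload: every frame touches the same 5 buffers in the same
# order, but the pool only has room for 4 of them.
frames = 100
accesses = list(range(5)) * frames

lru = count_evictions(accesses, 4, "lru")
mru = count_evictions(accesses, 4, "mru")
print("LRU evictions:", lru)
print("MRU evictions:", mru)
```

With this workload LRU evicts on every access once the pool is full, while
MRU evicts about once every four accesses. The model says nothing about
whether the evicted buffer is idle, which is the concern discussed above.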