On 30.05.2014 13:46, Grigori Goronzy wrote:
> On 30.05.2014 13:30, Marek Olšák wrote:
>> Grigori,
>>
>> you can git-checkout the commit before and after the memory management
>> changes, compile both and test them.
>>
>
> I was trying to revert the changes, but it looks like too much changed
> in the meantime. The suitable commits to check out should be 0bc490a8
> (before) and 19dff56a (after), right?
>

It turns out those changes weren't the problem. The culprit is instead the
page table rework (commit 6d2f2944), which also seems to cause a number of
other issues. The latest drm-fixes code doesn't change this either.

According to my (not very scientific) testing with radeontop and the "time"
utility, this appears to be a CPU overhead problem. The "sys" time that
"time" reports for a Xonotic benchmark run is more than 3x as long after the
regression, and radeontop reports roughly 10% lower GPU load on average.

Best regards
Grigori

> Best regards
> Grigori
>
>> Marek
>>
>> On Fri, May 30, 2014 at 2:30 AM, Grigori Goronzy <greg at chown.ath.cx>
>> wrote:
>>> On 13.05.2014 22:27, Marek Olšák wrote:
>>>>
>>>> I applied these two patches Christian sent to dri-devel:
>>>>
>>>> drm/radeon: fix page directory update size estimation
>>>> drm/radeon: fix buffer placement under memory pressure v2
>>>>
>>>> on top of torvalds's master branch.
>>>>
>>>
>>> With latest kernel master (a991639c) I still see a regression,
>>> compared to 3.13 or 3.14, which have similar performance. Xonotic is
>>> about 7% slower. OpenArena and Unigine Tropics are also noticeably
>>> slower, but I didn't record accurate numbers.
>>>
>>> Maybe the improved memory management has some overhead, but this is not
>>> acceptable IMHO. I'll try to investigate further.
>>>
>>> Best regards
>>>
>>> Grigori
>>>
>>>> Marek
>>>>
>>>> On Tue, May 13, 2014 at 10:19 PM, Grigori Goronzy <greg at chown.ath.cx>
>>>> wrote:
>>>>>
>>>>> On 13.05.2014 21:50, Marek Olšák wrote:
>>>>>>
>>>>>> Hi Christian,
>>>>>>
>>>>>> The performance regression I saw with piglit seems to be fixed with
>>>>>> latest kernel git. It's difficult to bisect the kernel, because there
>>>>>> are only merges between 3.14 and 3.15 and the merged commits are
>>>>>> actually based on 3.14-rc1 and 3.14-rc4.
>>>>>>
>>>>>> All seems to be fine with your fixes.
>>>>>>
>>>>>
>>>>> Which fixes have you applied?
>>>>> There are quite a few pending patches on
>>>>> dri-devel, that aren't yet part of drm-fixes-3.15.
>>>>>
>>>>> Grigori
>>>>>
>>>>>> Marek
>>>>>>
>>>>>> On Tue, May 13, 2014 at 5:31 PM, Christian König
>>>>>> <deathsimple at vodafone.de> wrote:
>>>>>>>
>>>>>>> Is the performance regression caused by the page table changes or
>>>>>>> something else?
>>>>>>>
>>>>>>> I did run some tests with xonotic while developing it and they
>>>>>>> didn't show anything obvious, but I didn't test on different
>>>>>>> systems.
>>>>>>>
>>>>>>> Christian.
>>>>>>>
>>>>>>> On 13.05.2014 17:19, Marek Olšák wrote:
>>>>>>>
>>>>>>>> Your latest patches fix the regression.
>>>>>>>>
>>>>>>>> The performance regression can also be reproduced with piglit "-t
>>>>>>>> texelFetch.fs".
>>>>>>>>
>>>>>>>> Kernel 3.14:
>>>>>>>> real 0m17.724s
>>>>>>>> user 0m41.905s
>>>>>>>> sys 0m11.299s
>>>>>>>>
>>>>>>>> The problematic commit checked out + your fixes (without the PTE
>>>>>>>> patch I think):
>>>>>>>> real 0m23.474s
>>>>>>>> user 1m1.008s
>>>>>>>> sys 0m13.812s
>>>>>>>>
>>>>>>>> Marek
>>>>>>>>
>>>>>>>> On Tue, May 13, 2014 at 3:57 PM, Christian König
>>>>>>>> <deathsimple at vodafone.de> wrote:
>>>>>>>>>
>>>>>>>>> On 13.05.2014 15:22, Alex Deucher wrote:
>>>>>>>>>
>>>>>>>>>> On Mon, May 12, 2014 at 7:38 PM, Grigori Goronzy
>>>>>>>>>> <greg at chown.ath.cx> wrote:
>>>>>>>>>>>
>>>>>>>>>>> I can confirm this fixes it for me, too.
>>>>>>>>>>>
>>>>>>>>>>> 3.15 with these fixes and the large PTE patches actually ends up
>>>>>>>>>>> being noticeably slower than earlier kernels with Xonotic,
>>>>>>>>>>> though. I wonder what's going on.
>>>>>>>>>>
>>>>>>>>>> Allocation overhead?
>>>>>>>>>
>>>>>>>>> Unlikely, Xonotic just allocates a single page table at start,
>>>>>>>>> which then gets extended to a certain rate until they no longer
>>>>>>>>> need more address space and are done with it.
>>>>>>>>>
>>>>>>>>> Grigori, can you bisect and/or try to figure out what's wrong
>>>>>>>>> here?
>>>>>>>>>
>>>>>>>>> Christian.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> Grigori
>>>>>>>>>>>
>>>>>>>>>>> On 12.05.2014 14:50, Christian König wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> I could reproduce the problem with xonotic and I think I've
>>>>>>>>>>>> found the issue.
>>>>>>>>>>>>
>>>>>>>>>>>> Please test the attached patch.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Christian.
>>>>>>>>>>>>
>>>>>>>>>>>> On 11.05.2014 11:06, Christian König wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I have tested it and it doesn't fix the hangs.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Yeah, thought so. Well it was just a guess.
>>>>>>>>>>>>>
>>>>>>>>>>>>>> (Also, I don't like the patch, because it reverts the
>>>>>>>>>>>>>> behavior I added for userspace buffers.)
>>>>>>>>>>>>>
>>>>>>>>>>>>> Actually it shouldn't affect that. The alternative domain
>>>>>>>>>>>>> always contains GART even when userspace only specified VRAM
>>>>>>>>>>>>> as placement (as long as it is technically possible to do so).
>>>>>>>>>>>>>
>>>>>>>>>>>>> So what should happen is that TTM sees the current placement,
>>>>>>>>>>>>> matches that with the desired placement and should find that
>>>>>>>>>>>>> it doesn't need to move the buffer (we should just test if
>>>>>>>>>>>>> this behavior really works as expected).
>>>>>>>>>>>>>
>>>>>>>>>>>>> Christian.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 10.05.2014 23:38, Marek Olšák wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Christian,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I have tested it and it doesn't fix the hangs.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> (Also, I don't like the patch, because it reverts the
>>>>>>>>>>>>>> behavior I added for userspace buffers.)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Marek
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Sat, May 10, 2014 at 6:34 PM, Christian König
>>>>>>>>>>>>>> <deathsimple at vodafone.de> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Couldn't reproduce the issue so far. So the attached patch
>>>>>>>>>>>>>>> is just a complete shot in the dark found by rereading the
>>>>>>>>>>>>>>> code, but it might actually be the problem.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Please give it a try.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Going to keep testing in the meantime,
>>>>>>>>>>>>>>> Christian.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 10.05.2014 10:23, Christian König wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I see hangs with kernel 3.15 and SI under memory pressure,
>>>>>>>>>>>>>>>>> e.g. if I boot with radeon.vramlimit=256 and then run
>>>>>>>>>>>>>>>>> Xonotic timedemo with high settings.
>>>>>>>>>>>>>>>>> I haven't had a chance to bisect it yet, but it might be a
>>>>>>>>>>>>>>>>> similar problem.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Sounds like the same issue to me. Thx for the good test
>>>>>>>>>>>>>>>> case.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Any idea what is wrong with it?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Actually I was already surprised that it went so smoothly
>>>>>>>>>>>>>>>> without any regression so far; I hadn't noticed the bug in
>>>>>>>>>>>>>>>> bugzilla.kernel.org yet.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Some of the tests allocate a lot of MSAA textures and the
>>>>>>>>>>>>>>>>> tests also run in parallel, which creates a lot of memory
>>>>>>>>>>>>>>>>> pressure and probably causes buffer evictions.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Sounds like the underlying problem to me. We probably evict
>>>>>>>>>>>>>>>> some part of a page table without updating the page
>>>>>>>>>>>>>>>> directory. Going to dig into it today, it's probably just a
>>>>>>>>>>>>>>>> one liner missing somewhere in the VM code.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Christian.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 09.05.2014 23:39, Grigori Goronzy wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 09.05.2014 20:03, Marek Olšák wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> This commit which first appeared in 3.15-rc1 causes hangs
>>>>>>>>>>>>>>>>>> on Bonaire:
>>>>>>>>>>>>>>>>>> [...]
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> The simplest way to reproduce the hangs is to run piglit
>>>>>>>>>>>>>>>>>> with these parameters:
>>>>>>>>>>>>>>>>>> -t texelFetch.fs
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Some of the tests allocate a lot of MSAA textures and the
>>>>>>>>>>>>>>>>>> tests also run in parallel, which creates a lot of memory
>>>>>>>>>>>>>>>>>> pressure and probably causes buffer evictions.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I see hangs with kernel 3.15 and SI under memory pressure,
>>>>>>>>>>>>>>>>> e.g. if I boot with radeon.vramlimit=256 and then run
>>>>>>>>>>>>>>>>> Xonotic timedemo with high settings.
>>>>>>>>>>>>>>>>> I haven't had a chance to bisect it yet, but it might be a
>>>>>>>>>>>>>>>>> similar problem.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Grigori
>>>>>>>>>>>>>>>>
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> dri-devel mailing list
>>>>>>>>>>> dri-devel at lists.freedesktop.org
>>>>>>>>>>> http://lists.freedesktop.org/mailman/listinfo/dri-devel
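
For anyone who wants to repeat the user/sys comparison described at the top
of this message, here is a minimal Python sketch that runs a command and
reports how much CPU time it spent in user and kernel mode, similar to what
the "time" utility prints. The helper names measure_cpu_times and sys_ratio
are hypothetical, the workload command is only a placeholder for the real
benchmark invocation, and the resource module is assumed to be available
(it exists only on Unix-like systems).

```python
import resource
import subprocess
import sys


def measure_cpu_times(cmd):
    """Run cmd to completion and return (user_seconds, sys_seconds).

    Hypothetical helper: the values are the growth of the CPU time
    accounted to waited-for child processes while cmd was running.
    """
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    subprocess.run(cmd, check=True)
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    user_s = after.ru_utime - before.ru_utime
    sys_s = after.ru_stime - before.ru_stime
    return user_s, sys_s


def sys_ratio(baseline_sys, regressed_sys):
    """Return how many times more kernel time the regressed run used."""
    return regressed_sys / baseline_sys


if __name__ == "__main__":
    # Placeholder workload; substitute the actual benchmark command line.
    user_s, sys_s = measure_cpu_times([sys.executable, "-c", "pass"])
    print(f"user {user_s:.3f}s  sys {sys_s:.3f}s")
```

Running this once per kernel and passing the two "sys" values to sys_ratio
gives the kind of factor mentioned above. With the piglit numbers quoted in
this thread (11.299 s and 13.812 s), the ratio comes out at roughly 1.22.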