On 13.05.2014 21:50, Marek Olšák wrote:
> Hi Christian,
>
> The performance regression I saw with piglit seems to be fixed with
> latest kernel git. It's difficult to bisect the kernel, because there
> are only merges between 3.14 and 3.15 and the merged commits are
> actually based on 3.14-rc1 and 3.14-rc4.
>
> All seems to be fine with your fixes.
>
Which fixes have you applied? There are quite a few pending patches on
dri-devel that aren't yet part of drm-fixes-3.15.

Grigori

> Marek
>
> On Tue, May 13, 2014 at 5:31 PM, Christian König
> <deathsimple at vodafone.de> wrote:
>> Is the performance regression caused by the page table changes or
>> something else?
>>
>> I did some tests with xonotic while developing it and it didn't show
>> anything obvious, but I didn't test on different systems.
>>
>> Christian.
>>
>> On 13.05.2014 17:19, Marek Olšák wrote:
>>
>>> Your latest patches fix the regression.
>>>
>>> The performance regression can also be reproduced with piglit "-t
>>> texelFetch.fs".
>>>
>>> Kernel 3.14:
>>> real 0m17.724s
>>> user 0m41.905s
>>> sys 0m11.299s
>>>
>>> The problematic commit checked out + your fixes (without the PTE patch I
>>> think):
>>> real 0m23.474s
>>> user 1m1.008s
>>> sys 0m13.812s
>>>
>>> Marek
>>>
>>>
>>> On Tue, May 13, 2014 at 3:57 PM, Christian König
>>> <deathsimple at vodafone.de> wrote:
>>>>
>>>> On 13.05.2014 15:22, Alex Deucher wrote:
>>>>
>>>>> On Mon, May 12, 2014 at 7:38 PM, Grigori Goronzy <greg at chown.ath.cx>
>>>>> wrote:
>>>>>>
>>>>>> I can confirm this fixes it for me, too.
>>>>>>
>>>>>> 3.15 with these fixes and the large PTE patches actually ends up being
>>>>>> noticeably slower than earlier kernels with Xonotic, though. I wonder
>>>>>> what's going on.
>>>>>
>>>>> Allocation overhead?
>>>>
>>>>
>>>> Unlikely, Xonotic just allocates a single page table at start, which then
>>>> gets extended to a certain rate until they no longer need more address
>>>> space and are done with it.
>>>>
>>>> Grigori, can you bisect and/or try to figure out what's wrong here?
>>>>
>>>> Christian.
>>>>
>>>>
>>>>>
>>>>>> Grigori
>>>>>>
>>>>>>
>>>>>> On 12.05.2014 14:50, Christian König wrote:
>>>>>>>
>>>>>>> I could reproduce the problem with xonotic and I think I've found the
>>>>>>> issue.
>>>>>>>
>>>>>>> Please test the attached patch.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Christian.
>>>>>>>
>>>>>>> On 11.05.2014 11:06, Christian König wrote:
>>>>>>>>>
>>>>>>>>> I have tested it and it doesn't fix the hangs.
>>>>>>>>
>>>>>>>> Yeah, thought so. Well it was just a guess.
>>>>>>>>
>>>>>>>>> (Also, I don't like the patch, because it reverts the behavior I
>>>>>>>>> added for userspace buffers.)
>>>>>>>>
>>>>>>>> Actually it shouldn't affect that. The alternative domain always
>>>>>>>> contains GART even when userspace only specified VRAM as placement
>>>>>>>> (as long as it is technically possible to do so).
>>>>>>>>
>>>>>>>> So what should happen is that TTM sees the current placement, matches
>>>>>>>> that with the desired placement and should find that it doesn't need
>>>>>>>> to move the buffer (we should just test if this behavior really works
>>>>>>>> as expected).
>>>>>>>>
>>>>>>>> Christian.
>>>>>>>>
>>>>>>>> On 10.05.2014 23:38, Marek Olšák wrote:
>>>>>>>>>
>>>>>>>>> Hi Christian,
>>>>>>>>>
>>>>>>>>> I have tested it and it doesn't fix the hangs.
>>>>>>>>>
>>>>>>>>> (Also, I don't like the patch, because it reverts the behavior I
>>>>>>>>> added for userspace buffers.)
>>>>>>>>>
>>>>>>>>> Marek
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Sat, May 10, 2014 at 6:34 PM, Christian König
>>>>>>>>> <deathsimple at vodafone.de> wrote:
>>>>>>>>>>
>>>>>>>>>> Couldn't reproduce the issue so far. So the attached patch is just
>>>>>>>>>> a complete shot in the dark found by rereading the code, but it
>>>>>>>>>> might actually be the problem.
>>>>>>>>>>
>>>>>>>>>> Please give it a try.
>>>>>>>>>>
>>>>>>>>>> Going to keep testing in the meantime,
>>>>>>>>>> Christian.
>>>>>>>>>>
>>>>>>>>>> On 10.05.2014 10:23, Christian König wrote:
>>>>>>>>>>
>>>>>>>>>>>> I see hangs with kernel 3.15 and SI under memory pressure, e.g.
>>>>>>>>>>>> if I boot with radeon.vramlimit=256 and then run Xonotic
>>>>>>>>>>>> timedemo with high settings.
>>>>>>>>>>>> I haven't had a chance to bisect it yet, but it might be a
>>>>>>>>>>>> similar problem.
>>>>>>>>>>>
>>>>>>>>>>> Sounds like the same issue to me. Thx for the good test case.
>>>>>>>>>>>
>>>>>>>>>>>> Any idea what is wrong with it?
>>>>>>>>>>>
>>>>>>>>>>> Actually I already wondered that it went so smoothly without any
>>>>>>>>>>> regression so far, didn't notice the bug in bugzilla.kernel.org
>>>>>>>>>>> yet.
>>>>>>>>>>>
>>>>>>>>>>>> Some of the tests allocate a lot of MSAA textures and the tests
>>>>>>>>>>>> also run in parallel, which creates a lot of memory pressure and
>>>>>>>>>>>> probably causes buffer evictions.
>>>>>>>>>>>
>>>>>>>>>>> Sounds like the underlying problem to me. We probably evict some
>>>>>>>>>>> part of a page table without updating the page directory. Going
>>>>>>>>>>> to dig into it today, it's probably just a one liner missing
>>>>>>>>>>> somewhere in the VM code.
>>>>>>>>>>>
>>>>>>>>>>> Christian.
>>>>>>>>>>>
>>>>>>>>>>> On 09.05.2014 23:39, Grigori Goronzy wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> On 09.05.2014 20:03, Marek Olšák wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> This commit which first appeared in 3.15-rc1 causes hangs on
>>>>>>>>>>>>> Bonaire:
>>>>>>>>>>>>> [...]
>>>>>>>>>>>>>
>>>>>>>>>>>>> The simplest way to reproduce the hangs is to run piglit with
>>>>>>>>>>>>> these parameters:
>>>>>>>>>>>>> -t texelFetch.fs
>>>>>>>>>>>>>
>>>>>>>>>>>>> Some of the tests allocate a lot of MSAA textures and the tests
>>>>>>>>>>>>> also run in parallel, which creates a lot of memory pressure
>>>>>>>>>>>>> and probably causes buffer evictions.
>>>>>>>>>>>>>
>>>>>>>>>>>> I see hangs with kernel 3.15 and SI under memory pressure, e.g.
>>>>>>>>>>>> if I boot with radeon.vramlimit=256 and then run Xonotic
>>>>>>>>>>>> timedemo with high settings.
>>>>>>>>>>>> I haven't had a chance to bisect it yet, but it might be a
>>>>>>>>>>>> similar problem.
>>>>>>>>>>>>
>>>>>>>>>>>> Grigori
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>> _______________________________________________
>>>>>> dri-devel mailing list
>>>>>> dri-devel at lists.freedesktop.org
>>>>>> http://lists.freedesktop.org/mailman/listinfo/dri-devel
>>>>
>>>>
>>
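
For anyone comparing the two piglit timings quoted in this thread (kernel 3.14 versus the problematic commit plus fixes), here is a minimal sketch that turns the shell `time` figures into seconds and a relative slowdown. The helper names `parse_time` and `slowdown_percent` are made up for illustration and are not part of piglit or any kernel tooling; the numbers are copied from the quoted runs, and a single run per kernel says nothing about run-to-run noise.

```python
import re


def parse_time(value: str) -> float:
    """Convert a shell `time` duration such as '1m1.008s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", value.strip())
    if match is None:
        raise ValueError(f"unrecognised duration: {value!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


def slowdown_percent(before: str, after: str) -> float:
    """Relative change from `before` to `after`, in percent."""
    b = parse_time(before)
    a = parse_time(after)
    return (a - b) / b * 100.0


# Figures copied from the quoted piglit "-t texelFetch.fs" runs.
baseline = {"real": "0m17.724s", "user": "0m41.905s", "sys": "0m11.299s"}
patched = {"real": "0m23.474s", "user": "1m1.008s", "sys": "0m13.812s"}

slowdowns = {key: slowdown_percent(baseline[key], patched[key]) for key in baseline}

for key, pct in slowdowns.items():
    before_s = parse_time(baseline[key])
    after_s = parse_time(patched[key])
    print(f"{key}: {before_s:.3f}s -> {after_s:.3f}s ({pct:+.1f}%)")
```

Looking at `user` and `sys` separately from `real` gives a rough hint of whether the extra time is spent in userspace or in the kernel, though the tests run in parallel, so the three figures are not directly additive.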