On 13.06.2014 23:31, Alex Deucher wrote:
> On Fri, Jun 13, 2014 at 11:45 AM, Christian König
> <deathsimple at vodafone.de> wrote:
>> Hi Marek,
>>
>> ah, yes! Piglit in combination with that patch can indeed crash the box.
>>
>> Going to investigate now that I can reproduce it.
> I wonder if it's a clockgating issue with the MC or BIF?  You might
> try adjusting the rdev->cg_flags (try setting it to 0) in
> radeon_asic.c or disabling dpm.

Unfortunately that was just a false alarm. I was on a branch that didn't
have the "stop poisoning the GART TLB" patch; after applying that patch I
can again let piglit run for the whole night without a lockup.

No idea what goes wrong when Marek runs piglit, but 3.15.0 + "stop
poisoning the GART TLB" + "force_gtt" is rock solid here.

Christian.

>
> Alex
>
>> Thanks,
>> Christian.
>>
>> On 13.06.2014 15:19, Marek Olšák wrote:
>>
>>> Hi,
>>>
>>> With my "force_gtt" patch, Cape Verde is unstable too, so all GCN
>>> chips are affected.
>>>
>>> I recommend applying that patch, because it will reproduce the problem
>>> faster. Without it, the hangs are very rare and it may take a while
>>> before they occur.
>>>
>>> Marek
>>>
>>> On Thu, Jun 12, 2014 at 1:23 PM, Christian König
>>> <deathsimple at vodafone.de> wrote:
>>>> Please do so, and you might want to try 3.15.0 as well.
>>>>
>>>> I've tested multiple piglit runs over night with my Bonaire and 3.15.0
>>>> and that seemed to work perfectly fine.
>>>>
>>>> Going to test Alex drm-next-3.16 a bit more as well.
>>>>
>>>> Christian.
>>>>
>>>> On 11.06.2014 12:56, Marek Olšák wrote:
>>>>
>>>>> I only tested Bonaire. I can test Cape Verde if needed.
>>>>>
>>>>> Marek
>>>>>
>>>>> On Wed, Jun 11, 2014 at 11:29 AM, Christian König
>>>>> <deathsimple at vodafone.de> wrote:
>>>>>> Crap, I already wanted to check back with you if that really fixes your
>>>>>> problems.
>>>>>>
>>>>>> Thanks for the info, this crash also only happens on CIK doesn't it?
>>>>>>
>>>>>> Christian.
>>>>>>
>>>>>> On 11.06.2014 01:30, Marek Olšák wrote:
>>>>>>
>>>>>>> Sorry to tell you the bad news. This patch doesn't fix the hangs on my
>>>>>>> machine.
>>>>>>>
>>>>>>> I tested drm-next-3.16 from Alex's tree. I also switched copying from
>>>>>>> SDMA to CP DMA, which hung too.
>>>>>>>
>>>>>>> I also tried this:
>>>>>>>
>>>>>>> git checkout (the problematic commit):
>>>>>>> 6d2f294 - drm/radeon: use normal BOs for the page tables v4
>>>>>>>
>>>>>>> git cherry-pick (fixes):
>>>>>>> 0e97703c - drm/radeon: add define for flags used in R600+ GTT
>>>>>>> 0986c1a5 - drm/radeon: stop poisoning the GART TLB
>>>>>>> 4906f689 - drm/radeon: fix page directory update size estimation
>>>>>>> 4b095566 - drm/radeon: fix buffer placement under memory pressure v2
>>>>>>>
>>>>>>> Then I tested both SDMA and CP DMA copying. Both were unstable.
>>>>>>>
>>>>>>> Testing was done with piglit / quick.tests.
>>>>>>>
>>>>>>> Marek
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Jun 4, 2014 at 3:29 PM, Christian König
>>>>>>> <deathsimple at vodafone.de> wrote:
>>>>>>>> From: Christian König <christian.koenig at amd.com>
>>>>>>>>
>>>>>>>> When we set the valid bit on invalid GART entries they are
>>>>>>>> loaded into the TLB when an adjacent entry is loaded. This
>>>>>>>> poisons the TLB with invalid entries which are sometimes
>>>>>>>> not correctly removed on TLB flush.
>>>>>>>>
>>>>>>>> For stable inclusion the patch probably needs to be modified a bit.
>>>>>>>>
>>>>>>>> Signed-off-by: Christian König <christian.koenig at amd.com>
>>>>>>>> Cc: stable at vger.kernel.org
>>>>>>>> ---
>>>>>>>>  drivers/gpu/drm/radeon/rs600.c | 5 ++++-
>>>>>>>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>>>>>>>
>>>>>>>> diff --git a/drivers/gpu/drm/radeon/rs600.c b/drivers/gpu/drm/radeon/rs600.c
>>>>>>>> index 0a8be63..e0465b2 100644
>>>>>>>> --- a/drivers/gpu/drm/radeon/rs600.c
>>>>>>>> +++ b/drivers/gpu/drm/radeon/rs600.c
>>>>>>>> @@ -634,7 +634,10 @@ int rs600_gart_set_page(struct radeon_device *rdev, int i, uint64_t addr)
>>>>>>>>                  return -EINVAL;
>>>>>>>>          }
>>>>>>>>          addr = addr & 0xFFFFFFFFFFFFF000ULL;
>>>>>>>> -        addr |= R600_PTE_GART;
>>>>>>>> +        if (addr == rdev->dummy_page.addr)
>>>>>>>> +                addr |= R600_PTE_SYSTEM | R600_PTE_SNOOPED;
>>>>>>>> +        else
>>>>>>>> +                addr |= R600_PTE_GART;
>>>>>>>>          writeq(addr, ptr + (i * 8));
>>>>>>>>          return 0;
>>>>>>>>  }
>>>>>>>> --
>>>>>>>> 1.9.1
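
For readers following the thread, here is a minimal standalone sketch of the flag selection the quoted patch performs, so the effect on the valid bit can be checked outside the kernel. It is not the driver code: `make_gart_pte` and the `PTE_*` names are hypothetical, and the bit positions are assumptions chosen for illustration (check the `R600_PTE_*` definitions in your own tree). The only point is that an entry pointing at the dummy page is written without the valid bit, while every other entry gets the full GART flag set.

```c
#include <stdint.h>

/* Assumed, illustrative bit positions; NOT taken from this thread. */
#define PTE_VALID     (1ULL << 0)
#define PTE_SYSTEM    (1ULL << 1)
#define PTE_SNOOPED   (1ULL << 2)
#define PTE_READABLE  (1ULL << 5)
#define PTE_WRITEABLE (1ULL << 6)

/* Full flag set for a real GART mapping (stands in for R600_PTE_GART). */
#define PTE_GART (PTE_VALID | PTE_SYSTEM | PTE_SNOOPED | \
                  PTE_READABLE | PTE_WRITEABLE)

/*
 * Hypothetical helper mirroring the patched logic: mask the address to a
 * 4 KiB page boundary, then leave the valid bit clear when the entry
 * points at the dummy page, so an unused slot is never marked valid.
 */
static uint64_t make_gart_pte(uint64_t addr, uint64_t dummy_page_addr)
{
    addr &= 0xFFFFFFFFFFFFF000ULL;
    if (addr == dummy_page_addr)
        addr |= PTE_SYSTEM | PTE_SNOOPED;   /* no PTE_VALID */
    else
        addr |= PTE_GART;                   /* includes PTE_VALID */
    return addr;
}
```

Calling `make_gart_pte(page, dummy)` with `page == dummy` yields an entry whose low bit is clear; any other page-aligned address yields an entry with all five flags set.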