On Wed, 2012-04-18 at 21:17 +1000, Benjamin Herrenschmidt wrote:
> On Wed, 2012-04-18 at 12:34 +0200, Michel Dänzer wrote:
> > On Wed, 2012-04-18 at 20:20 +1000, Benjamin Herrenschmidt wrote:
> > > On Wed, 2012-04-18 at 10:02 +0200, Michel Dänzer wrote:
> > > >
> > > > > GPU lockup appears to be a common problem with the radeon driver.
> > > >
> > > > It's what happens when anything goes wrong with the GPU. If it doesn't
> > > > happen with agpmode=-1, it's probably an AGP related coherency issue.
> > >
> > > I had some success hacking the DRM to do an in_le32 from the ring head
> > > after writing it. Just a gross hack but it seemed to help on a G5.
> >
> > AFAICT radeon_ring_commit() does that already:
> >
> >         DRM_MEMORYBARRIER();
> >         WREG32(ring->wptr_reg, (ring->wptr << ring->ptr_reg_shift) &
> >                ring->ptr_reg_mask);
> >         (void)RREG32(ring->wptr_reg);
> >
> > We added the readback about a decade ago. :)
>
> Hrm, I have a different hack in that old tree I was playing with a while
> back, let me see...
>
> --- a/drivers/gpu/drm/radeon/radeon_cp.c
> +++ b/drivers/gpu/drm/radeon/radeon_cp.c

Note that radeon_cp.c is UMS code, for KMS you need to look at
radeon_ring.c.

> @@ -2245,6 +2245,9 @@ void radeon_commit_ring(drm_radeon_private_t
> *dev_priv)
>         DRM_MEMORYBARRIER();
>         GET_RING_HEAD( dev_priv );
>
> +#ifdef CONFIG_PPC
> +       in_be32(dev_priv->ring.start);
> +#endif
>         if ((dev_priv->flags & RADEON_FAMILY_MASK) >= CHIP_R600) {
>
>
> I think that my rationale was to ensure that all previous stores to
> AGP (indirect buffers etc...) were pushed out & ordered vs the ring
> wptr update or something like that, because I think those paths aren't
> well ordered in HW. In fact I suspect we might even need a bigger hammer
> than that in_be32().

Probably wouldn't hurt trying something like that in the KMS code as
well.

> Another hack I had around was removing the SBA reset from agp-uninorth
> completely on binding new pages, it seemed to cause hangs.

You mean like commit 5613beb46d54da6ef7f1c4589e9f2e60eeb10721? :)

> > > I suspect there's a fundamental design issue with the Apple bridge in
> > > that the CPU to memory path isn't coherent at all with the GPU to
> > > memory path, i.e. even vs. cache flush instructions (i.e. buffers in
> > > the memory controllers can still be out of sync).
> > >
> > > Darwin does some gross hacks to work around that, some of them visible
> > > in the AGP drivers, some buried in the Apple driver, I don't know for
> > > sure. It's possible that they end up mapping all AGP memory as cache
> > > inhibited, but we can't do that because of our linear mapping.
> >
> > We are doing that though...
>
> Are we really ? I thought we were taking existing cacheable RAM objects
> and mapping them into the AGP gart.

No, the radeon driver has always mapped memory uncacheable to the CPU
while it's bound into the AGP GART.

> Are we replacing both kernel & user mappings for those objects with an
> equivalent cache inhibited mapping ?
>
> I'm not -that- familiar with how ttm works here.

I'm hardly more familiar with how it all works than you. :)

> In any case it can cause bus checkstops because the same pages can be
> prefetched into the cache via the linear mapping which is covered by
> BATs

So you've been saying for about a decade. :) But I've never seen any
problems tracked down to that.

> (unless you make your graphic objects HIGHMEM only but good luck with
> that :-)

FWIW I think TTM indeed prefers highmem pages for GPU access. The radeon
driver normally doesn't need kernel mappings for them.

-- 
Earthling Michel Dänzer           |                   http://www.amd.com
Libre software enthusiast         |          Debian, X and DRI developer
_______________________________________________
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev