On 08/18/2016 09:21 PM, Marek Olšák wrote:
> On Thu, Aug 18, 2016 at 4:23 AM, Michel Dänzer <michel at daenzer.net> wrote:
>> Maybe the rasterization as two triangles results in bad PCIe bandwidth
>> utilization. Using the asynchronous DMA engine for these transfers would
>> probably be ideal, but having the 3D engine rasterize a single rectangle
>> (either using the rectangle primitive or a large triangle with scissor)
>> might already help.
>
> There is only one thing that's bad for PCIe when the surface is
> linear: the 3D engine. Disabling all but the first shader engine and
> all but the first 2 RBs should improve performance for blits from VRAM
> to GTT. The closed driver does that, but I don't remember if the
> destination must be linear, must be in GTT, or both. In any case, SDMA
> should still be the best for VRAM->GTT blits.
>
> Marek
>
Friday evening education question:

So if you have multiple render backends active, they compete for PCIe
bus access, and some kind of "thrashing" happens in the arbitration,
drastically reducing the bandwidth?

thanks,
-mario