On Tue, Nov 24, 2015 at 11:07:54PM -0800, Kenneth Graunke wrote:
> On Tuesday, November 24, 2015 05:17:29 PM Matt Turner wrote:
> > It's called by the inline intel_batchbuffer_begin() function which
> > itself is used in BEGIN_BATCH. So in sequence of code emitting multiple
> > packets, we have inlined this ~200 byte function multiple times. Making
> > it an out-of-line function presumably improved icache usage.
> >
> > Improves performance of Gl32Batch7 by 3.39898% +/- 0.358674% (n=155) on
> > Ivybridge.
>
> That's kind of sad. When I added the render ring prelude code to this
> function, Eric was concerned about overhead like this. I do wonder
> whether we'd be better off doing explicit ring switching, like I did
> on the 'ringswitch' branch of my tree. That kills a bunch of the code
> on every BEGIN_BATCH().

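For anyone following along, here is a rough sketch of the inline versus out-of-line split discussed above. It is not the actual i965 code: every name in it (struct batch, batch_require_space, batch_begin, batch_emit, BATCH_DWORDS) is a hypothetical stand-in, and the "flush" is only a counter. The point is just that the space/ring check lives in one out-of-line function, so each inlined begin call site costs a single call instead of a copy of the whole body.

```c
#include <assert.h>
#include <stdint.h>

#define BATCH_DWORDS 64          /* hypothetical batch size, in dwords */

enum ring { RING_UNKNOWN, RING_RENDER, RING_BLT };

struct batch {
    uint32_t map[BATCH_DWORDS];  /* command stream being built */
    unsigned used;               /* dwords written so far */
    enum ring ring;              /* ring the current batch targets */
    unsigned flushes;            /* stand-in for real submissions */
};

#if defined(__GNUC__)
#define NOINLINE __attribute__((noinline))
#else
#define NOINLINE
#endif

/* Out of line: the ring-switch and space check exists once in the binary,
 * rather than being duplicated at every packet-emitting call site. */
static NOINLINE void
batch_require_space(struct batch *b, unsigned n, enum ring ring)
{
    if (b->ring != ring || b->used + n > BATCH_DWORDS) {
        if (b->used)
            b->flushes++;        /* pretend to submit the old batch */
        b->used = 0;
        b->ring = ring;
    }
}

/* Inline wrapper, analogous in spirit to what a BEGIN_BATCH-style macro
 * would expand to: only the call itself is inlined. */
static inline void
batch_begin(struct batch *b, unsigned n, enum ring ring)
{
    batch_require_space(b, n, ring);
}

static inline void
batch_emit(struct batch *b, uint32_t dw)
{
    assert(b->used < BATCH_DWORDS);
    b->map[b->used++] = dw;
}
```

With explicit ring switching, the ring comparison would presumably move out of this per-packet path altogether, leaving only the space check.
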
Also note that I sent a bunch of patches earlier to remove the extra code
from BEGIN_BATCH and a ton of unnecessary work in relocation and batch
construction. In total that's about a 20% improvement (more on byt/bsw,
where we couldn't emit batches fast enough to keep the GPU busy) on *Batch7.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre