On Wed, May 10, 2017 at 6:45 PM, Marek Olšák <mar...@gmail.com> wrote:
> Hi,
>
> This series adds an optional module into gallium/util that wraps
> around pipe_context and moves execution of all pipe_context calls into
> a separate thread.
>
> It puts a lot of new requirements on the driver, especially on
> thread-safety of pipe_context functions, and even expects different
> behavior from pipe_context in some cases, so it may be non-trivial
> to enable. All of it is necessary to have a perfectly scalable
> threaded execution. (Any new drivers should be built around it from
> the beginning)
>
> The performance improvement isn't very high (it's just hiding overhead
> of pipe_context only), but I can tell you and I have tested a lot of
> apps with this, it really doesn't sync the thread with majority of
> apps except for SwapBuffers.
>
> It can do these:
> - unsynchronized buffer mappings don't sync
> - ordinary buffer mappings are promoted to unsynchronized when it's safe
> - full buffer invalidations are implemented as reallocations and don't sync
> - partial buffer invalidations are implemented as copy_buffer and don't sync

interesting.. maybe I can drop some of the resource shadowing tricks I
added in freedreno to avoid mid frame texture uploads or UBO updates
from triggering a flush / tile-pass..

BR,
-R

> - get_query_result doesn't sync when the threaded context has seen flush()
>   (i.e. get_query_result is contextless in that case)
>
> Missing:
> - deferred fences - mainly Bioshock Infinite might benefit
> - texture mappings (meaning CPU access) always sync, texture_subdata
>   doesn't sync for small uploads only, but we can make all texture
>   uploads asynchronous by simply copying what is done for buffers
>
> Note that it has a very low overhead when it's always synchronous
> (i.e. not multithreaded), because it's really fast to enqueue and
> execute calls. The worst case scenario might be -3% performance (just
> guessing here).
>
> All requirements on Gallium drivers and other information can be found
> in the header file:
> https://cgit.freedesktop.org/~mareko/mesa/tree/src/gallium/auxiliary/util/u_threaded_context.h?h=gallium-threaded2#n26
>
> RadeonSI enables threaded Gallium by default for OpenGL Core and
> Compatibility profiles and all OpenGL ES variants.
>
> There is a small performance concern for RadeonSI: If non-contiguous
> VRAM mappings are not supported (amdgpu - kernel 4.11 and older,
> radeon - all kernels), the performance difference might be negative,
> because buffer invalidations are done unconditionally, meaning that
> there can be more live and mapped VRAM buffers. It's difficult to tell
> whether any real apps are affected in a measurable way.
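
For readers following along, here is a rough sketch of why invalidation
done as reallocation avoids a sync but can leave more buffers alive at
once, as the paragraph above notes. It is not code from the series and
does not use Mesa's API: every name in it (storage, buffer, enqueue_draw,
invalidate, drain_queue) is hypothetical, and the driver thread is
modelled by an explicit drain_queue() call rather than a real thread.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; none of these names are Mesa's. */
struct storage {
    unsigned char *data;
    size_t size;
    int refcount;          /* held by the buffer and by queued calls */
};

struct buffer {
    struct storage *st;    /* current backing storage */
};

struct queued_draw {
    struct storage *st;    /* storage captured at enqueue time */
};

#define MAX_QUEUED 16
static struct queued_draw queue[MAX_QUEUED];
static int num_queued;
static int live_storages;  /* number of storages currently allocated */

static struct storage *storage_create(size_t size)
{
    struct storage *st = malloc(sizeof(*st));
    if (!st)
        abort();
    st->data = calloc(1, size);
    if (!st->data)
        abort();
    st->size = size;
    st->refcount = 1;
    live_storages++;
    return st;
}

static void storage_unref(struct storage *st)
{
    if (--st->refcount == 0) {
        free(st->data);
        free(st);
        live_storages--;
    }
}

/* Application side: record a draw that reads the buffer's current storage. */
static int enqueue_draw(struct buffer *buf)
{
    if (num_queued == MAX_QUEUED)
        return -1;
    buf->st->refcount++;
    queue[num_queued++].st = buf->st;
    return 0;
}

/* Application side: full invalidation as reallocation, without waiting. */
static void invalidate(struct buffer *buf)
{
    struct storage *old = buf->st;
    buf->st = storage_create(old->size);
    storage_unref(old);    /* queued draws may still keep the old storage alive */
}

/* Stand-in for the driver thread: execute and release everything queued. */
static void drain_queue(void)
{
    for (int i = 0; i < num_queued; i++)
        storage_unref(queue[i].st);
    num_queued = 0;
}
```

In this model, calling invalidate() while a draw is still queued leaves
two storages allocated until drain_queue() runs, which is the "more live
buffers" effect in miniature. The application can write to the new
storage right away without disturbing what the queued draw will read.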
>
> Here are performance numbers:
>
> APPS: MORE IS BETTER
> Alien Isolation: +16%
> Bioshock Infinite: +13%
> Borderlands 2: +12%
> Civilization 5: +12%
> Civilization 6: +10%
> CS:GO: +8%
> ET Legacy: +12%
> Openarena: +27%
> Talos Principle (high details, 1680x1050 internal resolution): +17%
> glmark2: no change in the final score
>
> When games are GPU-bound: no change
>
> Because of not taking advantage of deferred fences, Bioshock runs
> 80% of time asynchronously and 20% of time synchronously.
> All other games run 100% of time asynchronously.
>
> x11perf: MORE IS BETTER
> x11perf: Test: 500px PutImage Square: -3%
> x11perf: Test: Scrolling 500 x 500 px: +16%
> x11perf: Test: Char in 80-char aa line: +13%
> x11perf: Test: PutImage XY 500x500 Square: +1%
> x11perf: Test: Fill 300 x 300px AA Trapezoid: NO CHANGE
> x11perf: Test: 500px Copy From Window To Window: +14%
> x11perf: Test: Copy 500x500 From Pixmap To Pixmap: -1%
> x11perf: Test: 500px Compositing From Pixmap To Window: +21%
> x11perf: Test: 500px Compositing From Window To Window: +18%
>
> gtkperf: LESS IS BETTER
> gtkperf: GTK Widget: Total Time: -2%
> gtkperf: GTK Widget: GtkComboBox: +7%
> gtkperf: GTK Widget: GtkCheckButton: -15%
> gtkperf: GTK Widget: GtkRadioButton: -13%
> gtkperf: GTK Widget: GtkToggleButton: -2%
> gtkperf: GTK Widget: GtkComboBoxEntry: -1%
> gtkperf: GTK Widget: GtkTextView - Scroll: NO CHANGE
> gtkperf: GTK Widget: GtkTextView - Add Text: NO CHANGE
> gtkperf: GTK Widget: GtkDrawingArea - Circles: -9%
> gtkperf: GTK Widget: GtkDrawingArea - Pixbufs: -3%
>
> Hence the decision to enable it by default.
>
> Please review.
>
> Marek
> _______________________________________________
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev