On Wed, Jun 14, 2017 at 9:45 PM, Jose Fonseca <jfons...@vmware.com> wrote:
> On 14/06/17 17:12, Marek Olšák wrote:
>>
>> On Tue, Jun 13, 2017 at 3:43 PM, Marek Olšák <mar...@gmail.com> wrote:
>>>
>>> On Tue, Jun 13, 2017 at 1:40 PM, Jose Fonseca <jfons...@vmware.com> wrote:
>>>>
>>>> On 12/06/17 22:56, Marek Olšák wrote:
>>>>>
>>>>> On Mon, Jun 12, 2017 at 10:43 PM, Jose Fonseca <jfons...@vmware.com> wrote:
>>>>>>
>>>>>> On 12/06/17 21:25, Marek Olšák wrote:
>>>>>>>
>>>>>>> On Mon, Jun 12, 2017 at 9:51 PM, Jose Fonseca <jfons...@vmware.com> wrote:
>>>>>>>>
>>>>>>>> How does this help exactly?
>>>>>>>>
>>>>>>>> Are applications actually rendering to the same FBO w/ and w/o
>>>>>>>> SRGB decoding?
>>>>>>>>
>>>>>>>> Or is the problem here GL_SRGB_WRITE state getting spuriously
>>>>>>>> dirtied by the application?
>>>>>>>>
>>>>>>>> And even if they do, why is toggling surface views in
>>>>>>>> framebuffer state so expensive?
>>>>>>>>
>>>>>>>> I don't object per se, but it looks like an unusual thing to
>>>>>>>> optimize for.
>>>>>>>>
>>>>>>>
>>>>>>> set_framebuffer_state is basically a memory barrier. We have
>>>>>>> different caches between FB and textures and we have to flush
>>>>>>> them when a texture is unbound from the framebuffer and set as a
>>>>>>> sampler view. To keep things simple, set_framebuffer_state is the
>>>>>>> barrier. When we change the blend state, the barrier is avoided.
>>>>>>> Note that the barrier makes set_framebuffer_state a function
>>>>>>> that is always GPU-bound.
>>>>>>
>>>>>> I see.
>>>>>>
>>>>>> And you're sure that the incoming set_framebuffer_state are not
>>>>>> spurious?
>>>>>>
>>>>>> I know cso_context always eliminates redundant
>>>>>> pipe_context::set_framebuffer_state calls, but it is perhaps
>>>>>> possible that Mesa state tracker is resetting the framebuffer
>>>>>> state with different surface views, but that in practice are
>>>>>> exactly the same as the previous one?
>>>>>>
>>>>>> Like I said, it seems odd apps are doing this: it doesn't make
>>>>>> much sense to me to change colorspace of the fragments between
>>>>>> draws. (Unless some of the assets are already in SRGB and the app
>>>>>> is trying to be too smart for its own good to avoid the
>>>>>> sRGB->RGB->sRGB.) It seems much more likely that these
>>>>>> framebuffer state changes are self-inflicted somewhere in our
>>>>>> stack, than something truly demanded by the app.
>>>>>>
>>>>>> And if that's the case and we can fix it, then it would be a
>>>>>> better solution all around.
>>>>>
>>>>> Yeah the funny part and the reason is that we have a
>>>>> microbenchmark in piglit (drawoverhead) changing this state
>>>>> between draw calls. :)
>>>>>
>>>>> Marek
>>>>>
>>>>
>>>> I couldn't find that piglit microbenchmark. mesademos has
>>>> src/perf/drawoverhead.c but it doesn't set GL_SRGB_WRITE. So if fbo
>>>> is changing internally, then it's a perf bug in Mesa state tracker.
>>>>
>>>> Unless it's mimicking something that real apps do, then it's
>>>> probably better to fix the microbenchmark to use more realistic
>>>> tests.
>>>
>>> If you build piglit, it's in bin/drawoverhead.
>>>
>>> You're right that this subtest (switching GL_FRAMEBUFFER_SRGB) is
>>> rather artificial and fairly unlikely to occur with real apps.
>>
>> FYI, I'm dropping this series and I don't have it in my repo anymore.
>> piglit/drawoverhead will be updated not to test this state change.
>>
>> Marek
>
> Great.
>
> BTW, I'm not sure what's a good state to change in such microbenchmark.
>
> There is, of course, a myriad of states to pick, but they are not all
> the same: performance can vary wildly depending on the choice. I'm not
> sure what's a good representative state change in such circumstances.
> Perhaps toggling between two texture objects? Or some sampler state?

If you've ever run the microbenchmark, you know that plenty of state
changes are tested. I think about 15 state changes are tested in about
60 subtests at the moment. I'm adding more tests to it; I currently
have 100 subtests locally. The subtests still missing are mostly shader
resources: immutable textures (mutable textures, i.e. not
TexStorage-based, are already tested), TBOs, images, image buffers,
SSBOs (maybe), and atomic counters (maybe).

The methodology is 1 state change followed by 1 draw call in a loop,
measuring the number of draw calls per second for that case and
comparing it with the baseline draw rate (the same loop without the
state change).

Marek
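
The measurement loop described above can be sketched roughly as follows.
This is only an illustration in Python, not the piglit code: draw_rate,
relative_rate, the batch size, and the no-op callables are all
hypothetical names and values chosen for the example, and the callables
stand in for real GL calls.

```python
import time


def draw_rate(draw, state_change=None, duration=0.05, batch=1000):
    """Return calls to draw() per second over roughly `duration` seconds.

    If state_change is given, it is called once before every draw, which
    mirrors the "1 state change followed by 1 draw call" loop. The clock
    is only read once per batch so that timing overhead stays small.
    """
    calls = 0
    start = time.perf_counter()
    elapsed = 0.0
    while elapsed < duration:
        for _ in range(batch):
            if state_change is not None:
                state_change()
            draw()
        calls += batch
        elapsed = time.perf_counter() - start
    return calls / elapsed


def relative_rate(draw, state_change, duration=0.05):
    """Draw rate with the state change, as a fraction of the baseline rate."""
    baseline = draw_rate(draw, None, duration)
    with_change = draw_rate(draw, state_change, duration)
    return with_change / baseline


if __name__ == "__main__":
    # Hypothetical stand-ins for a draw call and a state change.
    def fake_draw():
        pass

    def fake_state_change():
        pass

    ratio = relative_rate(fake_draw, fake_state_change)
    print(f"rate with state change: {ratio:.0%} of baseline")
```

A ratio close to 1 would mean the state change costs little compared
with the draw call itself; the smaller the ratio, the more expensive
the state change.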