On 29 October 2013 12:04, Eric Anholt <e...@anholt.net> wrote: > Paul Berry <stereotype...@gmail.com> writes: > > > On 29 October 2013 11:20, Eric Anholt <e...@anholt.net> wrote: > > > >> Paul Berry <stereotype...@gmail.com> writes: > >> > >> > On 27 October 2013 12:58, Marek Olšák <mar...@gmail.com> wrote: > >> > > >> >> On Sun, Oct 27, 2013 at 4:19 PM, Paul Berry <stereotype...@gmail.com > > > >> >> wrote: > >> >> > On 20 October 2013 09:20, Marek Olšák <mar...@gmail.com> wrote: > >> >> >> > >> >> >> From: Marek Olšák <marek.ol...@amd.com> > >> >> >> > >> >> >> This avoids a defect in lower_output_reads. > >> >> >> > >> >> >> The problem is lower_output_reads treats the gl_FragData array as > a > >> >> single > >> >> >> variable. It first redirects all output writes to a temporary > >> variable > >> >> >> (array) > >> >> >> and then writes the whole temporary variable to the output, > >> generating > >> >> >> assignments to all elements of gl_FragData. > >> >> > > >> >> > > >> >> > Can you go into more detail about why this is a problem? At first > >> >> glance it > >> >> > seems like it should be fine, because failing to assign to an > element > >> of > >> >> > gl_FragData is supposed to result in undefined data going to that > >> >> fragment > >> >> > output. So lowering to a shader that does assign to that element > of > >> >> > gl_FragData seems like it should be harmless. What am I missing > here? > >> >> > >> >> Thanks for the review. The problem is drivers cannot eliminate > useless > >> >> writes to gl_FragData, and each enabled gl_FragData output decreases > >> >> performance. The GLSL compiler cannot eliminate the writes either, > >> >> because gl_FragData is an array. > >> >> > >> >> Maybe the GLSL compiler should resize arrays based on which elements > >> >> are written, so that unused elements are not declared, but this is > not > >> >> enough for gl_FragData, where the i-th output can be written, but > >> >> (i-1)-th output doesn't have to be. > >> >> > >> > > >> > Ah, ok. When I saw the word "defect", I misunderstood you to be > fixing a > >> > correctness-of-rendering bug. As a performance optimization, I get it > >> now. > >> > > >> > For driver back-ends that don't need lower_output_reads (such as i915 > and > >> > i965), this optimization isn't needed. Would you mind adding a flag > to > >> > ShaderCompilerOptions so that tgsi-based drivers can opt in to this > new > >> > optimization? I want to make sure that the code generation of i965 > and > >> > i915 isn't affected. > >> > >> Actually, if Marek has identified that applications not setting all > >> their gl_FragData[] is a performance issue, I want to see this applied > >> to i965, too. This optimization would remove entire FB_WRITE opcodes > >> for us, which could be pretty significant if applications hit this case. > >> > > > > Yes, but Marek's patch won't benefit i965 even in that case. The only > > effect of Marek's patch is to prevent lower_output_reads (which i965 > > doesn't use) from converting a shader that writes to only some elements > of > > gl_FragData into a shader that writes to all elements of gl_FragData. > > > > To optimize i965 for shaders that don't write to all elements of > > gl_FragData we need to do back-end work that's unrelated to Marek's > patch. > > Oh, I thought Marek's patch was going to take a program that did: > > out vec4 gl_FragData[4]; > > gl_FragData[0] = vec4(whatever); > gl_FragData[3] = vec4(whatever); > > and turn it into: > > out vec4 gl_FragData_0; > out vec4 gl_FragData_3; > > gl_FragData_0 = vec4(whatever); > gl_FragData_3 = vec4(whatever); > > which would have had an effect. >
Shoot, you're right. Man, my batting average for code review isn't doing too hot this week. I now agree with Eric that this optimization makes sense for i965 (and in fact, probably makes sense for all back ends). No need to set up a mechanism to allow back ends to opt out of it. Sorry for the noise.
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev