On Monday, August 22, 2016 5:50:49 PM PDT Francisco Jerez wrote: > Kenneth Graunke <kenn...@whitecape.org> writes: > > > Many GPUs cannot handle GL_KHR_blend_equation_advanced natively, and > > need to emulate it in the pixel shader. This lowering pass implements > > all the necessary math for advanced blending. It fetches the existing > > framebuffer value using the MESA_shader_framebuffer_fetch built-in > > variables, and the previous commit's state var uniform to select > > which equation to use. > > > > This is done at the GLSL IR level to make it easy for all drivers to > > implement the GL_KHR_blend_equation_advanced extension and share code. > > > > Drivers need to hook up MESA_shader_framebuffer_fetch functionality: > > 1. Hook up the fb_fetch_output variable > > 2. Implement BlendBarrier() > > > > Then to get KHR_blend_equation_advanced, they simply need to: > > 3. Disable hardware blending based on ctx->Color._AdvancedBlendEnabled > > 4. Call this lowering pass. > > > > Very little driver specific code should be required. > > > > v2: Handle multiple output variables per render target (which may exist > > due to ARB_enhanced_layouts), and array variables (even with one > > render target, we might have out vec4 color[1]), and non-vec4 > > variables (it's easier than finding spec text to justify not > > handling it). Thanks to Francisco Jerez for the feedback. > > > > Signed-off-by: Kenneth Graunke <kenn...@whitecape.org> > > --- > > src/compiler/Makefile.sources | 1 + > > src/compiler/glsl/ir_optimization.h | 1 + > > .../glsl/lower_blend_equation_advanced.cpp | 557 > > +++++++++++++++++++++ > > 3 files changed, 559 insertions(+) > > create mode 100644 src/compiler/glsl/lower_blend_equation_advanced.cpp > > > > diff --git a/src/compiler/Makefile.sources b/src/compiler/Makefile.sources > > index 0ff9b23..fe26132 100644 > > --- a/src/compiler/Makefile.sources > > +++ b/src/compiler/Makefile.sources > > @@ -77,6 +77,7 @@ LIBGLSL_FILES = \ > > glsl/loop_analysis.h \ > > glsl/loop_controls.cpp \ > > glsl/loop_unroll.cpp \ > > + glsl/lower_blend_equation_advanced.cpp \ > > glsl/lower_buffer_access.cpp \ > > glsl/lower_buffer_access.h \ > > glsl/lower_const_arrays_to_uniforms.cpp \ > > diff --git a/src/compiler/glsl/ir_optimization.h > > b/src/compiler/glsl/ir_optimization.h > > index c29260a..3bd6928 100644 > > --- a/src/compiler/glsl/ir_optimization.h > > +++ b/src/compiler/glsl/ir_optimization.h > > @@ -151,6 +151,7 @@ void optimize_dead_builtin_variables(exec_list > > *instructions, > > bool lower_tess_level(gl_linked_shader *shader); > > > > bool lower_vertex_id(gl_linked_shader *shader); > > +bool lower_blend_equation_advanced(gl_linked_shader *shader); > > > > bool lower_subroutine(exec_list *instructions, struct > > _mesa_glsl_parse_state *state); > > void propagate_invariance(exec_list *instructions); > > diff --git a/src/compiler/glsl/lower_blend_equation_advanced.cpp > > b/src/compiler/glsl/lower_blend_equation_advanced.cpp > > new file mode 100644 > > index 0000000..5632865 > > --- /dev/null > > +++ b/src/compiler/glsl/lower_blend_equation_advanced.cpp > > @@ -0,0 +1,557 @@ > > +/* > > + * Copyright © 2016 Intel Corporation > > + * > > + * Permission is hereby granted, free of charge, to any person obtaining a > > + * copy of this software and associated documentation files (the > > "Software"), > > + * to deal in the Software without restriction, including without > > limitation > > + * the rights to use, copy, modify, merge, publish, distribute, sublicense, > > + * and/or sell copies of the Software, and to permit persons to whom the > > + * Software is furnished to do so, subject to the following conditions: > > + * > > + * The above copyright notice and this permission notice (including the > > next > > + * paragraph) shall be included in all copies or substantial portions of > > the > > + * Software. > > + * > > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS > > OR > > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, > > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL > > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR > > OTHER > > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING > > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER > > + * DEALINGS IN THE SOFTWARE. > > + */ > > + > > +#include "ir.h" > > +#include "ir_builder.h" > > +#include "ir_optimization.h" > > +#include "ir_hierarchical_visitor.h" > > +#include "program/prog_instruction.h" > > +#include "program/prog_statevars.h" > > +#include "util/bitscan.h" > > + > > +using namespace ir_builder; > > + > > +#define imm1(x) new(mem_ctx) ir_constant((float) x, 1) > > +#define imm3(x) new(mem_ctx) ir_constant((float) x, 3) > > + > > The x argument should probably be between parentheses to avoid surprises > in the future -- Or even better, make them static inline functions > instead of macros.
Good call, I'll add parenthesis. I made them macros rather than static inlines so you can do imm3(0.5) rather than imm3(mem_ctx, 0.5)...it's shorter (which is the main point). [snip] > > + blend_comps[0], blend_comps[1], > > + blend_comps[2], blend_comps[3]); > > + } > > + > > + ir_function_signature *main = get_main(sh); > > + ir_factory f(&main->body, mem_ctx); > > + > > + ir_variable *result_dest = > > + calc_blend_result(f, mode, fb, blend_source, sh->info.BlendSupport); > > + > > + /* Copy the result back to the original values. It would be simpler > > + * to demote the program's output variables, and create a new vec4 > > + * output for our result, but this pass runs before we create the > > + * ARB_program_interface_query resource list. So we have to leave > > + * the original outputs in place and use them. > > + */ > > + for (int i = 0; i < 4; i++) { > > + if (!outputs[i]) > > + continue; > > + > > + f.emit(assign(deref_output(outputs[i]), swizzle(result_dest, i, 1), > > + 1 << i)); > > + } > > + > > It looks like this relies on the main function having a single exit > point, which AFAICT may not be the case if there are multiple return > statements and the lowering pass is called before do_lower_jumps() (as > is the case in your i965 code). Several options occur to me: > > - Rename the shader's main function and generate a new main function > that calls into the original and implements the advanced blending > epilogue. This has the advantage that it wouldn't force drivers to > lower returns if they can support them natively, and is likely to > result in better code generation than the next option. > > - Make the lowering pass re-emit the blending epilogue for each exit > point of the main function. > > - Call do_lower_jumps(lower_main_return=true) manually at the top of > lower_blend_equation_advanced() if support for any advanced blending > modes was requested in order to make sure the function has the right > form. This seems a bit of a hack but I guess it would be the easiest > solution. Ugh. Good point. Option #2 seems unfortunate - I'd hate to emit this multiple times. In a world where all functions are inlined, options #1 and #3 seem basically equivalent - function inlining lowers early returns. So the renamed main would have early returns lowered anyway. I think I'll take option #3 and change the function to: bool lower_blend_equation_advanced(struct gl_linked_shader *sh) { if (sh->info.BlendSupport == 0) return false; /* Lower early returns in main() so there's a single exit point * where we can insert our lowering code. */ do_lower_jumps(sh->ir, false, false, true, false, false);
signature.asc
Description: This is a digitally signed message part.
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev