Ok, given what you said below, I think everything is sane. Reviewed-by: Jason Ekstrand <ja...@jlekstrand.net>
Note that I didn't actually verify that all of the code you deleted is going to work once we switch over to NIR handling it with varyings. On Wed, Jan 4, 2017 at 2:12 PM, Kenneth Graunke <kenn...@whitecape.org> wrote: > On Wednesday, January 4, 2017 1:35:55 PM PST Jason Ekstrand wrote: > > On Wed, Jan 4, 2017 at 3:07 AM, Kenneth Graunke <kenn...@whitecape.org> > > wrote: > > > > > Treating everything as scalar arrays allows us to drop a bunch of > > > special case input/output munging all throughout the backend. > > > Instead, we just need to remap the TessLevel components to the > > > appropriate patch URB header locations in remap_patch_urb_offsets(). > > > > > > We also switch to treating the TES input versions of these as ordinary > > > shader inputs rather than system values, as remap_patch_urb_offsets() > > > just makes everything work out without special handling. > > > > > > This regresses one Piglit test: > > > arb_tessellation_shader-large-uniforms/GL_TESS_CONTROL_ > > > SHADER-array-at-limit > > > > > > The compiler starts promoting the constant arrays assigned to > gl_TessLevel* > > > to uniform arrays. Since the shader also has a uniform array that uses > > > the maximum number of uniform components, this puts it over the uniform > > > component limit enforced by the linker. This is arguably a bug in the > > > constant array promotion code (it should avoid pushing us over limits), > > > but is unlikely to penalize any real application. > > > > > > Signed-off-by: Kenneth Graunke <kenn...@whitecape.org> > > > --- > > > src/mesa/drivers/dri/i965/brw_context.c | 2 +- > > > src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 164 > > > +-------------------- > > > src/mesa/drivers/dri/i965/brw_nir.c | 74 +++++++++- > > > src/mesa/drivers/dri/i965/brw_nir.h | 3 +- > > > .../drivers/dri/i965/brw_nir_tcs_workarounds.c | 8 +- > > > src/mesa/drivers/dri/i965/brw_shader.cpp | 61 +------- > > > src/mesa/drivers/dri/i965/brw_shader.h | 5 - > > > src/mesa/drivers/dri/i965/brw_tcs.c | 5 +- > > > src/mesa/drivers/dri/i965/brw_tes.c | 17 +-- > > > src/mesa/drivers/dri/i965/brw_vec4_tcs.cpp | 117 > +-------------- > > > 10 files changed, 92 insertions(+), 364 deletions(-) > > > > > > diff --git a/src/mesa/drivers/dri/i965/brw_context.c > > > b/src/mesa/drivers/dri/i965/brw_context.c > > > index 45490a0f5cf..22f872fe782 100644 > > > --- a/src/mesa/drivers/dri/i965/brw_context.c > > > +++ b/src/mesa/drivers/dri/i965/brw_context.c > > > @@ -672,7 +672,7 @@ brw_initialize_context_constants(struct > brw_context > > > *brw) > > > if (brw->gen >= 5 || brw->is_g4x) > > > ctx->Const.MaxClipPlanes = 8; > > > > > > - ctx->Const.LowerTessLevel = true; > > > + ctx->Const.GLSLTessLevelsAsInputs = true; > > > ctx->Const.LowerTCSPatchVerticesIn = brw->gen >= 8; > > > ctx->Const.LowerTESPatchVerticesIn = true; > > > ctx->Const.PrimitiveRestartForPatches = true; > > > diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > > > b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > > > index 2ed843bd03d..8f745dff440 100644 > > > --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > > > +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > > > @@ -2520,78 +2520,7 @@ fs_visitor::nir_emit_tcs_intrinsic(const > > > fs_builder &bld, > > > bld.MOV(patch_handle, > > > retype(brw_vec1_grf(0, 0), BRW_REGISTER_TYPE_UD)); > > > > > > - if (imm_offset == 0) { > > > - /* This is a read of gl_TessLevelInner[], which lives in > the > > > - * Patch URB header. The layout depends on the domain. > > > - */ > > > - dst.type = BRW_REGISTER_TYPE_F; > > > - switch (tcs_key->tes_primitive_mode) { > > > - case GL_QUADS: { > > > - /* DWords 3-2 (reversed) */ > > > - fs_reg tmp = bld.vgrf(BRW_REGISTER_TYPE_F, 4); > > > - > > > - inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8, tmp, > > > patch_handle); > > > - inst->offset = 0; > > > - inst->mlen = 1; > > > - inst->size_written = 4 * REG_SIZE; > > > - > > > - /* dst.xy = tmp.wz */ > > > - bld.MOV(dst, offset(tmp, bld, 3)); > > > - bld.MOV(offset(dst, bld, 1), offset(tmp, bld, 2)); > > > - break; > > > - } > > > - case GL_TRIANGLES: > > > - /* DWord 4; hardcode offset = 1 and size_written = > > > REG_SIZE */ > > > - inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8, dst, > > > patch_handle); > > > - inst->offset = 1; > > > - inst->mlen = 1; > > > - inst->size_written = REG_SIZE; > > > - break; > > > - case GL_ISOLINES: > > > - /* All channels are undefined. */ > > > - break; > > > - default: > > > - unreachable("Bogus tessellation domain"); > > > - } > > > - } else if (imm_offset == 1) { > > > - /* This is a read of gl_TessLevelOuter[], which lives in > the > > > - * Patch URB header. The layout depends on the domain. > > > - */ > > > - dst.type = BRW_REGISTER_TYPE_F; > > > - > > > - fs_reg tmp = bld.vgrf(BRW_REGISTER_TYPE_F, 4); > > > - inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8, tmp, > > > patch_handle); > > > - inst->offset = 1; > > > - inst->mlen = 1; > > > - inst->size_written = 4 * REG_SIZE; > > > - > > > - /* Reswizzle: WZYX */ > > > - fs_reg srcs[4] = { > > > - offset(tmp, bld, 3), > > > - offset(tmp, bld, 2), > > > - offset(tmp, bld, 1), > > > - offset(tmp, bld, 0), > > > - }; > > > - > > > - unsigned num_components; > > > - switch (tcs_key->tes_primitive_mode) { > > > - case GL_QUADS: > > > - num_components = 4; > > > - break; > > > - case GL_TRIANGLES: > > > - num_components = 3; > > > - break; > > > - case GL_ISOLINES: > > > - /* Isolines are not reversed; swizzle .zw -> .xy */ > > > - srcs[0] = offset(tmp, bld, 2); > > > - srcs[1] = offset(tmp, bld, 3); > > > - num_components = 2; > > > - break; > > > - default: > > > - unreachable("Bogus tessellation domain"); > > > - } > > > - bld.LOAD_PAYLOAD(dst, srcs, num_components, 0); > > > - } else { > > > + { > > > if (first_component != 0) { > > > unsigned read_components = > > > instr->num_components + first_component; > > > @@ -2656,55 +2585,6 @@ fs_visitor::nir_emit_tcs_intrinsic(const > > > fs_builder &bld, > > > > > > if (indirect_offset.file != BAD_FILE) { > > > srcs[header_regs++] = indirect_offset; > > > - } else if (!is_passthrough_shader) { > > > - if (imm_offset == 0) { > > > - value.type = BRW_REGISTER_TYPE_F; > > > - > > > - mask &= (1 << tesslevel_inner_components( > tcs_key->tes_primitive_mode)) > > > - 1; > > > - > > > - /* This is a write to gl_TessLevelInner[], which lives in > the > > > - * Patch URB header. The layout depends on the domain. > > > - */ > > > - switch (tcs_key->tes_primitive_mode) { > > > - case GL_QUADS: > > > - /* gl_TessLevelInner[].xy lives at DWords 3-2 > (reversed). > > > - * We use an XXYX swizzle to reverse put .xy in the .wz > > > - * channels, and use a .zw writemask. > > > - */ > > > - mask = writemask_for_backwards_vector(mask); > > > - swiz = BRW_SWIZZLE4(0, 0, 1, 0); > > > - break; > > > - case GL_TRIANGLES: > > > - /* gl_TessLevelInner[].x lives at DWord 4, so we set > the > > > - * writemask to X and bump the URB offset by 1. > > > - */ > > > - imm_offset = 1; > > > - break; > > > - case GL_ISOLINES: > > > - /* Skip; gl_TessLevelInner[] doesn't exist for > isolines. */ > > > - return; > > > - default: > > > - unreachable("Bogus tessellation domain"); > > > - } > > > - } else if (imm_offset == 1) { > > > - /* This is a write to gl_TessLevelOuter[] which lives in > the > > > - * Patch URB Header at DWords 4-7. However, it's > reversed, so > > > - * instead of .xyzw we have .wzyx. > > > - */ > > > - value.type = BRW_REGISTER_TYPE_F; > > > - > > > - mask &= (1 << tesslevel_outer_components( > tcs_key->tes_primitive_mode)) > > > - 1; > > > - > > > - if (tcs_key->tes_primitive_mode == GL_ISOLINES) { > > > - /* Isolines .xy should be stored in .zw, in order. */ > > > - swiz = BRW_SWIZZLE4(0, 0, 0, 1); > > > - mask <<= 2; > > > - } else { > > > - /* Other domains are reversed; store .wzyx instead of > > > .xyzw */ > > > - swiz = BRW_SWIZZLE_WZYX; > > > - mask = writemask_for_backwards_vector(mask); > > > - } > > > - } > > > } > > > > > > if (mask == 0) > > > @@ -2851,48 +2731,6 @@ fs_visitor::nir_emit_tes_intrinsic(const > > > fs_builder &bld, > > > } > > > break; > > > > > > - case nir_intrinsic_load_tess_level_outer: > > > - /* When the TES reads gl_TessLevelOuter, we ensure that the > patch > > > header > > > - * appears as a push-model input. So, we can simply use the > ATTR > > > file > > > - * rather than issuing URB read messages. The data is stored > in the > > > - * high DWords in reverse order - DWord 7 contains .x, DWord 6 > > > contains > > > - * .y, and so on. > > > - */ > > > - switch (tes_prog_data->domain) { > > > - case BRW_TESS_DOMAIN_QUAD: > > > - for (unsigned i = 0; i < 4; i++) > > > - bld.MOV(offset(dest, bld, i), component(fs_reg(ATTR, 0), > 7 - > > > i)); > > > - break; > > > - case BRW_TESS_DOMAIN_TRI: > > > - for (unsigned i = 0; i < 3; i++) > > > - bld.MOV(offset(dest, bld, i), component(fs_reg(ATTR, 0), > 7 - > > > i)); > > > - break; > > > - case BRW_TESS_DOMAIN_ISOLINE: > > > - for (unsigned i = 0; i < 2; i++) > > > - bld.MOV(offset(dest, bld, i), component(fs_reg(ATTR, 0), > 6 + > > > i)); > > > - break; > > > - } > > > - break; > > > - > > > - case nir_intrinsic_load_tess_level_inner: > > > - /* When the TES reads gl_TessLevelInner, we ensure that the > patch > > > header > > > - * appears as a push-model input. So, we can simply use the > ATTR > > > file > > > - * rather than issuing URB read messages. > > > - */ > > > - switch (tes_prog_data->domain) { > > > - case BRW_TESS_DOMAIN_QUAD: > > > - bld.MOV(dest, component(fs_reg(ATTR, 0), 3)); > > > - bld.MOV(offset(dest, bld, 1), component(fs_reg(ATTR, 0), 2)); > > > - break; > > > - case BRW_TESS_DOMAIN_TRI: > > > - bld.MOV(dest, component(fs_reg(ATTR, 0), 4)); > > > - break; > > > - case BRW_TESS_DOMAIN_ISOLINE: > > > - /* ignore - value is undefined */ > > > - break; > > > - } > > > - break; > > > - > > > case nir_intrinsic_load_input: > > > case nir_intrinsic_load_per_vertex_input: { > > > fs_reg indirect_offset = get_indirect_offset(instr); > > > diff --git a/src/mesa/drivers/dri/i965/brw_nir.c > > > b/src/mesa/drivers/dri/i965/brw_nir.c > > > index 6f37e97a86f..46eeb1723b4 100644 > > > --- a/src/mesa/drivers/dri/i965/brw_nir.c > > > +++ b/src/mesa/drivers/dri/i965/brw_nir.c > > > @@ -141,9 +141,68 @@ remap_inputs_with_vue_map(nir_block *block, const > > > struct brw_vue_map *vue_map) > > > } > > > > > > static bool > > > +remap_tess_levels(nir_builder *b, nir_intrinsic_instr *intr, > > > + GLenum primitive_mode) > > > +{ > > > + const int location = nir_intrinsic_base(intr); > > > + const unsigned component = nir_intrinsic_component(intr); > > > + bool out_of_bounds; > > > + > > > + if (location == VARYING_SLOT_TESS_LEVEL_INNER) { > > > + switch (primitive_mode) { > > > + case GL_QUADS: > > > + /* gl_TessLevelInner[0..1] lives at DWords 3-2 (reversed). */ > > > + nir_intrinsic_set_base(intr, 0); > > > + nir_intrinsic_set_component(intr, 3 - component); > > > + out_of_bounds = false; > > > > > > > What if component > 1? I guess that's not really a problem but it is > > out-of-bounds... > > The GLSL array is a float[2], so I guess I was assuming something would > reject the out of bounds array access higher up in the compiler. > > Maybe that isn't a valid assumption though... > > > > + break; > > > + case GL_TRIANGLES: > > > + /* gl_TessLevelInner[0] lives at DWord 4. */ > > > + nir_intrinsic_set_base(intr, 1); > > > + out_of_bounds = component > 0; > > > + break; > > > + case GL_ISOLINES: > > > + out_of_bounds = true; > > > + break; > > > + default: > > > + unreachable("Bogus tessellation domain"); > > > + } > > > + } else if (location == VARYING_SLOT_TESS_LEVEL_OUTER) { > > > + if (primitive_mode == GL_ISOLINES) { > > > + /* gl_TessLevelOuter[0..1] lives at DWords 6-7 (in order). */ > > > + nir_intrinsic_set_base(intr, 1); > > > + nir_intrinsic_set_component(intr, 2 + > > > nir_intrinsic_component(intr)); > > > + out_of_bounds = component > 1; > > > + } else { > > > + /* Triangles use DWords 7-5 (reversed); Quads use 7-4 > (reversed) > > > */ > > > + nir_intrinsic_set_base(intr, 1); > > > + nir_intrinsic_set_component(intr, 3 - > > > nir_intrinsic_component(intr)); > > > + out_of_bounds = component == 3 && primitive_mode == > GL_TRIANGLES; > > > > > + } > > > + } else { > > > + return false; > > > + } > > > + > > > + if (out_of_bounds) { > > > + if (nir_intrinsic_infos[intr->intrinsic].has_dest) { > > > + b->cursor = nir_before_instr(&intr->instr); > > > + nir_ssa_def *undef = nir_ssa_undef(b, 1, 32); > > > + nir_ssa_def_rewrite_uses(&intr->dest.ssa, > > > nir_src_for_ssa(undef)); > > > + } > > > + nir_instr_remove(&intr->instr); > > > + } > > > + > > > + return true; > > > +} > > > + > > > +static bool > > > remap_patch_urb_offsets(nir_block *block, nir_builder *b, > > > - const struct brw_vue_map *vue_map) > > > + const struct brw_vue_map *vue_map, > > > + GLenum tes_primitive_mode) > > > { > > > + const bool is_passthrough_tcs = b->shader->info->name && > > > + strcmp(b->shader->info->name, "passthrough") == 0; > > > > > > > This is gross... > > > > Also... Why? What's so special about the passthrough that it doesn't > need > > tess level remaps? I have a feeling there's some more general thing we > > could be doing here. > > We've always been special casing it, so I'm not sure it's more gross > than it was before...*shrug*. > > All TCS must write the patch header, which is two vec4 slots: > > - gl_TessLevelOuter[4] > - gl_TessLevelInner[2] > - TR DS Cache Disable bit (DWord 0) > > For the passthrough shader, the inner/outer levels come from API state. > So, I hooked up two built-in vec4 uniforms, and made their contents > match the patch URB header exactly. So, the shader doesn't have to > do any remapping or shuffling...it just straight copies both vec4s. > Simpler and more efficient. But...a special case. > > > > + > > > nir_foreach_instr_safe(instr, block) { > > > if (instr->type != nir_instr_type_intrinsic) > > > continue; > > > @@ -154,6 +213,11 @@ remap_patch_urb_offsets(nir_block *block, > > > nir_builder *b, > > > > > > if ((stage == MESA_SHADER_TESS_CTRL && is_output(intrin)) || > > > (stage == MESA_SHADER_TESS_EVAL && is_input(intrin))) { > > > + > > > + if (!is_passthrough_tcs && > > > + remap_tess_levels(b, intrin, tes_primitive_mode)) > > > + continue; > > > > > > > Let's make sure I've got this right... > > > > We map everything from the varying identifiers to VUE slots. For the > case > > of tesslevel, they will be assigned VUE slots 0 and 1 so the code below > > will hever cause any other output to alias. Then we rework stuff in > > remap_tess_levels so that they map to the right locaion in 8-dword chunk > at > > the begining of the VUE. > > Yep, exactly. I assigned VARYING_SLOT_TESS_LEVEL_INNER/OUTER to slots 0 > and 1, since both together live in those slots. (Which is which doesn't > actually matter.) Here we special case those and take care of mapping > the individual components to the right spot. > > > This function (remap_patch_urb_offsets) could really use a comment at the > > top saying what it's doing. > > I suppose so. It makes sense to me, but I wrote it... >
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev