On Wed, Oct 28, 2015 at 3:08 AM, Kenneth Graunke <kenn...@whitecape.org> wrote:
> On Tuesday, October 27, 2015 04:40:19 PM Kristian Høgsberg wrote:
>> On Mon, Oct 12, 2015 at 02:55:32PM -0700, Kenneth Graunke wrote:
>> > Signed-off-by: Kenneth Graunke <kenn...@whitecape.org>
>> > ---
>> >  src/mesa/drivers/dri/i965/brw_fs.cpp              | 174 ++++++++++
>> >  src/mesa/drivers/dri/i965/brw_fs.h                |  16 +-
>> >  src/mesa/drivers/dri/i965/brw_fs_nir.cpp          | 378 ++++++++++++++++++++++
>> >  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp      |  49 ++-
>> >  src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp |  21 ++
>> >  5 files changed, 628 insertions(+), 10 deletions(-)
>> >
>> > diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
>> > index dde8c45..778237a 100644
>> > --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
>> > +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
>> > @@ -43,6 +43,7 @@
>> >  #include "brw_wm.h"
>> >  #include "brw_fs.h"
>> >  #include "brw_cs.h"
>> > +#include "brw_vec4_gs_visitor.h"
>> >  #include "brw_cfg.h"
>> >  #include "brw_dead_control_flow.h"
>> >  #include "main/uniforms.h"
>> > @@ -1347,6 +1348,47 @@ fs_visitor::emit_discard_jump()
>> >  }
>> >
>> >  void
>> > +fs_visitor::emit_gs_thread_end()
>> > +{
>> > +   assert(stage == MESA_SHADER_GEOMETRY);
>> > +
>> > +   if (gs_compile->control_data_header_size_bits > 0) {
>> > +      emit_gs_control_data_bits(this->final_gs_vertex_count);
>> > +   }
>> > +
>> > +   const fs_builder abld = bld.annotate("thread end");
>> > +   fs_inst *inst;
>> > +
>> > +   if (gs_compile->prog_data.static_vertex_count != -1) {
>> > +      foreach_in_list_reverse(fs_inst, prev, &this->instructions) {
>> > +         if (prev->opcode == SHADER_OPCODE_URB_WRITE_SIMD8 ||
>> > +             prev->opcode == SHADER_OPCODE_URB_WRITE_SIMD8_MASKED ||
>> > +             prev->opcode == SHADER_OPCODE_URB_WRITE_SIMD8_PER_SLOT ||
>> > +             prev->opcode == SHADER_OPCODE_URB_WRITE_SIMD8_MASKED_PER_SLOT) {
>> > +            prev->eot = true;
>> > +            return;
>> > +         } else if (prev->is_control_flow() || prev->has_side_effects()) {
>> > +            break;
>> > +         }
>> > +      }
>> > +      fs_reg hdr = abld.vgrf(BRW_REGISTER_TYPE_UD, 1);
>> > +      abld.MOV(hdr, fs_reg(retype(brw_vec8_grf(1, 0), BRW_REGISTER_TYPE_UD)));
>> > +      inst = abld.emit(SHADER_OPCODE_URB_WRITE_SIMD8, reg_undef, hdr);
>> > +      inst->mlen = 1;
>> > +   } else {
>> > +      fs_reg payload = abld.vgrf(BRW_REGISTER_TYPE_UD, 2);
>> > +      fs_reg *sources = ralloc_array(mem_ctx, fs_reg, 2);
>> > +      sources[0] = fs_reg(retype(brw_vec8_grf(1, 0), BRW_REGISTER_TYPE_UD));
>> > +      sources[1] = this->final_gs_vertex_count;
>> > +      abld.LOAD_PAYLOAD(payload, sources, 2, 2);
>> > +      inst = abld.emit(SHADER_OPCODE_URB_WRITE_SIMD8, reg_undef, payload);
>> > +      inst->mlen = 2;
>> > +   }
>> > +   inst->eot = true;
>> > +   inst->offset = 0;
>> > +}
>> > +
>> > +void
>> >  fs_visitor::assign_curb_setup()
>> >  {
>> >     if (dispatch_width == 8) {
>> > @@ -1550,6 +1592,53 @@ fs_visitor::assign_vs_urb_setup()
>> >     }
>> >  }
>> >
>> > +void
>> > +fs_visitor::assign_gs_urb_setup()
>> > +{
>> > +   assert(stage == MESA_SHADER_GEOMETRY);
>> > +
>> > +   const gl_geometry_program *gp = &gs_compile->gp->program;
>> > +   brw_vue_prog_data *vue_prog_data = (brw_vue_prog_data *) prog_data;
>> > +
>> > +   first_non_payload_grf +=
>> > +      8 * vue_prog_data->urb_read_length * gp->VerticesIn;
>>
>> Where does the 8 * come from here?
>
> Vertex data is read from the URB in 256-bit (HWord) units - two vec4
> slots at a time. The fact that this happens to be the size of a
> register is quite confusing, but not entirely relevant :)

Ah yes, *urb* read length is counted in vec4 pairs. I was thinking that
urb_read_length was a number of regs. I think this part of the spec
confused me:

"This is followed by GRFs containing Vertex 0 Element 1 (if it exists),
and so on, up to the number of Vertex 0 elements specified by Vertex
URB Read Length."

but of course, there it's understood that URB Read Length is in units
of vec4 pairs.

> Data read from the URB is expanded into the SIMD8 data layout, where
> a vec4 takes up 4 registers. So when we read a 256-bit URB block,
> those two vec4s end up taking up 8 registers.
>
> See vec4_gs_visitor::setup_varying_inputs(), which multiplies by 2
> since in SIMD4x2 mode a vec4 takes up a whole register, so 2 vec4s
> take up 2 registers.
>
> Another way of explaining it: a SIMD4x2 GS processes 2 completely
> unrelated primitives in each half. Data comes from two VUEs.
>
> VUE entry for primitive 0: [<A0.x A0.y A0.z A0.w>, <B0.x B0.y B0.z B0.w>, ...]
> VUE entry for primitive 1: [<A1.x A1.y A1.z A1.w>, <B1.x B1.y B1.z B1.w>, ...]
>
> One URB access reads 256 bits, but splats that across 2 registers:
>
> r10 = A1.x A1.y A1.z A1.w | A0.x A0.y A0.z A0.w
> r11 = B1.x B1.y B1.z B1.w | B0.x B0.y B0.z B0.w
>
> Similarly, SIMD8 shaders process 8 independent primitives per thread.
> The data comes from 8 different VUEs:
>
> VUE entry for primitive 0: [<A0.x A0.y A0.z A0.w>, <B0.x B0.y B0.z B0.w>, ...]
> VUE entry for primitive 1: [<A1.x A1.y A1.z A1.w>, <B1.x B1.y B1.z B1.w>, ...]
> VUE entry for primitive 2: [<A2.x A2.y A2.z A2.w>, <B2.x B2.y B2.z B2.w>, ...]
> VUE entry for primitive 3: [<A3.x A3.y A3.z A3.w>, <B3.x B3.y B3.z B3.w>, ...]
> VUE entry for primitive 4: [<A4.x A4.y A4.z A4.w>, <B4.x B4.y B4.z B4.w>, ...]
> VUE entry for primitive 5: [<A5.x A5.y A5.z A5.w>, <B5.x B5.y B5.z B5.w>, ...]
> VUE entry for primitive 6: [<A6.x A6.y A6.z A6.w>, <B6.x B6.y B6.z B6.w>, ...]
> VUE entry for primitive 7: [<A7.x A7.y A7.z A7.w>, <B7.x B7.y B7.z B7.w>, ...]
>
> One URB access reads 256 bits, but splats that across 8 registers:
>
> r10 = A7.x A6.x A5.x A4.x A3.x A2.x A1.x A0.x
> r11 = A7.y A6.y A5.y A4.y A3.y A2.y A1.y A0.y
> r12 = A7.z A6.z A5.z A4.z A3.z A2.z A1.z A0.z
> r13 = A7.w A6.w A5.w A4.w A3.w A2.w A1.w A0.w
> r14 = B7.x B6.x B5.x B4.x B3.x B2.x B1.x B0.x
> r15 = B7.y B6.y B5.y B4.y B3.y B2.y B1.y B0.y
> r16 = B7.z B6.z B5.z B4.z B3.z B2.z B1.z B0.z
> r17 = B7.w B6.w B5.w B4.w B3.w B2.w B1.w B0.w
>
> I hope that helps! This was a good question.

This was a good answer.

thanks,
Kristian
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
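
For anyone reading this thread later, here is a rough standalone sketch of the arithmetic and the SIMD8 layout discussed above. It is not Mesa code: Vec4, Simd8Reg, gs_input_payload_regs() and splat_simd8() are hypothetical names made up for illustration, and the sketch only assumes what the thread states, namely that one URB read unit is 256 bits (two vec4 slots) and that in SIMD8 each vec4 component occupies one 8-lane register.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one vec4 slot of a VUE (not a Mesa type).
struct Vec4 { float x, y, z, w; };

// One URB read unit is 256 bits, i.e. two vec4 slots.
constexpr unsigned VEC4S_PER_URB_BLOCK = 2;
// In SIMD8, each of the 4 components of a vec4 gets its own register.
constexpr unsigned REGS_PER_VEC4_SIMD8 = 4;
constexpr unsigned SIMD8_LANES = 8;

// Registers of per-vertex input payload for a SIMD8 GS. The product of
// the two constants is the "8 *" from the patch.
unsigned gs_input_payload_regs(unsigned urb_read_length, unsigned vertices_in)
{
   return VEC4S_PER_URB_BLOCK * REGS_PER_VEC4_SIMD8 *
          urb_read_length * vertices_in;
}

// One SIMD8 register: 8 lanes of 32 bits. Index 0 is primitive 0, which
// the diagrams in the thread draw as the rightmost column.
using Simd8Reg = std::array<float, SIMD8_LANES>;

// Transpose the slots of 8 VUEs (one per primitive, single vertex each)
// into SIMD8 layout: slot s, component c lands in register s * 4 + c.
std::vector<Simd8Reg>
splat_simd8(const std::array<std::vector<Vec4>, SIMD8_LANES> &vues)
{
   const std::size_t slots = vues[0].size();
   std::vector<Simd8Reg> regs(slots * REGS_PER_VEC4_SIMD8);

   for (std::size_t lane = 0; lane < SIMD8_LANES; lane++) {
      assert(vues[lane].size() == slots);
      for (std::size_t s = 0; s < slots; s++) {
         const Vec4 &v = vues[lane][s];
         regs[s * REGS_PER_VEC4_SIMD8 + 0][lane] = v.x;
         regs[s * REGS_PER_VEC4_SIMD8 + 1][lane] = v.y;
         regs[s * REGS_PER_VEC4_SIMD8 + 2][lane] = v.z;
         regs[s * REGS_PER_VEC4_SIMD8 + 3][lane] = v.w;
      }
   }
   return regs;
}
```

As a worked case under those assumptions: a GS taking triangles (VerticesIn = 3) with a urb_read_length of 1 gives gs_input_payload_regs(1, 3) == 24 registers, i.e. 2 vec4 slots * 4 registers each * 3 vertices. With two slots A and B per VUE, splat_simd8() returns 8 registers, matching r10 through r17 in the diagram.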