On 3 September 2011 11:38, Kenneth Graunke <kenn...@whitecape.org> wrote:
> On 09/02/2011 09:06 AM, Paul Berry wrote: > > Previously, the old VS backend computed the URB entry size by adding > > the number of vertex shader outputs to the size of the URB header. > > This often produced a larger result than necessary, because some > > vertex shader outputs are stored in the header, so they were being > > double counted. This patch changes the old VS backend to compute the > > URB entry size directly from the number of slots in the VUE map. > > --- > > src/mesa/drivers/dri/i965/brw_vs.h | 1 - > > src/mesa/drivers/dri/i965/brw_vs_emit.c | 31 > +++++++------------------------ > > 2 files changed, 7 insertions(+), 25 deletions(-) > > > > diff --git a/src/mesa/drivers/dri/i965/brw_vs.h > b/src/mesa/drivers/dri/i965/brw_vs.h > > index a02c06d..c31869c 100644 > > --- a/src/mesa/drivers/dri/i965/brw_vs.h > > +++ b/src/mesa/drivers/dri/i965/brw_vs.h > > @@ -65,7 +65,6 @@ struct brw_vs_compile { > > > > struct brw_vue_map vue_map; > > GLuint first_output; > > - GLuint nr_outputs; > > GLuint last_scratch; > > > > GLuint first_tmp; > > diff --git a/src/mesa/drivers/dri/i965/brw_vs_emit.c > b/src/mesa/drivers/dri/i965/brw_vs_emit.c > > index e7667c8..7c430ce 100644 > > --- a/src/mesa/drivers/dri/i965/brw_vs_emit.c > > +++ b/src/mesa/drivers/dri/i965/brw_vs_emit.c > > @@ -402,36 +402,19 @@ static void brw_vs_alloc_regs( struct > brw_vs_compile *c ) > > /* The VS VUEs are shared by VF (outputting our inputs) and VS, so > size > > * them to fit the biggest thing they need to. > > */ > > - attributes_in_vue = MAX2(c->nr_outputs, c->nr_inputs); > > + attributes_in_vue = MAX2(c->vue_map.num_slots, c->nr_inputs); > > > > - /* See emit_vertex_write() for where the VUE's overhead on top of the > > - * attributes comes from. > > - */ > > - if (intel->gen >= 7) { > > - int header_regs = 2; > > - if (c->key.nr_userclip) > > - header_regs += 2; > > - > > - /* Each attribute is 16 bytes (1 vec4), so dividing by 4 gives us > the > > - * number of 64-byte (512-bit) units. > > - */ > > - c->prog_data.urb_entry_size = (attributes_in_vue + header_regs + > 3) / 4; > > - } else if (intel->gen == 6) { > > - int header_regs = 2; > > - if (c->key.nr_userclip) > > - header_regs += 2; > > - > > - /* Each attribute is 16 bytes (1 vec4), so dividing by 8 gives us > the > > + if (intel->gen == 6) { > > + /* Each attribute is 32 bytes (2 vec4s), so dividing by 8 gives us > the > > * number of 128-byte (1024-bit) units. > > */ > > - c->prog_data.urb_entry_size = (attributes_in_vue + header_regs + > 7) / 8; > > - } else if (intel->gen == 5) > > + c->prog_data.urb_entry_size = (attributes_in_vue + 7) / 8; > > + } else { > > /* Each attribute is 16 bytes (1 vec4), so dividing by 4 gives us > the > > * number of 64-byte (512-bit) units. > > */ > > - c->prog_data.urb_entry_size = (attributes_in_vue + 6 + 3) / 4; > > - else > > - c->prog_data.urb_entry_size = (attributes_in_vue + 2 + 3) / 4; > > + c->prog_data.urb_entry_size = (attributes_in_vue + 3) / 4; > > + } > > > > c->prog_data.total_grf = reg; > > Totally non-obvious, but seems correct and is _much_ cleaner. > > While we're at it, it might be nicer to do: > > if (intel->gen == 6) > c->prog_data.urb_entry_size = ALIGN(attributes_in_vue, 8) / 8; > else > c->prog_data.urb_entry_size = ALIGN(attributes_in_vue, 4) / 4; > > Otherwise, I get confused, thinking there are +7 or +3 registers > dedicated to something or other. Most likely because in the original > code, (attributes_in_vue + 6 + 3) / 4, the +6 _is_ adding some header > registers, while the +3 is purely a "round up!" factor. > > We can always do that in a follow-up patch if you prefer (or just not, > because this is all going away soon enough.) > Yeah, I like your rewrite, Ken--it's clearer, and it's a simple enough change. I'll go ahead and update the patch.
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev