On Mon, May 16, 2016 at 9:23 PM, Francisco Jerez <curroje...@riseup.net> wrote:
> Currently the spilling code attempts to guess the scratch message > block size from the dispatch width of the shader, which is plain wrong > for SIMD-lowered instructions (frequently but not exclusively > encountered in SIMD32 shaders) or for instructions with register > region data types of size other than 32 bit. > > Instead try to use the SIMD component size of the instruction which in > some cases will allow the dataport to apply the correct channel mask > to the scratch data read or written. In the spill case the block size > needs to be clamped to the number of MRF registers reserved for > spilling. In the unspill case I didn't even bother because we > currently have no 100% accurate way to determine whether a source > region is per-channel or whether it contains things like headers that > don't respect channel boundaries -- That's fine, because the unspill > is marked force_writemask_all we can just use the largest allowable > scratch message size. > --- > src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp | 20 > +++++++++++++++++--- > 1 file changed, 17 insertions(+), 3 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp > b/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp > index 5cb2013..ff40c42 100644 > --- a/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp > +++ b/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp > @@ -942,8 +942,14 @@ fs_visitor::spill_reg(int spill_reg) > inst->src[i].nr = unspill_dst.nr; > inst->src[i].reg_offset = 0; > > + /* We read the largest power-of-two divisor of the register > count > + * (because only POT scratch read blocks are allowed by the > + * hardware) up to the maximum supported block size. > + * XXX - Bump the limit when the generator code is ready for > + * 32-wide spills. > + */ > const unsigned width = > - dispatch_width == 16 && regs_read % 2 == 0 ? 16 : 8; > + MIN2(16, 1u << (ffs(MAX2(1, regs_read) * 8) - 1)); > Can regs_read ever be zero? Seems kind of odd. > > /* Set exec_all() on unspill messages under the (rather > * pessimistic) assumption that there is no one-to-one > @@ -974,8 +980,16 @@ fs_visitor::spill_reg(int spill_reg) > inst->no_dd_clear = false; > inst->no_dd_check = false; > > - const unsigned width = > - dispatch_width == 16 && inst->regs_written % 2 == 0 ? 16 : 8; > + /* Calculate the execution width of the scratch messages (which > work > + * in terms of 32 bit components so we have a fixed number of > eight > + * channels per spilled register). We attempt to write one > + * exec_size-wide component of the variable at a time without > + * exceeding the maximum number of (fake) MRF registers reserved > for > + * spills. > + */ > + const unsigned width = 8 * MIN2( > + DIV_ROUND_UP(inst->dst.component_size(inst->exec_size), > REG_SIZE), > + spill_max_size(this)); > > /* Spills should only write data initialized by the instruction > for > * whichever channels are enabled in the excution mask. If > that's > -- > 2.7.3 > > _______________________________________________ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev >
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev