Matt Turner <matts...@gmail.com> writes: > The implementations of the ARB_shader_group_vote intrinsics will > explicitly write the flag as the destination register. > --- > src/intel/compiler/brw_fs.cpp | 12 ++++++++++-- > 1 file changed, 10 insertions(+), 2 deletions(-) > > diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp > index 43b6e34204..97908a4563 100644 > --- a/src/intel/compiler/brw_fs.cpp > +++ b/src/intel/compiler/brw_fs.cpp > @@ -890,9 +890,17 @@ fs_inst::flags_written() const > opcode != BRW_OPCODE_WHILE)) || > opcode == FS_OPCODE_MOV_DISPATCH_TO_FLAGS) { > return flag_mask(this); > - } else { > - return 0; > + } else if (dst.file == ARF) { > + if (dst.nr == BRW_ARF_FLAG + 0 && dst.subnr == 0) > + return 0b0001; > + if (dst.nr == BRW_ARF_FLAG + 0 && dst.subnr == 1) > + return 0b0010; > + if (dst.nr == BRW_ARF_FLAG + 1 && dst.subnr == 0) > + return 0b0100; > + if (dst.nr == BRW_ARF_FLAG + 1 && dst.subnr == 1) > + return 0b1000;
This doesn't look right to me, it assumes that flag registers are 16b (they're 32b), and that you can only write a single byte of the flag register at a time, which is hardly ever the case. I think you want something along the lines of: | unsigned | bit_mask(unsigned n) | { | return (n >= CHAR_BIT * sizeof(bit_mask(n)) ? ~0u : (1 << n) - 1); | } | | unsigned | flag_mask(const fs_reg &r, unsigned sz) | { | if (r.file == ARF) { | const unsigned start = (r.nr - BRW_ARF_FLAG) * 4 + r.subnr; | const unsigned end = start + sz; | return bit_mask(end) & ~bit_mask(start); | } else { | return 0; | } | } [Yes, this should give you the right result (i.e. zero) for ARF registers outside the range of the (flag) ARF window of the 32-bit bitmask.] Notice the similarity between this and the current flag_mask() overload. This will allow you to easily re-use the same logic for both flags_read() and flags_written(), you just need to pass 'inst->src[i], inst->size_read(i)' and 'inst->dst, inst->size_written' as arguments respectively, ORing the results for each source in the former case. Also make sure to remove the 'XXX' comments at the top of both functions saying that explicit flag reads and writes are not handled. ;) > } > + return 0; > } > > /** > -- > 2.13.0 > > _______________________________________________ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
signature.asc
Description: PGP signature
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev