On 25.04.2014 23:19, Ilia Mirkin wrote:
> On Fri, Apr 25, 2014 at 5:02 PM, Roland Scheidegger <srol...@vmware.com> wrote:
>> On 25.04.2014 19:41, Ilia Mirkin wrote:
>>> Signed-off-by: Ilia Mirkin <imir...@alum.mit.edu>
>>> ---
>>>  src/gallium/auxiliary/tgsi/tgsi_info.c     |  8 +++++
>>>  src/gallium/docs/source/tgsi.rst           | 51 ++++++++++++++++++++++++++++++
>>>  src/gallium/include/pipe/p_shader_tokens.h | 11 ++++++-
>>>  3 files changed, 69 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/src/gallium/auxiliary/tgsi/tgsi_info.c b/src/gallium/auxiliary/tgsi/tgsi_info.c
>>> index 5bcc3c9..d03a920 100644
>>> --- a/src/gallium/auxiliary/tgsi/tgsi_info.c
>>> +++ b/src/gallium/auxiliary/tgsi/tgsi_info.c
>>> @@ -223,6 +223,14 @@ static const struct tgsi_opcode_info opcode_info[TGSI_OPCODE_LAST] =
>>>     { 1, 2, 0, 0, 0, 0, COMP, "UMUL_HI", TGSI_OPCODE_UMUL_HI },
>>>     { 1, 3, 1, 0, 0, 0, OTHR, "TG4", TGSI_OPCODE_TG4 },
>>>     { 1, 2, 1, 0, 0, 0, OTHR, "LODQ", TGSI_OPCODE_LODQ },
>>> +   { 1, 3, 0, 0, 0, 0, COMP, "IBFE", TGSI_OPCODE_IBFE },
>>> +   { 1, 3, 0, 0, 0, 0, COMP, "UBFE", TGSI_OPCODE_UBFE },
>>> +   { 1, 4, 0, 0, 0, 0, COMP, "BFI", TGSI_OPCODE_BFI },
>>> +   { 1, 1, 0, 0, 0, 0, COMP, "BREV", TGSI_OPCODE_BREV },
>>> +   { 1, 1, 0, 0, 0, 0, COMP, "POPC", TGSI_OPCODE_POPC },
>>> +   { 1, 1, 0, 0, 0, 0, COMP, "LSB", TGSI_OPCODE_LSB },
>>> +   { 1, 1, 0, 0, 0, 0, COMP, "IMSB", TGSI_OPCODE_IMSB },
>>> +   { 1, 1, 0, 0, 0, 0, COMP, "UMSB", TGSI_OPCODE_UMSB },
>>>  };
>>>
>>>  const struct tgsi_opcode_info *
>>> diff --git a/src/gallium/docs/source/tgsi.rst b/src/gallium/docs/source/tgsi.rst
>>> index 0ea0759..95b069f 100644
>>> --- a/src/gallium/docs/source/tgsi.rst
>>> +++ b/src/gallium/docs/source/tgsi.rst
>>> @@ -1558,6 +1558,57 @@ Support for these opcodes indicated by PIPE_SHADER_CAP_INTEGERS (all of them?)
>>>
>>>    dst.w = |src.w|
>>>
>>> +Bitwise ISA
>>> +^^^^^^^^^^^
>>> +These opcodes are used for bit-level manipulation of integers.
>>> +
>>> +.. opcode:: IBFE - Signed Bitfield Extract
>>> +
>>> +.. math::
>>> +
>>> +  value = src0
>>> +
>>> +  offset = src1
>>> +
>>> +  bits = src2
>>> +
>>> +  dst = bitfield\_extract(value, offset, bits)
>>> +
>>> +.. opcode:: UBFE - Unsigned Bitfield Extract
>>> +
>>> +.. math::
>>> +
>>> +  value = src0
>>> +
>>> +  offset = src1
>>> +
>>> +  bits = src2
>>> +
>>> +  dst = bitfield\_extract(value, offset, bits)
>> I think the description for these two leaves a bit to be desired (you'd
>> even think they are the same).
>
> They basically are the same, except for the sign extension.

Yes of course. But you can't tell from that description.
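
To make the difference concrete, here is a minimal C sketch of what the
two opcodes might compute per channel. It assumes GLSL-style
bitfieldExtract semantics (the field is returned in the low bits,
bits == 0 yields 0, and offset + bits <= 32), which the patch does not
spell out. The helper names ubfe and ibfe are made up for illustration
and are not Mesa functions.

```c
#include <stdint.h>

/* Hypothetical helpers, not part of Mesa. Assumption: GLSL-style
 * bitfieldExtract semantics, with offset + bits <= 32. */

/* Unsigned extract: upper bits of the result are zero. */
static uint32_t ubfe(uint32_t value, unsigned offset, unsigned bits)
{
    if (bits == 0)
        return 0;
    /* Avoid an undefined 32-bit shift when the whole word is selected. */
    uint32_t mask = (bits >= 32) ? 0xffffffffu : ((1u << bits) - 1u);
    return (value >> offset) & mask;
}

/* Signed extract: the top bit of the field is replicated upward. */
static int32_t ibfe(uint32_t value, unsigned offset, unsigned bits)
{
    if (bits == 0)
        return 0;
    uint32_t field = ubfe(value, offset, bits);
    uint32_t sign = 1u << (bits - 1);
    /* Sign-extend: flip the field's sign bit, then subtract it.
     * The final cast relies on two's complement conversion. */
    return (int32_t)((field ^ sign) - sign);
}
```

With value = 0xF0, offset = 4 and bits = 4, the unsigned variant gives 15
and the signed one gives -1, which is the distinction the documentation
would need to state.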
> What's the
> standard for such operations which don't map into "math" nicely?
> Should I stick some pseudo-code in?

Some paragraph including pseudo-code is fine by me. Or you could explain
the bitfield_extract term below under the Functions section (though I'm
not sure it's such a good idea - bitfield_extract() just isn't a very
well known term).

>
>>
>>> +
>>> +.. opcode:: BFI - Bitfield Insert
>>> +
>>> +.. math::
>>> +
>>> +  base = src0
>>> +
>>> +  insert = src1
>>> +
>>> +  offset = src2
>>> +
>>> +  bits = src3
>>> +
>>> +  dst = bitfield\_insert(base, insert, offset, bits)
>> Same as above.
>>
>>> +
>>> +.. opcode:: BREV - Bitfield Reverse
>> Could also be a bit more descriptive.
>>
>>> +
>>> +.. opcode:: POPC - Population Count (Count Set Bits)
>>> +
>>> +.. opcode:: LSB - Index of lowest set bit
>>> +
>>> +.. opcode:: IMSB - Index of highest non-sign bit
>> That looks very confusing to me, since it apparently is meant to give
>> the highest set bit if the number is positive, and the highest cleared
>> bit if the number is negative.
>
> Right, so if the sign-bit is 1 (negative), it's the index of the
> highest 0. If the sign bit is 0 (positive), it's the index of the
> highest 1. And -1 if all the bits are the same. None of these at all
> map nicely to a "math" style of description. Perhaps I should just put
> in a paragraph for these?

Sounds good to me.

>
>>
>>> +
>>> +.. opcode:: UMSB - Index of highest 1-bit
>> highest set bit?
>
> Sure.
>
>>
>> Otherwise these look reasonable to me.
>> As for the addc/subb I guess this is an area where just about everything
>> you do won't really match hw in any case. A quick glance at radeonsi
>> tells me that gcn actually _always_ sets the carry bit for normal int
>> adds/subs but does so in the VCC reg - so if you'd want to get this to a
>> "normal" register you'd have to do some other instruction (maybe
>> conditional 0/1 move based on VCC).
>> However, gcn actually has subb/addc
>> instructions, these just do add/sub honoring that VCC bit (and again
>> still outputting VCC bit themselves).
>> But sm5 and glsl agree there - they both have addc/subb with just 2
>> inputs (so no carry/borrow input) but an additional "normal" overflow
>> output. Maybe this is easiest to transform into what hw will actually do
>> usually.
>
> I was hoping to not have to deal with carry/borrow at the TGSI level
> at all and just have the GLSL lower to ADD + USLT or so, and then for
> hw capable of dealing with it (not nvc0, or at least the blob driver
> doesn't make use of a mechanism that'd enable it), having a peephole
> opt that converts the USLT to a "recover wherever the flag is at".

I guess an explicit carry instruction makes it somewhat more obvious
this really came from an addc. Not sure if that really matters, though.

Roland

>
>>
>> Roland
>>
>>
>>>
>>>  Geometry ISA
>>>  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>> diff --git a/src/gallium/include/pipe/p_shader_tokens.h b/src/gallium/include/pipe/p_shader_tokens.h
>>> index b537166..d095bd3 100644
>>> --- a/src/gallium/include/pipe/p_shader_tokens.h
>>> +++ b/src/gallium/include/pipe/p_shader_tokens.h
>>> @@ -462,7 +462,16 @@ struct tgsi_property_data {
>>>
>>>  #define TGSI_OPCODE_LODQ                183
>>>
>>> -#define TGSI_OPCODE_LAST                184
>>> +#define TGSI_OPCODE_IBFE                184
>>> +#define TGSI_OPCODE_UBFE                185
>>> +#define TGSI_OPCODE_BFI                 186
>>> +#define TGSI_OPCODE_BREV                187
>>> +#define TGSI_OPCODE_POPC                188
>>> +#define TGSI_OPCODE_LSB                 189
>>> +#define TGSI_OPCODE_IMSB                190
>>> +#define TGSI_OPCODE_UMSB                191
>>> +
>>> +#define TGSI_OPCODE_LAST                192
>>>
>>>  #define TGSI_SAT_NONE            0  /* do not saturate */
>>>  #define TGSI_SAT_ZERO_ONE        1  /* clamp to [0,1] */
>>>
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev