On Fri, Apr 25, 2014 at 5:02 PM, Roland Scheidegger <srol...@vmware.com> wrote:
> On 25.04.2014 19:41, Ilia Mirkin wrote:
>> Signed-off-by: Ilia Mirkin <imir...@alum.mit.edu>
>> ---
>>  src/gallium/auxiliary/tgsi/tgsi_info.c     |  8 +++++
>>  src/gallium/docs/source/tgsi.rst           | 51
>> ++++++++++++++++++++++++++++++
>>  src/gallium/include/pipe/p_shader_tokens.h | 11 ++++++-
>>  3 files changed, 69 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/gallium/auxiliary/tgsi/tgsi_info.c
>> b/src/gallium/auxiliary/tgsi/tgsi_info.c
>> index 5bcc3c9..d03a920 100644
>> --- a/src/gallium/auxiliary/tgsi/tgsi_info.c
>> +++ b/src/gallium/auxiliary/tgsi/tgsi_info.c
>> @@ -223,6 +223,14 @@ static const struct tgsi_opcode_info
>> opcode_info[TGSI_OPCODE_LAST] =
>>    { 1, 2, 0, 0, 0, 0, COMP, "UMUL_HI", TGSI_OPCODE_UMUL_HI },
>>    { 1, 3, 1, 0, 0, 0, OTHR, "TG4", TGSI_OPCODE_TG4 },
>>    { 1, 2, 1, 0, 0, 0, OTHR, "LODQ", TGSI_OPCODE_LODQ },
>> +  { 1, 3, 0, 0, 0, 0, COMP, "IBFE", TGSI_OPCODE_IBFE },
>> +  { 1, 3, 0, 0, 0, 0, COMP, "UBFE", TGSI_OPCODE_UBFE },
>> +  { 1, 4, 0, 0, 0, 0, COMP, "BFI", TGSI_OPCODE_BFI },
>> +  { 1, 1, 0, 0, 0, 0, COMP, "BREV", TGSI_OPCODE_BREV },
>> +  { 1, 1, 0, 0, 0, 0, COMP, "POPC", TGSI_OPCODE_POPC },
>> +  { 1, 1, 0, 0, 0, 0, COMP, "LSB", TGSI_OPCODE_LSB },
>> +  { 1, 1, 0, 0, 0, 0, COMP, "IMSB", TGSI_OPCODE_IMSB },
>> +  { 1, 1, 0, 0, 0, 0, COMP, "UMSB", TGSI_OPCODE_UMSB },
>>  };
>>
>>  const struct tgsi_opcode_info *
>> diff --git a/src/gallium/docs/source/tgsi.rst
>> b/src/gallium/docs/source/tgsi.rst
>> index 0ea0759..95b069f 100644
>> --- a/src/gallium/docs/source/tgsi.rst
>> +++ b/src/gallium/docs/source/tgsi.rst
>> @@ -1558,6 +1558,57 @@ Support for these opcodes indicated by
>> PIPE_SHADER_CAP_INTEGERS (all of them?)
>>
>>    dst.w = |src.w|
>>
>> +Bitwise ISA
>> +^^^^^^^^^^^
>> +These opcodes are used for bit-level manipulation of integers.
>> +
>> +.. opcode:: IBFE - Signed Bitfield Extract
>> +
>> +.. math::
>> +
>> +  value = src0
>> +
>> +  offset = src1
>> +
>> +  bits = src2
>> +
>> +  dst = bitfield\_extract(value, offset, bits)
>> +
>> +.. opcode:: UBFE - Unsigned Bitfield Extract
>> +
>> +.. math::
>> +
>> +  value = src0
>> +
>> +  offset = src1
>> +
>> +  bits = src2
>> +
>> +  dst = bitfield\_extract(value, offset, bits)
> I think the description for these two leaves a bit to be desired (you'd
> even think they are the same).

They basically are the same, except for the sign extension. What's the
standard for such operations which don't map into "math" nicely?
Should I stick some pseudo-code in?

>
>> +
>> +.. opcode:: BFI - Bitfield Insert
>> +
>> +.. math::
>> +
>> +  base = src0
>> +
>> +  insert = src1
>> +
>> +  offset = src2
>> +
>> +  bits = src3
>> +
>> +  dst = bitfield\_insert(base, insert, offset, bits)
> Same as above.
>
>> +
>> +.. opcode:: BREV - Bitfield Reverse
> Could also be a bit more descriptive.
>
>> +
>> +.. opcode:: POPC - Population Count (Count Set Bits)
>> +
>> +.. opcode:: LSB - Index of lowest set bit
>> +
>> +.. opcode:: IMSB - Index of highest non-sign bit
> That looks very confusing to me, since it apparently is meant to give
> the highest set bit if the number is positive, and the highest cleared
> bit if the number is negative.

Right, so if the sign bit is 1 (negative), it's the index of the
highest 0. If the sign bit is 0 (positive), it's the index of the
highest 1. And -1 if all the bits are the same.

None of these map at all nicely to a "math" style of description.
Perhaps I should just put in a paragraph for these?

>
>> +
>> +.. opcode:: UMSB - Index of highest 1-bit
> highest set bit?

Sure.

>
> Otherwise these look reasonable to me.
> As for the addc/subb I guess this is an area where just about everything
> you do won't really match hw in any case. A quick glance at radeonsi
> tells me that gcn actually _always_ sets the carry bit for normal int
> adds/subs but does so in the VCC reg - so if you'd want to get this to a
> "normal" register you'd have to do some other instruction (maybe
> conditional 0/1 move based on VCC). However, gcn actually has subb/addc
> instructions, these just do add/sub honoring that VCC bit (and again
> still outputting VCC bit themselves).
> But sm5 and glsl agree there - they both have addc/subb with just 2
> inputs (so no carry/borrow input) but an additional "normal" overflow
> output. Maybe this is easiest to transform into what hw will actually do
> usually.

I was hoping to not have to deal with carry/borrow at the TGSI level at
all and just have the GLSL lower to ADD + USLT or so, and then for hw
capable of dealing with it (not nvc0, or at least the blob driver
doesn't make use of a mechanism that'd enable it), having a peephole
opt that converts the USLT to a "recover wherever the flag is at".

>
> Roland
>
>
>>
>> Geometry ISA
>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>> diff --git a/src/gallium/include/pipe/p_shader_tokens.h
>> b/src/gallium/include/pipe/p_shader_tokens.h
>> index b537166..d095bd3 100644
>> --- a/src/gallium/include/pipe/p_shader_tokens.h
>> +++ b/src/gallium/include/pipe/p_shader_tokens.h
>> @@ -462,7 +462,16 @@ struct tgsi_property_data {
>>
>>  #define TGSI_OPCODE_LODQ 183
>>
>> -#define TGSI_OPCODE_LAST 184
>> +#define TGSI_OPCODE_IBFE 184
>> +#define TGSI_OPCODE_UBFE 185
>> +#define TGSI_OPCODE_BFI 186
>> +#define TGSI_OPCODE_BREV 187
>> +#define TGSI_OPCODE_POPC 188
>> +#define TGSI_OPCODE_LSB 189
>> +#define TGSI_OPCODE_IMSB 190
>> +#define TGSI_OPCODE_UMSB 191
>> +
>> +#define TGSI_OPCODE_LAST 192
>>
>>  #define TGSI_SAT_NONE 0 /* do not saturate */
>>  #define TGSI_SAT_ZERO_ONE 1 /* clamp to [0,1] */
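
For reference, the semantics discussed in this thread could be written down as C pseudo-code roughly as follows. This is only a sketch under stated assumptions, not what the patch or any driver implements: the helper names (ubfe, ibfe, umsb, imsb, add_carry) are made up for illustration, 32-bit operands are assumed, the handling of out-of-range offset/bits values is a guess, and returning -1 from umsb for a zero input is borrowed from GLSL's findMSB rather than stated in the patch. The last helper shows the "ADD + USLT" style of carry recovery mentioned above, with the carry normalized to 0/1.

```c
#include <stdint.h>

/* UBFE sketch: extract `bits` bits starting at `offset`, zero-extended.
 * Assumption: out-of-range requests yield 0 or are clamped to 32 bits. */
static uint32_t ubfe(uint32_t value, unsigned offset, unsigned bits)
{
    if (bits == 0 || offset >= 32)
        return 0;
    if (bits > 32 - offset)
        bits = 32 - offset;
    uint32_t mask = (bits == 32) ? 0xffffffffu : ((1u << bits) - 1u);
    return (value >> offset) & mask;
}

/* IBFE sketch: same extraction, but the top bit of the field is
 * replicated into the upper bits (sign extension). */
static int32_t ibfe(uint32_t value, unsigned offset, unsigned bits)
{
    if (bits == 0 || offset >= 32)
        return 0;
    if (bits > 32 - offset)
        bits = 32 - offset;
    uint32_t field = ubfe(value, offset, bits);
    uint32_t sign = 1u << (bits - 1);
    /* (x ^ s) - s sign-extends a field whose sign bit is s. The final
     * conversion to int32_t relies on two's complement wrapping. */
    return (int32_t)((field ^ sign) - sign);
}

/* UMSB sketch: index of the highest set bit, or -1 if no bit is set
 * (assumption, following GLSL findMSB). */
static int umsb(uint32_t v)
{
    for (int i = 31; i >= 0; i--)
        if (v & (1u << i))
            return i;
    return -1;
}

/* IMSB sketch: for negative inputs, index of the highest 0 bit; for
 * non-negative inputs, index of the highest 1 bit; -1 if all bits are
 * the same (0 or -1). */
static int imsb(int32_t v)
{
    uint32_t u = (uint32_t)v;
    if (v < 0)
        u = ~u; /* highest 0 of v is the highest 1 of ~v */
    return umsb(u);
}

/* Carry recovery without a dedicated opcode: an unsigned add followed
 * by an unsigned "less than" compare of the sum against one operand. */
static uint32_t add_carry(uint32_t a, uint32_t b, uint32_t *carry)
{
    uint32_t sum = a + b;           /* ADD: wraps modulo 2^32 */
    *carry = (sum < a) ? 1u : 0u;   /* USLT-style compare: wrapped => carry */
    return sum;
}
```

With these definitions, ubfe and ibfe differ only when the extracted field has its top bit set, which is the sign-extension difference noted in the reply, and imsb returns -1 for both 0 and -1.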