On Mon, Apr 21, 2014 at 12:56 PM, Matt Turner <matts...@gmail.com> wrote:
> On Mon, Apr 21, 2014 at 8:54 AM, Ilia Mirkin <imir...@alum.mit.edu> wrote:
>> Hello,
>>
>> I've been giving some thought to catching up with core mesa on ARB_gs5
>> support. One of the things that ARB_gs5 introduces are new operations:
>>
>> genType frexp(genType x, out genIType exp);
>> genType ldexp(genType x, in genIType exp);
>>
>> genIType bitfieldExtract(genIType value, int offset, int bits);
>> genUType bitfieldExtract(genUType value, int offset, int bits);
>>
>> genIType bitfieldInsert(genIType base, genIType insert, int offset,
>>                         int bits);
>> genUType bitfieldInsert(genUType base, genUType insert, int offset,
>>                         int bits);
>>
>> genIType bitfieldReverse(genIType value);
>> genUType bitfieldReverse(genUType value);
>>
>> genIType bitCount(genIType value);
>> genIType bitCount(genUType value);
>>
>> genIType findLSB(genIType value);
>> genIType findLSB(genUType value);
>>
>> genIType findMSB(genIType value);
>> genIType findMSB(genUType value);
>>
>> genUType uaddCarry(genUType x, genUType y, out genUType carry);
>> genUType usubBorrow(genUType x, genUType y, out genUType borrow);
>>
>> void umulExtended(genUType x, genUType y, out genUType msb,
>>                   out genUType lsb);
>> void imulExtended(genIType x, genIType y, out genIType msb,
>>                   out genIType lsb);
>>
>> (I've skipped the packing stuff since that seems to already be
>> supported/lowered elsewhere, i2f/f2i which is already handled, and the
>> texture gather stuff, for which support already exists. And the
>> interpolateAt* stuff which isn't supported by core mesa yet, and when
>> it is, will require a very diff kind of handling than the above.)
>>
>> I guess the only drivers one really needs to worry about here are
>> r600/radeonsi and nouveau. svga is largely a passthrough afaik, and
>> llvmpipe/softpipe is software and can thus implement it however it
>> wants.
>>
>> Looking at the nvc0+ shader ISA, there are instructions to directly
>> handle all the bitfield stuff (bitfieldExtract, bitfieldInsert,
>> bitfieldReverse, bitCount, findLSB, findMSB). There is also a "mul
>> high", which is what the *mulExtended stuff gets translated into.
>>
>> There are no instructions to handle frexp/ldexp, or the add carry/sub
>> borrow stuff. (Looking at the code the blob generates, they just do
>> all that "by hand". Even though there is a "set cc" flag on those
>> instructions which one might assume has the carry. But the blob didn't
>> use it.)
>>
>> So I was thinking that we could just take the relevant SM5
>> instructions and lower the rest. Specifically, these would be the new
>> opcodes:
>>
>> IBFE
>> UBFE
>> BFI
>> BREV (not BFREV since most instructions appear to be 3/4 letters)
>> POPC (shorter than "countbits")
>> LSB
>> UMSB
>> IMSB
>> IMULHI
>>
>> I just took a look at the Radeon SI ISA, and it does seem like it has
>> ldexp/frexp instructions, as well as setting the carry flag for
>> addc/subb. Although since TGSI doesn't have flags or multiple
>> destinations, not sure how the latter 2 could be easily encoded in the
>> glsl->tgsi translation.
>>
>> Thoughts/opinions before I go and implement the above? Is someone else
>> already working on this?
>
> I've written lowering code for ldexp/frexp. It relies on support for

The lowering code for ldexp is optional, but the frexp one seems to be
"required" (in that there is no ir_binop_frexp at all). If RadeonSI
wants to make use of its built-in frexp instruction, they'll either
need to change it, or have a _really_ clever peephole pass. (Didn't
check if the r600 ISA has the same thing...)

> EXT_shader_integer_mix, which disappointingly no other Mesa drivers
> have exposed.

http://gallium.readthedocs.org/en/latest/tgsi.html#opcode-UCMP

I assume that's the same thing? If so, that extension can probably
just be exposed as-is on gallium for drivers that support
NativeIntegers.

>
> For the multi-destination built-ins, i965 has multi-destination
> instructions (addc, subb) which write the carry/borrow to the
> accumulator register. Instead of doing a ton of infrastructure to
> support multi-destination IR I emit an add and an addc for uaddCarry and
> only use the carry result from addc. A peephole optimization can
> easily combine the add/addc into a single addc.

Hm, neat idea. But the same peephole pass could, instead, be used to
detect

  UADD x, a, b
  USLT y, x, a

And then you don't need the special ADDC instruction. And you get the
advantage of being able to detect (some) people who were doing this by
hand before.

  -ilia
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
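
For reference, the UADD/USLT pair above is just the usual by-hand carry
check: an unsigned add wrapped if and only if the result is smaller than
one of the operands. A minimal C sketch of that idea, assuming 32-bit
unsigned scalars. The names uadd_carry and usub_borrow are made up for
illustration and are not Mesa or TGSI API, and the borrow version is my
own extension of the same trick rather than something proposed above:

```c
#include <stdint.h>

/* Hypothetical helper (not Mesa API): uaddCarry done by hand.
 * Mirrors the pattern   UADD x, a, b ; USLT y, x, a   */
static uint32_t uadd_carry(uint32_t a, uint32_t b, uint32_t *carry)
{
    uint32_t x = a + b;   /* UADD: unsigned add, wraps modulo 2^32 */
    *carry = (x < a);     /* USLT: 1 only if the add wrapped around */
    return x;
}

/* Hypothetical helper (not Mesa API): usubBorrow done by hand.
 * A borrow is needed exactly when a < b. */
static uint32_t usub_borrow(uint32_t a, uint32_t b, uint32_t *borrow)
{
    *borrow = (a < b);    /* 1 if the subtraction wraps below zero */
    return a - b;         /* unsigned subtract, wraps modulo 2^32 */
}
```

One caveat: C comparisons yield 0 or 1, which matches what GLSL wants in
the carry/borrow output. As far as I can tell from the TGSI docs, USLT
produces an all-ones mask for true, so an actual lowering would still
need to turn that into 1 (e.g. by masking with 1).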