On Thu, Jun 9, 2016 at 10:28 PM, Ilia Mirkin <imir...@alum.mit.edu> wrote:
> On Thu, Jun 9, 2016 at 4:11 PM, Ian Romanick <i...@freedesktop.org> wrote:
>> On 06/09/2016 11:26 AM, Ilia Mirkin wrote:
>>> On Thu, Jun 9, 2016 at 2:07 PM, Ian Romanick <i...@freedesktop.org> wrote:
>>>> On 06/08/2016 02:15 PM, Dave Airlie wrote:
>>>>> While writing ARB_gpu_shader_int64 I realised I needed to change
>>>>> a lot of existing checks for doubles to 64bit, so I decided to
>>>>> do that as much in advance as possible.
>>>>
>>>> I didn't know you were working on that. I just started poking at more
>>>> general sized integer support too. I wanted to add support for 8, 16,
>>>> and 64-bit types.
>>>
>>> Might be worth noting that NVIDIA has some support for "SIMD"
>>> operations on 16- and 8-bit sized values packed in a 32-bit integer.
>>> You can see what operations are supported by looking up "video
>>> instructions" in the PTX ISA - those roughly map 1:1 with the
>>> hardware. However I've never seen NVIDIA blob actually generate them,
>>> even with NV_gpu_shader5's u8vec4 and such. I don't know how this
>>> changes on Pascal, which is rumored to support fp16 ALU natively.
>>
>> Have you tried feeding it PTX directly? It could just be a limitation
>> of the GLSL compiler.
>
> I haven't. Although I suspect that if I tell PTX to emit a particular
> instruction, then it will convert it to the proper ISA encoding and
> emit it, since they really do map 1:1 last I looked. I was more
> surprised that u8vec4 + u8vec4 didn't end up using it, and instead did
> the adds as 4x32-bit and then re-extracted the low 8 bits. Perhaps
> NVIDIA knows something I don't, or perhaps like you say, their GLSL
> compiler is just not smart enough to do it. Or perhaps that specific
> case caused them to decide not to do it, but a different case would
> have used it (probably various issues with instruction latencies, dual
> issue capabilities, etc).
>
> I had originally proposed using this feature to the dolphin team, who
> has a ton of u8's in their shaders that they constantly bit-mask and
> clamp, but when I saw what the blob was going to do with those, I
> withdrew that suggestion.
>
>>
>>>> What's your hardware support plan? I think that any hardware that can
>>>> do uaddCarry, usubBorrow, [ui]mulExtended, and findMSB can implement
>>>> everything in a relatively efficient manner. I've coded almost all of
>>>> the possible 64-bit operations in GLSL using ivec2 or uvec2 and these
>>>> primitives as a proof of concept. Less efficient implementations of
>>>> everything are possible if any of those primitives are missing.
>>>> Technically speaking, it ought to be possible to expose 64-bit integer
>>>> support on *any* hardware that has true integers.
>>>>
>>>> I'm currently leaning towards implementing these as a NIR lowering pass,
>>>> but there are other possibilities. There are advantages to doing the
>>>> lowering after most or all of the device independent optimizations. In
>>>> addition, doing it completely in NIR means that we can get 64-bit
>>>> integer support for SPIR-V nearly for free. I've also considered GLSL
>>>> IR lowering or lowering while translating GLSL IR to NIR.
>>>
>>> While I can't speak for AMD hw, NVIDIA has some limited support for 64-bit
>>> ints:
>>>
>>> (a) atomics
>>> (b) shifts (so you don't have to use a temp + bitfield manipulation to
>>> shift from one 32-bit val to another)
>>> (c) conversion between float/double and 64-bit ints
>>
>> Yeah, some Intel hardware is similar. I suspect we'd want to have a
>> bitfield to select which specific operations or groups of operations
>> actually need to be lowered. Jason and Ken reminded me that we already
>> do basically the same thing for fp64.
>>
>>> And things like addition can be done using things like carry bits.
>>> We have a pass to auto-lower 64-bit integer ops at the "end" so that
>>> splitting them up doesn't affect things like constant propagation and
>>> other optimizations. [I'm sure it'll need adjusting for a full 64-bit
>>> int implementation, it mostly ends up getting used with address
>>> calculations.] So I'd be highly in favor of (a) letting the backend
>>> deal with it and (b) having the requisite TGSI opcodes to express it
>>> all cleanly [which is what Dave has done].
>>
>> We'll definitely need support in the lower-level IRs. Current and
>> future GPUs have various levels of native support. We really want to
>> take advantage of that. Some drivers will also want to implement their
>> own lowering for some things. For example, before Gen7, Intel GPUs
>> didn't have a 32x32->64 multiplier. They have a 16x32->48 multiplier
>> (I'm not kidding) that can be used to simulate a 32x32->64 multiplier.
>> I think we can use that in a clever way to generate a 64x64->64 result
>> more efficiently than would come from a generic lowering pass that uses
>> 32x32->64 multiplications.
>
> nv50 has 24x24 -> 32 [and 16x16 -> 32]. Loads of fun to implement
> imulExtended() on that - you still have to compute the low bits for
> the carry information. nvc0 all has the regular 32x32 -> low/high 32
> logic, with optional carry addition/generation, so it's no trouble.
>
>>
>> At the same time, if implementing lowering once at a higher level means
>> that we can enable a feature in more places more quickly, that seems
>> like winning. I think blending the two approaches will lead to the best
>> overall result. I doubt Marek will spend any effort implementing 64-bit
>> integer support for r600. If the real work of adding that support
>> happened at higher levels of Mesa, I bet he'd accept patches.
>> :)
>
> I'm in no way opposed to having shareable "fudging" logic, so that it
> can be used by drivers with less sophisticated backends, or ones that
> are getting less development interest. Just want to make sure that a
> way to let the backend just deal with it remains.
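To make the "compute the low bits for the carry information" point in the quoted text concrete, here is a rough sketch in plain C of an unsigned 32x32 -> 64 multiply built only from 16x16 -> 32 products. This is not Mesa or nouveau code: `u32pair`, `mul16x16` and `umul_extended_16` are hypothetical names, `mul16x16` only stands in for a hardware 16x16 -> 32 multiply, and only the unsigned case is shown (the signed imulExtended() case needs extra fix-ups).

```c
#include <stdint.h>

/* Hypothetical result type: low and high 32-bit halves of a 64-bit product. */
typedef struct { uint32_t lo, hi; } u32pair;

/* Stand-in for a hardware 16x16 -> 32 multiply (assumption, not a real API). */
static uint32_t mul16x16(uint32_t a, uint32_t b)
{
    return (a & 0xffffu) * (b & 0xffffu);
}

/* Unsigned 32x32 -> 64 multiply using four 16x16 -> 32 partial products. */
static u32pair umul_extended_16(uint32_t a, uint32_t b)
{
    uint32_t al = a & 0xffffu, ah = a >> 16;
    uint32_t bl = b & 0xffffu, bh = b >> 16;

    uint32_t ll = mul16x16(al, bl);   /* contributes to bits  0..31 */
    uint32_t lh = mul16x16(al, bh);   /* contributes to bits 16..47 */
    uint32_t hl = mul16x16(ah, bl);   /* contributes to bits 16..47 */
    uint32_t hh = mul16x16(ah, bh);   /* contributes to bits 32..63 */

    /* Sum everything that lands in bits 16..31. The low bits must be
     * computed even if only the high half is wanted, because their
     * carry propagates upward. The sum is at most 0x2fffd, so it
     * cannot overflow 32 bits. */
    uint32_t mid = (ll >> 16) + (lh & 0xffffu) + (hl & 0xffffu);

    u32pair r;
    r.lo = (ll & 0xffffu) | (mid << 16);
    r.hi = hh + (lh >> 16) + (hl >> 16) + (mid >> 16);
    return r;
}
```

The same decomposition applies to other narrow multipliers; only the split points and the carry bookkeeping change.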

For r600, the lowering to int32 should happen in a common place
(e.g. GLSL IR). For radeonsi, we'd like to get all int64 opcodes because
we already have full int64 support in the LLVM backend.

Marek
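As a rough illustration of what "lowering to int32" amounts to for the simplest operations, here is a sketch in plain C of 64-bit add and subtract on a pair of 32-bit halves. It is not taken from Mesa. The `u64pair` type and the `add64`/`sub64` names are hypothetical, and `uadd_carry`/`usub_borrow` are only assumed to mirror the contract of the GLSL built-ins uaddCarry() and usubBorrow().

```c
#include <stdint.h>

/* Hypothetical stand-in for a GLSL uvec2 holding one 64-bit value. */
typedef struct { uint32_t lo, hi; } u64pair;

/* Modeled on GLSL uaddCarry(): returns the sum modulo 2^32 and sets
 * *carry to 1 if the true sum did not fit in 32 bits, else 0. */
static uint32_t uadd_carry(uint32_t x, uint32_t y, uint32_t *carry)
{
    uint32_t sum = x + y;
    *carry = (sum < x) ? 1u : 0u;
    return sum;
}

/* Modeled on GLSL usubBorrow(): returns x - y modulo 2^32 and sets
 * *borrow to 1 if x < y, else 0. */
static uint32_t usub_borrow(uint32_t x, uint32_t y, uint32_t *borrow)
{
    *borrow = (x < y) ? 1u : 0u;
    return x - y;
}

/* 64-bit add: add the low halves, then feed the carry into the high halves. */
static u64pair add64(u64pair a, u64pair b)
{
    uint32_t carry;
    u64pair r;
    r.lo = uadd_carry(a.lo, b.lo, &carry);
    r.hi = a.hi + b.hi + carry;
    return r;
}

/* 64-bit subtract: subtract the low halves, then take the borrow
 * out of the high halves. */
static u64pair sub64(u64pair a, u64pair b)
{
    uint32_t borrow;
    u64pair r;
    r.lo = usub_borrow(a.lo, b.lo, &borrow);
    r.hi = a.hi - b.hi - borrow;
    return r;
}
```

A shared lowering pass would emit this kind of sequence once for every driver that lacks native int64, while backends with native support would keep the int64 opcodes untouched.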