On Sat, Feb 8, 2014 at 10:22 PM, Christoph Bumiller <e0425...@student.tuwien.ac.at> wrote:
> On 07.02.2014 23:25, Dave Airlie wrote:
>>>> Doh, yes because GL has ARB_texture_gather then has stuff hidden away
>>>> in ARB_gpu_shader5 I forgot to add the extra bits which I suppose we
>>>> should do.
>>>>
>>>> So I've reposted with the component selection in src1 now.
>>> Hmm seems a bit excessive to use an extra reg for that (gather4 but only
>>> in d3d11 form uses a src_sel on the sampler reg, but that might not work).
>>> I realize this is actually more messy than I thought, since the initial
>>> ARB_texture_gather had the ability to query if multi-channel formats are
>>> allowed, but had no way to select the channel (somewhat relying on
>>> ARB_texture_swizzle to do it, though of course you can't issue multiple
>>> gathers with the same texture to get different channels that way).
>>> But glsl 4.00 version could select the channel.
>>> Is the ARB_texture_gather version actually all that useful or could you
>>> merge the two caps? That is, if you have the ability to fetch from
>>> multi-channel textures, assume you can also select the channel. The sm4
>>> version of gather4 also has the single-channel format restriction - I
>>> guess though some hw really can do 4 channels without channel selection.
>> Yeah I think I'll rethink this stuff, it looks like two caps, one for
>> MAX_COMPONENTS for ARB_texture_gather4, and just one cap for
>> TEXTURE_GATHER_SM5 support which would denote support for all the
>> ARB_GPU_shader5 bits.
>>
>>> Other than that, what about shadow samplers? Gather4 of course can't do
>>> it (because the d3d10-style opcodes have different opcodes for shadow
>>> comparisons), but the GL style opcodes are usually the same if shadow
>>> samplers or not are used. Maybe you don't want to handle that right now,
>>> just saying that if you'd want to use the same opcode you'd be missing a
>>> component in case of texture cube arrays... Since this can't be used for
>>> fixed function though I'd guess nothing would stop you from using a
>>> different opcode for shadow samplers.
>>
>> I've gotten shadow samplers to work with the current opcodes, though I
>> have to see about cube arrays if we have the running out of space to
>> put everything.
>>
>> Also the GPU_shader5 spec has a few more oddities, so you have
>> textureGatherOffset which can take a non-constant set of offset values
>> to apply to all 4 texels, then you have textureGatherOffsets which
>> only takes constants again, but 4 of them, one per texel. Looking at
>> radeon hw it appears fglrx decomposes textureGatherOffsets into
>> multiple gather instructions at the hw level but using the
>> non-constant hw support to do this. So I'm not sure if the gallium
>> interface should just support non-constant for all offsets and just
>> restrict the GL.
>
> Fwiw Fermi+ support 4 different non-constant offsets, since they're
> passed in a register anyway.
>

The problem with textureGatherOffsets is that it takes 4 constant offset
pairs but samples only the i0,j0 texel for each, unlike textureGather and
textureGatherOffset, which take a single offset and sample the full
i0,j0 -> i1,j1 footprint.

So I'd really need to know if any hw can do this effectively. If so, we'd
have to bake the gallium TG4 instruction as being:

a) no SM5 cap available: no component selection, no shadow, no
   non-constant offsets; it just takes a single constant texture offset.
b) SM5 cap available: shadow, component selection; a single non-constant
   offset gets the i0,j0 -> i1,j1 behaviour, and multiple non-constant
   offsets get the i0,j0 behaviour.

We could in theory lower textureGatherOffsets into 4 textureGatherOffset
calls, each sampling just the i0,j0 texel and writing it into a different
channel of the dst. This is what AMD hw does anyway, so I've done it in
r600g. But maybe we could/should lower this at a higher level, though to
be honest I kinda suck at writing lowering passes for GLSL or TGSI.

Dave.
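
To make the i0,j0 distinction concrete, here is a minimal reference sketch
in Python. It is not Mesa, TGSI, or r600g code, and every function name in
it is hypothetical. It assumes a single-channel texture stored as a list of
rows, integer base coordinates standing in for the already-resolved i0,j0
of the footprint, clamp-to-edge addressing, and the GL component order for
gather results (x = i0j1, y = i1j1, z = i1j0, w = i0j0).

```python
def texel(tex, i, j):
    """Fetch tex[j][i] with clamp-to-edge addressing (an assumption here)."""
    height, width = len(tex), len(tex[0])
    i = min(max(i, 0), width - 1)
    j = min(max(j, 0), height - 1)
    return tex[j][i]


def gather_offset(tex, i0, j0, offset=(0, 0)):
    """Model of textureGatherOffset: one offset, full 2x2 footprint.

    Returns (i0j1, i1j1, i1j0, i0j0), i.e. the GL x/y/z/w order.
    """
    i0 += offset[0]
    j0 += offset[1]
    i1, j1 = i0 + 1, j0 + 1
    return (texel(tex, i0, j1), texel(tex, i1, j1),
            texel(tex, i1, j0), texel(tex, i0, j0))


def gather_offsets_direct(tex, i0, j0, offsets):
    """Model of textureGatherOffsets: four offsets, only i0j0 of each."""
    return tuple(texel(tex, i0 + dx, j0 + dy) for dx, dy in offsets)


def gather_offsets_lowered(tex, i0, j0, offsets):
    """Lowering: four single-offset gathers, keeping only the i0j0 texel
    (the w component) of each and packing them into x, y, z, w of dst."""
    return tuple(gather_offset(tex, i0, j0, off)[3] for off in offsets)


# 4x4 texture where the value encodes the position: tex[j][i] = 10*j + i
tex = [[10 * j + i for i in range(4)] for j in range(4)]
offsets = [(0, 0), (1, 0), (0, 1), (-1, -1)]

print(gather_offset(tex, 1, 1))                    # full footprint at (1,1)
print(gather_offsets_direct(tex, 1, 1, offsets))   # one texel per offset
print(gather_offsets_lowered(tex, 1, 1, offsets))  # same result via lowering
```

The lowered version issues four gathers and throws away three of the four
texels from each, which is the cost being weighed against hw that can take
four offsets in a single instruction.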