On Mon, Jun 11, 2018 at 10:59 PM, Roland Scheidegger <srol...@vmware.com> wrote: > Am 12.06.2018 um 01:17 schrieb Rob Clark: >> On Mon, Jun 11, 2018 at 6:59 PM, Roland Scheidegger <srol...@vmware.com> >> wrote: >>> Am 12.06.2018 um 00:32 schrieb Jason Ekstrand: >>>> On Wed, Jun 6, 2018 at 7:43 AM, Rob Clark <robdcl...@gmail.com >>>> <mailto:robdcl...@gmail.com>> wrote: >>>> >>>> Signed-off-by: Rob Clark <robdcl...@gmail.com >>>> <mailto:robdcl...@gmail.com>> >>>> --- >>>> I can't say for sure that this will work on all drivers, but it is >>>> what the blob driver does, and it seems to make deqp happy. I could >>>> move this to it's own pass inside ir3, but that seemed like overkill >>>> >>>> src/compiler/nir/nir.h | 10 ++++++++++ >>>> src/compiler/nir/nir_lower_system_values.c | 17 +++++++++++++++++ >>>> src/gallium/drivers/freedreno/ir3/ir3_nir.c | 1 + >>>> 3 files changed, 28 insertions(+) >>>> >>>> diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h >>>> index 073ab4e82ea..de3d55d83af 100644 >>>> --- a/src/compiler/nir/nir.h >>>> +++ b/src/compiler/nir/nir.h >>>> @@ -1963,6 +1963,16 @@ typedef struct nir_shader_compiler_options { >>>> */ >>>> bool lower_base_vertex; >>>> >>>> + /** >>>> + * If enabled, gl_HelperInvocation will be lowered as: >>>> + * >>>> + * !((1 << gl_SampleID) & gl_SampleMaskIN[0])) >>>> >>>> >>>> This only works for multi-sampling. What about the single-sampled case? >>> It doesn't make sense to me for msaa neither. >>> gl_SampleID forces per-sample execution, which clearly isn't what you >>> want. >> >> open to suggestions about how to try and force blob to do something >> different.. I tried a few things and the blob popped out shader code >> that worked the same way in all cases, but ofc I could have missed >> something. >> >>> Plus, gl_SampleMaskIN is specified to only contain bits for the >>> current shader invocation, so for msaa with forced per-sample execution >>> that would only contain the single bit corresponding to gl_SampleID >>> anyway (that is my interpretation at least - I know hw sample mask >>> inputs will probably always contains all the bits from rasterization, >>> regardless of gl_SampleID), so the "1 << gl_SampleID &" part will do >>> nothing at all. So I think that part is more about lowering the hw >>> rasterization sample mask to gl_SampleMaskIN rather than lowering to >>> gl_HelperInvocation. >>> But yes, !gl_SampleMaskIN should give gl_HelperInvocation - I think all >>> hw can give you raster sample mask even without msaa but I'm not >>> entirely sure if it's guaranteed to work in GL with single sampling >>> (similar for sample id, which should just be stuck at 0). But using the >>> gl names here looks very, very fishy to me here. >> >> I *guess* this is relying on behaviour that gl does not guarantee, but >> also doesn't prohibit, but the hw happens work that way and it works >> out.. >> >> Maybe the answer is to simply better document the assumptions that >> this lowering depends on? > Well I think if you have a nir shader which uses sample id but is > supposed to not be run per-sample that is quite confusing (I'd just > assume that adrenos will use sample id 0 just like in single sampled > case). Hopefully noone will deduce shader frequency from the nir shader... > I'm definitely no NIR expert but there's quite a lot of hacky > assumptions in there: > - sample id in nir doesn't cause per-sample execution (and needs to be 0 > in case of msaa but no per-sample execution) > - sampleMaskIn is available in single-sampled > - sampleMaskIn doesn't actually have gl semantics but hw ones (so simply > the mask from rasterization, not taking into account the sample id > already if there's per-sample execution (for legitimate reasons), > although as mentioned I think there's still some debate about this and > not all drivers may actually agree what the semantics of it are in case > of per-sample execution) > > As such I'm not sure it really makes sense to do that as a generic > lowering pass? Albeit you're quite right that it might work for a lot of > hw... But some comments would at the very least imho definitely be required. >
so I reworded the comment about the enable flag a bit: /** * If enabled, gl_HelperInvocation will be lowered as: * * !((1 << sample_id) & sample_mask_in)) * * This depends on some possibly hw implementation details, which may * not be true for all hw. In particular that the FS is only executed * for covered samples or for helper invocations. So, do not blindly * enable this option. */ bool lower_helper_invocation; But I guess I probably do need to introduce a new intrinsic to access sample_id without necessarily triggering per-sample mode.. hmm.. BR, -R > Roland > > > >> >> BR, >> -R >> >>> >>> Roland >>> >>> >>> >>> >>>> >>>> --Jason >>>> >>>> >>>> >>>> + * >>>> + * TODO any hw w/ more than 32 samples? For them (if they >>>> + * used this option), a bit more math would be involved. >>>> + */ >>>> + bool lower_helper_invocation; >>>> + >>>> bool lower_cs_local_index_from_id; >>>> >>>> bool lower_device_index_to_zero; >>>> diff --git a/src/compiler/nir/nir_lower_system_values.c >>>> b/src/compiler/nir/nir_lower_system_values.c >>>> index 487da042620..6668cbb5dcd 100644 >>>> --- a/src/compiler/nir/nir_lower_system_values.c >>>> +++ b/src/compiler/nir/nir_lower_system_values.c >>>> @@ -136,6 +136,23 @@ convert_block(nir_block *block, nir_builder *b) >>>> nir_load_first_vertex(b)); >>>> break; >>>> >>>> + case SYSTEM_VALUE_HELPER_INVOCATION: >>>> + if (b->shader->options->lower_helper_invocation) { >>>> + nir_ssa_def *tmp; >>>> + >>>> + tmp = nir_ushr(b, >>>> + nir_imm_int(b, 1), >>>> + nir_load_sample_id(b)); >>>> + >>>> + tmp = nir_iand(b, >>>> + nir_load_sample_mask_in(b), >>>> + tmp); >>>> + >>>> + sysval = nir_inot(b, nir_i2b(b, tmp)); >>>> + } >>>> + >>>> + break; >>>> + >>>> case SYSTEM_VALUE_INSTANCE_INDEX: >>>> sysval = nir_iadd(b, >>>> nir_load_instance_id(b), >>>> diff --git a/src/gallium/drivers/freedreno/ir3/ir3_nir.c >>>> b/src/gallium/drivers/freedreno/ir3/ir3_nir.c >>>> index cd1f9c526f2..341d990b269 100644 >>>> --- a/src/gallium/drivers/freedreno/ir3/ir3_nir.c >>>> +++ b/src/gallium/drivers/freedreno/ir3/ir3_nir.c >>>> @@ -51,6 +51,7 @@ static const nir_shader_compiler_options options = { >>>> .lower_extract_byte = true, >>>> .lower_extract_word = true, >>>> .lower_all_io_to_temps = true, >>>> + .lower_helper_invocation = true, >>>> }; >>>> >>>> struct nir_shader * >>>> -- >>>> 2.17.0 >>>> >>>> _______________________________________________ >>>> mesa-dev mailing list >>>> mesa-dev@lists.freedesktop.org <mailto:mesa-dev@lists.freedesktop.org> >>>> >>>> https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Fmesa-dev&data=02%7C01%7Csroland%40vmware.com%7C3de626b5c92844260bfb08d5cff17c34%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C1%7C0%7C636643558433673773&sdata=rA8s3IuEmEOR6NKdxwa5eTIwAOOhbAQWNAxW6FmiUOU%3D&reserved=0 >>>> >>>> <https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Fmesa-dev&data=02%7C01%7Csroland%40vmware.com%7Cc50ea7f234a34e635ac308d5cfeb432a%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C1%7C0%7C636643531719133589&sdata=k8EjcVGoBZyrmPfu5s5yUFZ7OerpIrUi3W3bK7qVz5o%3D&reserved=0> >>>> >>>> >>>> >>>> >>>> _______________________________________________ >>>> mesa-dev mailing list >>>> mesa-dev@lists.freedesktop.org >>>> https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Fmesa-dev&data=02%7C01%7Csroland%40vmware.com%7Cc50ea7f234a34e635ac308d5cfeb432a%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C1%7C0%7C636643531719133589&sdata=k8EjcVGoBZyrmPfu5s5yUFZ7OerpIrUi3W3bK7qVz5o%3D&reserved=0 >>>> >>> > _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev