On Tue, Jun 12, 2018 at 3:36 PM, Jason Ekstrand <ja...@jlekstrand.net> wrote: > On Tue, Jun 12, 2018 at 10:49 AM, Rob Clark <robdcl...@gmail.com> wrote: >> >> On Mon, Jun 11, 2018 at 10:59 PM, Roland Scheidegger <srol...@vmware.com> >> wrote: >> > Am 12.06.2018 um 01:17 schrieb Rob Clark: >> >> On Mon, Jun 11, 2018 at 6:59 PM, Roland Scheidegger >> >> <srol...@vmware.com> wrote: >> >>> Am 12.06.2018 um 00:32 schrieb Jason Ekstrand: >> >>>> On Wed, Jun 6, 2018 at 7:43 AM, Rob Clark <robdcl...@gmail.com >> >>>> <mailto:robdcl...@gmail.com>> wrote: >> >>>> >> >>>> Signed-off-by: Rob Clark <robdcl...@gmail.com >> >>>> <mailto:robdcl...@gmail.com>> >> >>>> --- >> >>>> I can't say for sure that this will work on all drivers, but it >> >>>> is >> >>>> what the blob driver does, and it seems to make deqp happy. I >> >>>> could >> >>>> move this to it's own pass inside ir3, but that seemed like >> >>>> overkill >> >>>> >> >>>> src/compiler/nir/nir.h | 10 ++++++++++ >> >>>> src/compiler/nir/nir_lower_system_values.c | 17 >> >>>> +++++++++++++++++ >> >>>> src/gallium/drivers/freedreno/ir3/ir3_nir.c | 1 + >> >>>> 3 files changed, 28 insertions(+) >> >>>> >> >>>> diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h >> >>>> index 073ab4e82ea..de3d55d83af 100644 >> >>>> --- a/src/compiler/nir/nir.h >> >>>> +++ b/src/compiler/nir/nir.h >> >>>> @@ -1963,6 +1963,16 @@ typedef struct nir_shader_compiler_options >> >>>> { >> >>>> */ >> >>>> bool lower_base_vertex; >> >>>> >> >>>> + /** >> >>>> + * If enabled, gl_HelperInvocation will be lowered as: >> >>>> + * >> >>>> + * !((1 << gl_SampleID) & gl_SampleMaskIN[0])) >> >>>> >> >>>> >> >>>> This only works for multi-sampling. What about the single-sampled >> >>>> case? >> >>> It doesn't make sense to me for msaa neither. >> >>> gl_SampleID forces per-sample execution, which clearly isn't what you >> >>> want. >> >> >> >> open to suggestions about how to try and force blob to do something >> >> different.. I tried a few things and the blob popped out shader code >> >> that worked the same way in all cases, but ofc I could have missed >> >> something. >> >> >> >>> Plus, gl_SampleMaskIN is specified to only contain bits for the >> >>> current shader invocation, so for msaa with forced per-sample >> >>> execution >> >>> that would only contain the single bit corresponding to gl_SampleID >> >>> anyway (that is my interpretation at least - I know hw sample mask >> >>> inputs will probably always contains all the bits from rasterization, >> >>> regardless of gl_SampleID), so the "1 << gl_SampleID &" part will do >> >>> nothing at all. So I think that part is more about lowering the hw >> >>> rasterization sample mask to gl_SampleMaskIN rather than lowering to >> >>> gl_HelperInvocation. >> >>> But yes, !gl_SampleMaskIN should give gl_HelperInvocation - I think >> >>> all >> >>> hw can give you raster sample mask even without msaa but I'm not >> >>> entirely sure if it's guaranteed to work in GL with single sampling >> >>> (similar for sample id, which should just be stuck at 0). But using >> >>> the >> >>> gl names here looks very, very fishy to me here. >> >> >> >> I *guess* this is relying on behaviour that gl does not guarantee, but >> >> also doesn't prohibit, but the hw happens work that way and it works >> >> out.. >> >> >> >> Maybe the answer is to simply better document the assumptions that >> >> this lowering depends on? >> > Well I think if you have a nir shader which uses sample id but is >> > supposed to not be run per-sample that is quite confusing (I'd just >> > assume that adrenos will use sample id 0 just like in single sampled >> > case). Hopefully noone will deduce shader frequency from the nir >> > shader... >> > I'm definitely no NIR expert but there's quite a lot of hacky >> > assumptions in there: >> > - sample id in nir doesn't cause per-sample execution (and needs to be 0 >> > in case of msaa but no per-sample execution) >> > - sampleMaskIn is available in single-sampled >> > - sampleMaskIn doesn't actually have gl semantics but hw ones (so simply >> > the mask from rasterization, not taking into account the sample id >> > already if there's per-sample execution (for legitimate reasons), >> > although as mentioned I think there's still some debate about this and >> > not all drivers may actually agree what the semantics of it are in case >> > of per-sample execution) >> > >> > As such I'm not sure it really makes sense to do that as a generic >> > lowering pass? Albeit you're quite right that it might work for a lot of >> > hw... But some comments would at the very least imho definitely be >> > required. >> > >> >> so I reworded the comment about the enable flag a bit: >> >> /** >> * If enabled, gl_HelperInvocation will be lowered as: >> * >> * !((1 << sample_id) & sample_mask_in)) >> * >> * This depends on some possibly hw implementation details, which may >> * not be true for all hw. In particular that the FS is only executed >> * for covered samples or for helper invocations. So, do not blindly >> * enable this option. >> */ >> bool lower_helper_invocation; >> >> But I guess I probably do need to introduce a new intrinsic to access >> sample_id without necessarily triggering per-sample mode.. hmm.. > > > Interesting side-note: Issue 22 of the ARB_shader_image_load_store spec > explicitly calls out gl_SampleMaskIn[0] as a mechanism for detecting helper > pixels. So, apart from the whole implicitly causing per-sample shading, > this does appear to be valid if you're willing to take an issue in an > extension as spec text. :-) >
I guess that might be worth referencing as a hint that this might work on more than one hw, but I guess maybe it is not safe to count on as a "this lowering option is guaranteed to work on all hw" thing.. Anyways, I'll respin this patch w/ the above comment (perhaps embellished a bit) when I can think up a decent name for an intrinsic to access sample_id without implying that you need to execute the FS per-sample.. picking names is hard ;-) BR, -R > --Jason > > >> >> BR, >> -R >> >> >> > Roland >> > >> > >> > >> >> >> >> BR, >> >> -R >> >> >> >>> >> >>> Roland >> >>> >> >>> >> >>> >> >>> >> >>>> >> >>>> --Jason >> >>>> >> >>>> >> >>>> >> >>>> + * >> >>>> + * TODO any hw w/ more than 32 samples? For them (if they >> >>>> + * used this option), a bit more math would be involved. >> >>>> + */ >> >>>> + bool lower_helper_invocation; >> >>>> + >> >>>> bool lower_cs_local_index_from_id; >> >>>> >> >>>> bool lower_device_index_to_zero; >> >>>> diff --git a/src/compiler/nir/nir_lower_system_values.c >> >>>> b/src/compiler/nir/nir_lower_system_values.c >> >>>> index 487da042620..6668cbb5dcd 100644 >> >>>> --- a/src/compiler/nir/nir_lower_system_values.c >> >>>> +++ b/src/compiler/nir/nir_lower_system_values.c >> >>>> @@ -136,6 +136,23 @@ convert_block(nir_block *block, nir_builder >> >>>> *b) >> >>>> nir_load_first_vertex(b)); >> >>>> break; >> >>>> >> >>>> + case SYSTEM_VALUE_HELPER_INVOCATION: >> >>>> + if (b->shader->options->lower_helper_invocation) { >> >>>> + nir_ssa_def *tmp; >> >>>> + >> >>>> + tmp = nir_ushr(b, >> >>>> + nir_imm_int(b, 1), >> >>>> + nir_load_sample_id(b)); >> >>>> + >> >>>> + tmp = nir_iand(b, >> >>>> + nir_load_sample_mask_in(b), >> >>>> + tmp); >> >>>> + >> >>>> + sysval = nir_inot(b, nir_i2b(b, tmp)); >> >>>> + } >> >>>> + >> >>>> + break; >> >>>> + >> >>>> case SYSTEM_VALUE_INSTANCE_INDEX: >> >>>> sysval = nir_iadd(b, >> >>>> nir_load_instance_id(b), >> >>>> diff --git a/src/gallium/drivers/freedreno/ir3/ir3_nir.c >> >>>> b/src/gallium/drivers/freedreno/ir3/ir3_nir.c >> >>>> index cd1f9c526f2..341d990b269 100644 >> >>>> --- a/src/gallium/drivers/freedreno/ir3/ir3_nir.c >> >>>> +++ b/src/gallium/drivers/freedreno/ir3/ir3_nir.c >> >>>> @@ -51,6 +51,7 @@ static const nir_shader_compiler_options >> >>>> options = { >> >>>> .lower_extract_byte = true, >> >>>> .lower_extract_word = true, >> >>>> .lower_all_io_to_temps = true, >> >>>> + .lower_helper_invocation = true, >> >>>> }; >> >>>> >> >>>> struct nir_shader * >> >>>> -- >> >>>> 2.17.0 >> >>>> >> >>>> _______________________________________________ >> >>>> mesa-dev mailing list >> >>>> mesa-dev@lists.freedesktop.org >> >>>> <mailto:mesa-dev@lists.freedesktop.org> >> >>>> >> >>>> https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Fmesa-dev&data=02%7C01%7Csroland%40vmware.com%7C3de626b5c92844260bfb08d5cff17c34%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C1%7C0%7C636643558433673773&sdata=rA8s3IuEmEOR6NKdxwa5eTIwAOOhbAQWNAxW6FmiUOU%3D&reserved=0 >> >>>> >> >>>> <https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Fmesa-dev&data=02%7C01%7Csroland%40vmware.com%7Cc50ea7f234a34e635ac308d5cfeb432a%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C1%7C0%7C636643531719133589&sdata=k8EjcVGoBZyrmPfu5s5yUFZ7OerpIrUi3W3bK7qVz5o%3D&reserved=0> >> >>>> >> >>>> >> >>>> >> >>>> >> >>>> _______________________________________________ >> >>>> mesa-dev mailing list >> >>>> mesa-dev@lists.freedesktop.org >> >>>> >> >>>> https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Fmesa-dev&data=02%7C01%7Csroland%40vmware.com%7Cc50ea7f234a34e635ac308d5cfeb432a%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C1%7C0%7C636643531719133589&sdata=k8EjcVGoBZyrmPfu5s5yUFZ7OerpIrUi3W3bK7qVz5o%3D&reserved=0 >> >>>> >> >>> >> > > > _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev