Hmm, at least tgsi_text_translate() doesn't appear to use tgsi_ureg.. and there are still a number of users of tgsi_text_translate().. I guess handling this in tgsi_ureg would avoid fixing all the tgsi_ureg users, but that still leaves a lot of others. Changing them all still seems to be too intrusive to me.
(And also, I have a large collection of saved tgsi shaders that I use for standalone testing of my shader compiler and don't really like the idea of fixing up 700 or 800 tgsi shaders by hand :-P) That said, looking at the code like llvmpipe where Roland/Jose where thinking we might have problems.. by making the assumption that we never mix TEX* and SAMPLE* opc's, I think we can loosen the restriction to: for TEX* instructions, the tgsi must either *not* include SVIEW, or *must* include a matching SVIEW[idx] for every SAMP[idx] Which is a restriction that glsl_to_tgsi follows. If you follow this restriction, then for TEX* shaders which have SVIEW's, file_max[TGSI_FILE_SAMPLER_VIEW] == file_max[TGSI_FILE_SAMPLER] and file_mask[TGSI_FILE_SAMPLER_VIEW] == file_mask[TGSI_FILE_SAMPLER].. so code takes a different but equivalent path. And for TEX* shaders which don't have SVIEW's everything continues to work as before. With this approach, we don't have to fix up everything to create SVIEW[idx] for every SAMP[idx], as long as glsl_to_tgsi always creates SVIEW[idx] for each SAMP[idx], and any other tgsi generator that later adds SVIEW support for TEX* instructions follows the same pattern. So, tl;dr: I think really all I need to add to this patch is add blurb in tgsi.rst to explain this restriction and usage of SVIEW for TEX* Thoughts? BR, -R On Tue, Jun 9, 2015 at 1:20 PM, Marek Olšák <mar...@gmail.com> wrote: > If you only want to modify TGSI and not all the users, you only have > to fix tgsi_ureg. tgsi_ureg is a layer that can hide a lot of small > ugly details if needed, including sampler view declarations when the > users don't even know about them. > > Marek > > > > > > On Tue, Jun 9, 2015 at 6:01 PM, Rob Clark <robdcl...@gmail.com> wrote: >> On Tue, Jun 9, 2015 at 9:32 AM, Roland Scheidegger <srol...@vmware.com> >> wrote: >>> Am 09.06.2015 um 15:00 schrieb Rob Clark: >>>> On Tue, Jun 9, 2015 at 5:01 AM, Jose Fonseca <jfons...@vmware.com> wrote: >>>>> On 09/06/15 04:03, Rob Clark wrote: >>>>>> >>>>>> On Mon, Jun 8, 2015 at 10:50 PM, Roland Scheidegger <srol...@vmware.com> >>>>>> wrote: >>>>>>> >>>>>>> Am 09.06.2015 um 04:40 schrieb Rob Clark: >>>>>>>> >>>>>>>> On Mon, Jun 8, 2015 at 10:36 PM, Roland Scheidegger >>>>>>>> <srol...@vmware.com> >>>>>>>> wrote: >>>>>>>>> >>>>>>>>> Am 09.06.2015 um 04:20 schrieb Rob Clark: >>>>>>>>>> >>>>>>>>>> On Mon, Jun 8, 2015 at 9:40 PM, Roland Scheidegger >>>>>>>>>> <srol...@vmware.com> wrote: >>>>>>>>>>> >>>>>>>>>>> Am 09.06.2015 um 03:20 schrieb Rob Clark: >>>>>>>>>>>> >>>>>>>>>>>> On Mon, Jun 8, 2015 at 8:35 PM, Roland Scheidegger >>>>>>>>>>>> <srol...@vmware.com> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>> Am 08.06.2015 um 20:15 schrieb Rob Clark: >>>>>>>>>>>>>> >>>>>>>>>>>>>> From: Rob Clark <robcl...@freedesktop.org> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Freedreno needs sampler type information to deal with int/uint >>>>>>>>>>>>>> textures. >>>>>>>>>>>>>> To accomplish this, start creating sampler-view declarations, as >>>>>>>>>>>>>> suggested here: >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> http://lists.freedesktop.org/archives/mesa-dev/2014-November/071583.html >>>>>>>>>>>>>> >>>>>>>>>>>>>> create a sampler-view with index matching the sampler, to encode >>>>>>>>>>>>>> the >>>>>>>>>>>>>> texture type (ie. SINT/UINT/FLOAT). Ie: >>>>>>>>>>>>>> >>>>>>>>>>>>>> DCL SVIEW[n], 2D, UINT >>>>>>>>>>>>>> DCL SAMP[n] >>>>>>>>>>>>>> TEX OUT[1], IN[1], SAMP[n] >>>>>>>>>>>>>> >>>>>>>>>>>>>> For tgsi texture instructions which do not take an explicit SVIEW >>>>>>>>>>>>>> argument, the SVIEW index is implied by the SAMP index. >>>>>>>>>>>>> >>>>>>>>>>>>> I wonder if there should be some doc update somewhere. >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> yeah, perhaps.. I wasn't quite sure where, tgsi.rst only talks >>>>>>>>>>>> about >>>>>>>>>>>> SVIEW in terms of the SAMPLE and related opc's (ie. the ones which >>>>>>>>>>>> do >>>>>>>>>>>> take an SVIEW arg), and the way things are with this patch, other >>>>>>>>>>>> drivers simply need to ignore the extra SVIEW decl's and not >>>>>>>>>>>> randomly >>>>>>>>>>>> assert for things to continue working as before.. >>>>>>>>>>>> >>>>>>>>>>>> hypothetically if something in tree actually created SVIEW decl >>>>>>>>>>>> currently, and the shader also had TEX* style instructions, it >>>>>>>>>>>> would >>>>>>>>>>>> have to take care to not have conflicting SVIEW's. But afaict >>>>>>>>>>>> nothing >>>>>>>>>>>> in-tree creates sampler view decl's currently. >>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Signed-off-by: Rob Clark <robcl...@freedesktop.org> >>>>>>>>>>>>>> --- >>>>>>>>>>>>>> src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 29 >>>>>>>>>>>>>> +++++++++++++++++++++++++++++ >>>>>>>>>>>>>> 1 file changed, 29 insertions(+) >>>>>>>>>>>>>> >>>>>>>>>>>>>> diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp >>>>>>>>>>>>>> b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp >>>>>>>>>>>>>> index 0e60d95..ce8f2ea 100644 >>>>>>>>>>>>>> --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp >>>>>>>>>>>>>> +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp >>>>>>>>>>>>>> @@ -239,6 +239,7 @@ public: >>>>>>>>>>>>>> st_src_reg sampler; /**< sampler register */ >>>>>>>>>>>>>> int sampler_array_size; /**< 1-based size of sampler array, >>>>>>>>>>>>>> 1 >>>>>>>>>>>>>> if not array */ >>>>>>>>>>>>>> int tex_target; /**< One of TEXTURE_*_INDEX */ >>>>>>>>>>>>>> + glsl_base_type tex_type; >>>>>>>>>>>>>> GLboolean tex_shadow; >>>>>>>>>>>>>> >>>>>>>>>>>>>> st_src_reg tex_offsets[MAX_GLSL_TEXTURE_OFFSET]; >>>>>>>>>>>>>> @@ -345,6 +346,8 @@ public: >>>>>>>>>>>>>> >>>>>>>>>>>>>> int num_address_regs; >>>>>>>>>>>>>> int samplers_used; >>>>>>>>>>>>>> + glsl_base_type sampler_types[PIPE_MAX_SAMPLERS]; >>>>>>>>>>>>>> + int sampler_targets[PIPE_MAX_SAMPLERS]; /**< One of >>>>>>>>>>>>>> TGSI_TEXTURE_* */ >>>>>>>>>>>>>> bool indirect_addr_consts; >>>>>>>>>>>>>> int wpos_transform_const; >>>>>>>>>>>>>> >>>>>>>>>>>>>> @@ -3323,6 +3326,8 @@ glsl_to_tgsi_visitor::visit(ir_texture *ir) >>>>>>>>>>>>>> assert(!"Should not get here."); >>>>>>>>>>>>>> } >>>>>>>>>>>>>> >>>>>>>>>>>>>> + inst->tex_type = ir->type->base_type; >>>>>>>>>>>>>> + >>>>>>>>>>>>>> this->result = result_src; >>>>>>>>>>>>>> } >>>>>>>>>>>>>> >>>>>>>>>>>>>> @@ -3471,6 +3476,11 @@ count_resources(glsl_to_tgsi_visitor *v, >>>>>>>>>>>>>> gl_program *prog) >>>>>>>>>>>>>> for (int i = 0; i < inst->sampler_array_size; i++) { >>>>>>>>>>>>>> v->samplers_used |= 1 << (inst->sampler.index + i); >>>>>>>>>>>>>> >>>>>>>>>>>>>> + debug_assert(i < (int)ARRAY_SIZE(v->sampler_types)); >>>>>>>>>>>>>> + v->sampler_types[i] = inst->tex_type; >>>>>>>>>>>>>> + v->sampler_targets[i] = >>>>>>>>>>>>>> + st_translate_texture_target(inst->tex_target, >>>>>>>>>>>>>> inst->tex_shadow); >>>>>>>>>>>>>> + >>>>>>>>>>>>>> if (inst->tex_shadow) { >>>>>>>>>>>>>> prog->ShadowSamplers |= 1 << >>>>>>>>>>>>>> (inst->sampler.index >>>>>>>>>>>>>> + i); >>>>>>>>>>>>>> } >>>>>>>>>>>>>> @@ -5529,7 +5539,26 @@ st_translate_program( >>>>>>>>>>>>>> /* texture samplers */ >>>>>>>>>>>>>> for (i = 0; i < >>>>>>>>>>>>>> ctx->Const.Program[MESA_SHADER_FRAGMENT].MaxTextureImageUnits; >>>>>>>>>>>>>> i++) { >>>>>>>>>>>>>> if (program->samplers_used & (1 << i)) { >>>>>>>>>>>>>> + unsigned type; >>>>>>>>>>>>>> + >>>>>>>>>>>>>> t->samplers[i] = ureg_DECL_sampler(ureg, i); >>>>>>>>>>>>>> + >>>>>>>>>>>>>> + switch (program->sampler_types[i]) { >>>>>>>>>>>>>> + case GLSL_TYPE_INT: >>>>>>>>>>>>>> + type = TGSI_RETURN_TYPE_SINT; >>>>>>>>>>>>>> + break; >>>>>>>>>>>>>> + case GLSL_TYPE_UINT: >>>>>>>>>>>>>> + type = TGSI_RETURN_TYPE_UINT; >>>>>>>>>>>>>> + break; >>>>>>>>>>>>>> + case GLSL_TYPE_FLOAT: >>>>>>>>>>>>>> + type = TGSI_RETURN_TYPE_FLOAT; >>>>>>>>>>>>>> + break; >>>>>>>>>>>>>> + default: >>>>>>>>>>>>>> + unreachable("not reached"); >>>>>>>>>>>>>> + } >>>>>>>>>>>>>> + >>>>>>>>>>>>>> + ureg_DECL_sampler_view( ureg, i, >>>>>>>>>>>>>> program->sampler_targets[i], >>>>>>>>>>>>>> + type, type, type, type ); >>>>>>>>>>>>>> } >>>>>>>>>>>>>> } >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> This indeed seems like a consistent solution. Hopefully drivers >>>>>>>>>>>>> don't >>>>>>>>>>>>> assert when they see sampler view dcls... >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> well, tgsi_to_nir did before I fixed it :-( >>>>>>>>>>>> >>>>>>>>>>>> (mostly just because of liberal debug_assert() usage) >>>>>>>>>>>> >>>>>>>>>>>> it's at least not something I'd land right before a release branch >>>>>>>>>>>> point. But I guess the fix is easy enough (ie. remove a bogus >>>>>>>>>>>> assert), and 10.7 branch point is quite some time from now.. >>>>>>>>>>>> >>>>>>>>>>>>> I am also somewhat worried that this only changes the glsl-to-tgsi >>>>>>>>>>>>> path, >>>>>>>>>>>>> not other paths. llvmpipe and draw rely on either having sview >>>>>>>>>>>>> declarations for all (sample) instructions or for none, and with >>>>>>>>>>>>> draw >>>>>>>>>>>>> possibly inserting tex opcodes (for aalines etc.) on its own this >>>>>>>>>>>>> could >>>>>>>>>>>>> probably break (these draw paths aren't used with d3d10 sample >>>>>>>>>>>>> opcodes >>>>>>>>>>>>> as it's more or less illegal to use tex and sample opcodes in the >>>>>>>>>>>>> same >>>>>>>>>>>>> shader, but that doesn't really matter here). I think using sview >>>>>>>>>>>>> declarations is the right thing to do but it probably should be >>>>>>>>>>>>> illegal >>>>>>>>>>>>> then to NOT have them in some places. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> fwiw, the way I implemented it in tgsi_to_nir (and what I'd >>>>>>>>>>>> recommend >>>>>>>>>>>> for other drivers) is that for the "tex style" opcodes, if there is >>>>>>>>>>>> not a matching SVIEW decl, assume float. That seemed more sane >>>>>>>>>>>> than >>>>>>>>>>>> fixing up *all* the other state trackers.. >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Yeah but for llvmpipe/draw it's not just the type. The problem is >>>>>>>>>>> how >>>>>>>>>>> they are parsing this stuff - if there's at least one sview decl >>>>>>>>>>> they'll >>>>>>>>>>> construct the static texture state based on the defined sampler >>>>>>>>>>> views, >>>>>>>>>>> but if there's none (up to now for gl state tracker) they'll do this >>>>>>>>>>> based on the defined samplers, and for some there simply won't be >>>>>>>>>>> any >>>>>>>>>>> such definitions (if draw inserted some stage). llvmpipe can handle >>>>>>>>>>> tex >>>>>>>>>>> opcodes without sview dcls, it should also be able to handle them if >>>>>>>>>>> you >>>>>>>>>>> have them (but I wouldn't bet on it without testing as that's >>>>>>>>>>> definitely >>>>>>>>>>> going to hit some code paths never seen before), but it can't handle >>>>>>>>>>> having "some" sview dcls (see for example lp_state_fs.c, line 3050 >>>>>>>>>>> and >>>>>>>>>>> following why it can't work). This is fixable but only with some >>>>>>>>>>> awkwardness (since due to d3d10 sample opcodes not having 1:1 >>>>>>>>>>> mapping >>>>>>>>>>> between samplers and sampler views). Hence my proposal of making it >>>>>>>>>>> mandatory to have sview decls instead of having them optionally. >>>>>>>>>>> That >>>>>>>>>>> way could also ditch that code which distinguishes having sviews or >>>>>>>>>>> not. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> hmm, I could squash in something like: >>>>>>>>>> >>>>>>>>>> ---- >>>>>>>>>> diff --git a/src/gallium/drivers/llvmpipe/lp_state_fs.c >>>>>>>>>> b/src/gallium/drivers/llvmpipe/lp_state_fs.c >>>>>>>>>> index b5ce868..2878c49 100644 >>>>>>>>>> --- a/src/gallium/drivers/llvmpipe/lp_state_fs.c >>>>>>>>>> +++ b/src/gallium/drivers/llvmpipe/lp_state_fs.c >>>>>>>>>> @@ -3059,27 +3059,11 @@ make_variant_key(struct llvmpipe_context *lp, >>>>>>>>>> } >>>>>>>>>> } >>>>>>>>>> >>>>>>>>>> - /* >>>>>>>>>> - * XXX If TGSI_FILE_SAMPLER_VIEW exists assume all texture >>>>>>>>>> opcodes >>>>>>>>>> - * are dx10-style? Can't really have mixed opcodes, at least not >>>>>>>>>> - * if we want to skip the holes here (without rescanning tgsi). >>>>>>>>>> - */ >>>>>>>>>> - if (shader->info.base.file_max[TGSI_FILE_SAMPLER_VIEW] != -1) { >>>>>>>>>> - key->nr_sampler_views = >>>>>>>>>> shader->info.base.file_max[TGSI_FILE_SAMPLER_VIEW] + 1; >>>>>>>>>> - for(i = 0; i < key->nr_sampler_views; ++i) { >>>>>>>>>> - if(shader->info.base.file_mask[TGSI_FILE_SAMPLER_VIEW] & (1 >>>>>>>>>> << i)) { >>>>>>>>>> - >>>>>>>>>> lp_sampler_static_texture_state(&key->state[i].texture_state, >>>>>>>>>> - >>>>>>>>>> lp->sampler_views[PIPE_SHADER_FRAGMENT][i]); >>>>>>>>>> - } >>>>>>>>>> - } >>>>>>>>>> - } >>>>>>>>>> - else { >>>>>>>>>> - key->nr_sampler_views = key->nr_samplers; >>>>>>>>>> - for(i = 0; i < key->nr_sampler_views; ++i) { >>>>>>>>>> - if(shader->info.base.file_mask[TGSI_FILE_SAMPLER] & (1 << >>>>>>>>>> i)) { >>>>>>>>>> - >>>>>>>>>> lp_sampler_static_texture_state(&key->state[i].texture_state, >>>>>>>>>> - >>>>>>>>>> lp->sampler_views[PIPE_SHADER_FRAGMENT][i]); >>>>>>>>>> - } >>>>>>>>>> + key->nr_sampler_views = key->nr_samplers; >>>>>>>>>> + for(i = 0; i < key->nr_sampler_views; ++i) { >>>>>>>>>> + if(shader->info.base.file_mask[TGSI_FILE_SAMPLER] & (1 << i)) >>>>>>>>>> { >>>>>>>>>> + >>>>>>>>>> lp_sampler_static_texture_state(&key->state[i].texture_state, >>>>>>>>>> + >>>>>>>>>> lp->sampler_views[PIPE_SHADER_FRAGMENT][i]); >>>>>>>>>> } >>>>>>>>>> } >>>>>>>>>> } >>>>>>>>> >>>>>>>>> No that won't work for d3d10 state trackers, where you typically have >>>>>>>>> more >>>>>>>>> views than samplers (though the opposite is possible as well and you >>>>>>>>> can >>>>>>>>> have holes in both). I think it really would look nicer if you could >>>>>>>>> just rely on sampler views being present. >>>>>>>> >>>>>>>> >>>>>>>> what d3d10 state tracker? >>>>>>>> >>>>>>> >>>>>>> Make that "some non-public code which uses the sample opcodes"... >>>>>> >>>>>> >>>>>> well, not to be rude, but I can't really be expected to fix / >>>>>> avoid-breaking code that doesn't exist (in upstream mesa).. >>>>> >>>>> >>>>> Rob, >>>>> >>>>> Of course. Nobody expects or asked you to do such thing. >>>>> >>>>> Roland's just explaining why your proposed llvmpipe workaround is not >>>>> sustainable. Breaking our closed state tracker is OK and happens all the >>>>> time FYI. What we can't have is irreparable breakage. In short, we just >>>>> need to workout a solution that doesn't leave us with our hands tied to >>>>> fix >>>>> it. >>>>> >>>>> And in fact, Roland's suggestion of requiring all TGSI producers including >>>>> SVIEW dcl does not require fixing any closed-source state tracker but >>>>> rather >>>>> just fix TGSI generators in Mesa to put the SVIEW. >>>> >>>> Hmm, well there are a *lot* of things in mesa which generate tgsi (vl, >>>> gallium9, etc).. fixing them all, esp. when I don't necessarily know >>>> all the other state trackers well enough to know what they *should* be >>>> doing, didn't seem reasonable. >>>> >>>> I was hoping we could just leave SVIEW decl as optional to avoid >>>> having to fix up everything. It just means that something which does >>>> use SVIEW/SAMPLE* needs to avoid having SVIEW index's which conflict >>>> with SAMP decl indexes. >>>> >>>> Although maybe w/ the draw stuff, if it is potentially inserting >>>> SAMP/TEX* into shaders already using SVIEW/SAMPLE*, maybe it needs to >>>> take care to insert SAMP decl's with #'s that start above max SVIEW >>>> index. I'm not sure, I haven't used the aux/draw stuff yet, so I'm >>>> not terribly familiar with it. >>>> >>>> Not sure if it would help, but I could add an shader cap and for now >>>> only insert the SVIEW decl for drivers that want them. Which I think, >>>> at the moment, is just tgsi_to_nir (and currently the two tgsi_to_nir >>>> users, vc4 and freedreno, use any of the aux/draw stuff). >>>> >>>>> Anyway I really like your approach, appreciate your perseverance so far, >>>>> and >>>>> want to see it through. So please just bear with us a bit more. >>>>> >>>> >>>> fair enough.. I was mostly just trying to avoid having to fix massive >>>> amounts of state trackers, etc, which I am not necessarily terribly >>>> familiar with or even have a way to test. >>>> >>>>> >>>>> Roland, would it be possible to swap the >>>>> `shader->info.base.file_max[TGSI_FILE_SAMPLER_VIEW] != -1` condition with >>>>> something else? Just temporarily, so that we can decouple things, >>>>> allowing >>>>> to commit this as is, then enforce the SVIEW everywhere on a 2nd round. >>>>> >>>>> Specially because enforcing this everywhere will require some work, >>>>> sepcially on u_blit / u_mipmap_gen helpers, which need yet another key. >>>>> >>>>> >>>>> For example, what about actually using the TGSI_OPCODE_SAMPLE_* & friends >>>>> opcode count? In fact, we could go even farther, and compute the opcount >>>>> for TGSI_OPCODE_TEX & friends and ensure that they are never non-zero >>>>> simultanouesly. >>>>> >>>> >>>> hmm, possibly something I didn't understand properly.. if you never >>>> have both TEX* and SAMPLE* at same time, this sounds much easier.. >>>> Ie. just use # of opcodes instead of file_max. Could even just add a >>>> trivial thing to tgsi_scan to count up # of opcodes of each category >>>> (ie. TEX* and SAMPLE*) >>>> >>> >>> That doesn't tell you how many dcls (either samplers or sampler views) >>> you actually have. Even with GL reusing the same texture unit with a >>> different instruction is of course possible (and quite frequently done). >>> Though you are right that because both sample and tex opcodes can't be >>> used in the same shader, scanning the shader could determine if it's >>> either tex or sample and then use the appropriate shader file if we'd >>> want to keep unused views out of the shader. That may also be ok as an >>> interim measure. >> >> yeah, that was basically my idea.. not so much caring about the count, >> as much of which one is non-zero.. >> >> BR, >> -R >> >>> Roland >>> >>> _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev