On Tue, Jun 9, 2015 at 5:01 AM, Jose Fonseca <jfons...@vmware.com> wrote: > On 09/06/15 04:03, Rob Clark wrote: >> >> On Mon, Jun 8, 2015 at 10:50 PM, Roland Scheidegger <srol...@vmware.com> >> wrote: >>> >>> Am 09.06.2015 um 04:40 schrieb Rob Clark: >>>> >>>> On Mon, Jun 8, 2015 at 10:36 PM, Roland Scheidegger <srol...@vmware.com> >>>> wrote: >>>>> >>>>> Am 09.06.2015 um 04:20 schrieb Rob Clark: >>>>>> >>>>>> On Mon, Jun 8, 2015 at 9:40 PM, Roland Scheidegger >>>>>> <srol...@vmware.com> wrote: >>>>>>> >>>>>>> Am 09.06.2015 um 03:20 schrieb Rob Clark: >>>>>>>> >>>>>>>> On Mon, Jun 8, 2015 at 8:35 PM, Roland Scheidegger >>>>>>>> <srol...@vmware.com> wrote: >>>>>>>>> >>>>>>>>> Am 08.06.2015 um 20:15 schrieb Rob Clark: >>>>>>>>>> >>>>>>>>>> From: Rob Clark <robcl...@freedesktop.org> >>>>>>>>>> >>>>>>>>>> Freedreno needs sampler type information to deal with int/uint >>>>>>>>>> textures. >>>>>>>>>> To accomplish this, start creating sampler-view declarations, as >>>>>>>>>> suggested here: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> http://lists.freedesktop.org/archives/mesa-dev/2014-November/071583.html >>>>>>>>>> >>>>>>>>>> create a sampler-view with index matching the sampler, to encode >>>>>>>>>> the >>>>>>>>>> texture type (ie. SINT/UINT/FLOAT). Ie: >>>>>>>>>> >>>>>>>>>> DCL SVIEW[n], 2D, UINT >>>>>>>>>> DCL SAMP[n] >>>>>>>>>> TEX OUT[1], IN[1], SAMP[n] >>>>>>>>>> >>>>>>>>>> For tgsi texture instructions which do not take an explicit SVIEW >>>>>>>>>> argument, the SVIEW index is implied by the SAMP index. >>>>>>>>> >>>>>>>>> I wonder if there should be some doc update somewhere. >>>>>>>>> >>>>>>>> >>>>>>>> yeah, perhaps.. I wasn't quite sure where, tgsi.rst only talks about >>>>>>>> SVIEW in terms of the SAMPLE and related opc's (ie. the ones which >>>>>>>> do >>>>>>>> take an SVIEW arg), and the way things are with this patch, other >>>>>>>> drivers simply need to ignore the extra SVIEW decl's and not >>>>>>>> randomly >>>>>>>> assert for things to continue working as before.. >>>>>>>> >>>>>>>> hypothetically if something in tree actually created SVIEW decl >>>>>>>> currently, and the shader also had TEX* style instructions, it would >>>>>>>> have to take care to not have conflicting SVIEW's. But afaict >>>>>>>> nothing >>>>>>>> in-tree creates sampler view decl's currently. >>>>>>>> >>>>>>>>>> >>>>>>>>>> Signed-off-by: Rob Clark <robcl...@freedesktop.org> >>>>>>>>>> --- >>>>>>>>>> src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 29 >>>>>>>>>> +++++++++++++++++++++++++++++ >>>>>>>>>> 1 file changed, 29 insertions(+) >>>>>>>>>> >>>>>>>>>> diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp >>>>>>>>>> b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp >>>>>>>>>> index 0e60d95..ce8f2ea 100644 >>>>>>>>>> --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp >>>>>>>>>> +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp >>>>>>>>>> @@ -239,6 +239,7 @@ public: >>>>>>>>>> st_src_reg sampler; /**< sampler register */ >>>>>>>>>> int sampler_array_size; /**< 1-based size of sampler array, 1 >>>>>>>>>> if not array */ >>>>>>>>>> int tex_target; /**< One of TEXTURE_*_INDEX */ >>>>>>>>>> + glsl_base_type tex_type; >>>>>>>>>> GLboolean tex_shadow; >>>>>>>>>> >>>>>>>>>> st_src_reg tex_offsets[MAX_GLSL_TEXTURE_OFFSET]; >>>>>>>>>> @@ -345,6 +346,8 @@ public: >>>>>>>>>> >>>>>>>>>> int num_address_regs; >>>>>>>>>> int samplers_used; >>>>>>>>>> + glsl_base_type sampler_types[PIPE_MAX_SAMPLERS]; >>>>>>>>>> + int sampler_targets[PIPE_MAX_SAMPLERS]; /**< One of >>>>>>>>>> TGSI_TEXTURE_* */ >>>>>>>>>> bool indirect_addr_consts; >>>>>>>>>> int wpos_transform_const; >>>>>>>>>> >>>>>>>>>> @@ -3323,6 +3326,8 @@ glsl_to_tgsi_visitor::visit(ir_texture *ir) >>>>>>>>>> assert(!"Should not get here."); >>>>>>>>>> } >>>>>>>>>> >>>>>>>>>> + inst->tex_type = ir->type->base_type; >>>>>>>>>> + >>>>>>>>>> this->result = result_src; >>>>>>>>>> } >>>>>>>>>> >>>>>>>>>> @@ -3471,6 +3476,11 @@ count_resources(glsl_to_tgsi_visitor *v, >>>>>>>>>> gl_program *prog) >>>>>>>>>> for (int i = 0; i < inst->sampler_array_size; i++) { >>>>>>>>>> v->samplers_used |= 1 << (inst->sampler.index + i); >>>>>>>>>> >>>>>>>>>> + debug_assert(i < (int)ARRAY_SIZE(v->sampler_types)); >>>>>>>>>> + v->sampler_types[i] = inst->tex_type; >>>>>>>>>> + v->sampler_targets[i] = >>>>>>>>>> + st_translate_texture_target(inst->tex_target, >>>>>>>>>> inst->tex_shadow); >>>>>>>>>> + >>>>>>>>>> if (inst->tex_shadow) { >>>>>>>>>> prog->ShadowSamplers |= 1 << (inst->sampler.index >>>>>>>>>> + i); >>>>>>>>>> } >>>>>>>>>> @@ -5529,7 +5539,26 @@ st_translate_program( >>>>>>>>>> /* texture samplers */ >>>>>>>>>> for (i = 0; i < >>>>>>>>>> ctx->Const.Program[MESA_SHADER_FRAGMENT].MaxTextureImageUnits; i++) { >>>>>>>>>> if (program->samplers_used & (1 << i)) { >>>>>>>>>> + unsigned type; >>>>>>>>>> + >>>>>>>>>> t->samplers[i] = ureg_DECL_sampler(ureg, i); >>>>>>>>>> + >>>>>>>>>> + switch (program->sampler_types[i]) { >>>>>>>>>> + case GLSL_TYPE_INT: >>>>>>>>>> + type = TGSI_RETURN_TYPE_SINT; >>>>>>>>>> + break; >>>>>>>>>> + case GLSL_TYPE_UINT: >>>>>>>>>> + type = TGSI_RETURN_TYPE_UINT; >>>>>>>>>> + break; >>>>>>>>>> + case GLSL_TYPE_FLOAT: >>>>>>>>>> + type = TGSI_RETURN_TYPE_FLOAT; >>>>>>>>>> + break; >>>>>>>>>> + default: >>>>>>>>>> + unreachable("not reached"); >>>>>>>>>> + } >>>>>>>>>> + >>>>>>>>>> + ureg_DECL_sampler_view( ureg, i, >>>>>>>>>> program->sampler_targets[i], >>>>>>>>>> + type, type, type, type ); >>>>>>>>>> } >>>>>>>>>> } >>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>>> This indeed seems like a consistent solution. Hopefully drivers >>>>>>>>> don't >>>>>>>>> assert when they see sampler view dcls... >>>>>>>> >>>>>>>> >>>>>>>> well, tgsi_to_nir did before I fixed it :-( >>>>>>>> >>>>>>>> (mostly just because of liberal debug_assert() usage) >>>>>>>> >>>>>>>> it's at least not something I'd land right before a release branch >>>>>>>> point. But I guess the fix is easy enough (ie. remove a bogus >>>>>>>> assert), and 10.7 branch point is quite some time from now.. >>>>>>>> >>>>>>>>> I am also somewhat worried that this only changes the glsl-to-tgsi >>>>>>>>> path, >>>>>>>>> not other paths. llvmpipe and draw rely on either having sview >>>>>>>>> declarations for all (sample) instructions or for none, and with >>>>>>>>> draw >>>>>>>>> possibly inserting tex opcodes (for aalines etc.) on its own this >>>>>>>>> could >>>>>>>>> probably break (these draw paths aren't used with d3d10 sample >>>>>>>>> opcodes >>>>>>>>> as it's more or less illegal to use tex and sample opcodes in the >>>>>>>>> same >>>>>>>>> shader, but that doesn't really matter here). I think using sview >>>>>>>>> declarations is the right thing to do but it probably should be >>>>>>>>> illegal >>>>>>>>> then to NOT have them in some places. >>>>>>>> >>>>>>>> >>>>>>>> fwiw, the way I implemented it in tgsi_to_nir (and what I'd >>>>>>>> recommend >>>>>>>> for other drivers) is that for the "tex style" opcodes, if there is >>>>>>>> not a matching SVIEW decl, assume float. That seemed more sane than >>>>>>>> fixing up *all* the other state trackers.. >>>>>>>> >>>>>>> >>>>>>> Yeah but for llvmpipe/draw it's not just the type. The problem is how >>>>>>> they are parsing this stuff - if there's at least one sview decl >>>>>>> they'll >>>>>>> construct the static texture state based on the defined sampler >>>>>>> views, >>>>>>> but if there's none (up to now for gl state tracker) they'll do this >>>>>>> based on the defined samplers, and for some there simply won't be any >>>>>>> such definitions (if draw inserted some stage). llvmpipe can handle >>>>>>> tex >>>>>>> opcodes without sview dcls, it should also be able to handle them if >>>>>>> you >>>>>>> have them (but I wouldn't bet on it without testing as that's >>>>>>> definitely >>>>>>> going to hit some code paths never seen before), but it can't handle >>>>>>> having "some" sview dcls (see for example lp_state_fs.c, line 3050 >>>>>>> and >>>>>>> following why it can't work). This is fixable but only with some >>>>>>> awkwardness (since due to d3d10 sample opcodes not having 1:1 mapping >>>>>>> between samplers and sampler views). Hence my proposal of making it >>>>>>> mandatory to have sview decls instead of having them optionally. That >>>>>>> way could also ditch that code which distinguishes having sviews or >>>>>>> not. >>>>>> >>>>>> >>>>>> hmm, I could squash in something like: >>>>>> >>>>>> ---- >>>>>> diff --git a/src/gallium/drivers/llvmpipe/lp_state_fs.c >>>>>> b/src/gallium/drivers/llvmpipe/lp_state_fs.c >>>>>> index b5ce868..2878c49 100644 >>>>>> --- a/src/gallium/drivers/llvmpipe/lp_state_fs.c >>>>>> +++ b/src/gallium/drivers/llvmpipe/lp_state_fs.c >>>>>> @@ -3059,27 +3059,11 @@ make_variant_key(struct llvmpipe_context *lp, >>>>>> } >>>>>> } >>>>>> >>>>>> - /* >>>>>> - * XXX If TGSI_FILE_SAMPLER_VIEW exists assume all texture opcodes >>>>>> - * are dx10-style? Can't really have mixed opcodes, at least not >>>>>> - * if we want to skip the holes here (without rescanning tgsi). >>>>>> - */ >>>>>> - if (shader->info.base.file_max[TGSI_FILE_SAMPLER_VIEW] != -1) { >>>>>> - key->nr_sampler_views = >>>>>> shader->info.base.file_max[TGSI_FILE_SAMPLER_VIEW] + 1; >>>>>> - for(i = 0; i < key->nr_sampler_views; ++i) { >>>>>> - if(shader->info.base.file_mask[TGSI_FILE_SAMPLER_VIEW] & (1 >>>>>> << i)) { >>>>>> - >>>>>> lp_sampler_static_texture_state(&key->state[i].texture_state, >>>>>> - >>>>>> lp->sampler_views[PIPE_SHADER_FRAGMENT][i]); >>>>>> - } >>>>>> - } >>>>>> - } >>>>>> - else { >>>>>> - key->nr_sampler_views = key->nr_samplers; >>>>>> - for(i = 0; i < key->nr_sampler_views; ++i) { >>>>>> - if(shader->info.base.file_mask[TGSI_FILE_SAMPLER] & (1 << >>>>>> i)) { >>>>>> - >>>>>> lp_sampler_static_texture_state(&key->state[i].texture_state, >>>>>> - >>>>>> lp->sampler_views[PIPE_SHADER_FRAGMENT][i]); >>>>>> - } >>>>>> + key->nr_sampler_views = key->nr_samplers; >>>>>> + for(i = 0; i < key->nr_sampler_views; ++i) { >>>>>> + if(shader->info.base.file_mask[TGSI_FILE_SAMPLER] & (1 << i)) { >>>>>> + >>>>>> lp_sampler_static_texture_state(&key->state[i].texture_state, >>>>>> + >>>>>> lp->sampler_views[PIPE_SHADER_FRAGMENT][i]); >>>>>> } >>>>>> } >>>>>> } >>>>> >>>>> No that won't work for d3d10 state trackers, where you typically have >>>>> more >>>>> views than samplers (though the opposite is possible as well and you >>>>> can >>>>> have holes in both). I think it really would look nicer if you could >>>>> just rely on sampler views being present. >>>> >>>> >>>> what d3d10 state tracker? >>>> >>> >>> Make that "some non-public code which uses the sample opcodes"... >> >> >> well, not to be rude, but I can't really be expected to fix / >> avoid-breaking code that doesn't exist (in upstream mesa).. > > > Rob, > > Of course. Nobody expects or asked you to do such thing. > > Roland's just explaining why your proposed llvmpipe workaround is not > sustainable. Breaking our closed state tracker is OK and happens all the > time FYI. What we can't have is irreparable breakage. In short, we just > need to workout a solution that doesn't leave us with our hands tied to fix > it. > > And in fact, Roland's suggestion of requiring all TGSI producers including > SVIEW dcl does not require fixing any closed-source state tracker but rather > just fix TGSI generators in Mesa to put the SVIEW.
Hmm, well there are a *lot* of things in mesa which generate tgsi (vl, gallium9, etc).. fixing them all, esp. when I don't necessarily know all the other state trackers well enough to know what they *should* be doing, didn't seem reasonable. I was hoping we could just leave SVIEW decl as optional to avoid having to fix up everything. It just means that something which does use SVIEW/SAMPLE* needs to avoid having SVIEW index's which conflict with SAMP decl indexes. Although maybe w/ the draw stuff, if it is potentially inserting SAMP/TEX* into shaders already using SVIEW/SAMPLE*, maybe it needs to take care to insert SAMP decl's with #'s that start above max SVIEW index. I'm not sure, I haven't used the aux/draw stuff yet, so I'm not terribly familiar with it. Not sure if it would help, but I could add an shader cap and for now only insert the SVIEW decl for drivers that want them. Which I think, at the moment, is just tgsi_to_nir (and currently the two tgsi_to_nir users, vc4 and freedreno, use any of the aux/draw stuff). > Anyway I really like your approach, appreciate your perseverance so far, and > want to see it through. So please just bear with us a bit more. > fair enough.. I was mostly just trying to avoid having to fix massive amounts of state trackers, etc, which I am not necessarily terribly familiar with or even have a way to test. > > Roland, would it be possible to swap the > `shader->info.base.file_max[TGSI_FILE_SAMPLER_VIEW] != -1` condition with > something else? Just temporarily, so that we can decouple things, allowing > to commit this as is, then enforce the SVIEW everywhere on a 2nd round. > > Specially because enforcing this everywhere will require some work, > sepcially on u_blit / u_mipmap_gen helpers, which need yet another key. > > > For example, what about actually using the TGSI_OPCODE_SAMPLE_* & friends > opcode count? In fact, we could go even farther, and compute the opcount > for TGSI_OPCODE_TEX & friends and ensure that they are never non-zero > simultanouesly. > hmm, possibly something I didn't understand properly.. if you never have both TEX* and SAMPLE* at same time, this sounds much easier.. Ie. just use # of opcodes instead of file_max. Could even just add a trivial thing to tgsi_scan to count up # of opcodes of each category (ie. TEX* and SAMPLE*) BR, -R > > Jose > _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev