Am 10.06.2015 um 23:31 schrieb Rob Clark: > On Wed, Jun 10, 2015 at 5:13 PM, Jose Fonseca <jfons...@vmware.com> wrote: >> I think it make sense for us to start using SVIEW regardless, and uniformize >> things. >> >> >> Even if GLSL will never support independent texture/samplers, D3D10, OpenCL, >> Metal, and potential SPIR-V all do. >> >> Roland, could you prepare a patch for llvmpipe, so that it infers from >> SAMPLE_* opcode count instead of SVIEW for, and therefore unblock Rob? >> > > fwiw, I've got the three tgsi_transform_shader() cases which insert a > SAMP updated to insert also an SVIEW if the shader is already using > SVIEW's.. this should be enough to keep llvmpipe happy without > changes. I'll send this plus tgsi.rst update shortly. Sounds like a plan.
> Going forward, I will have to make u_blitter keep track of > int/unsigned/float variants (which it seems to do already for msaa > resolve but not other cases), and similar treatment for some freedreno > internally generated shaders (used to restore the tile buffer). > >> >> BTW, even if you avoid intermediate TGSI on GLSL -> NIR, don't you still >> need to handle TGSI generated by the state tracker? (For things like blits, >> and mipmap generation, identity shaders, clear shaders, etc) >> > > yeah.. rough idea was add a 'const void *preferred_ir' to > pipe_shader_state, which could be NIR or whatever based on > PIPE_SHADER_CAP_PREFERRED_IR. Or it could be NULL in which case you > get tgsi tokens ptr instead and have to call tgsi_to_nir yourself. I > didn't see an easy way around that, given u_blitter, other state > trackers, etc. > > The biggest reason that I expect to want to go straight glsl->nir in > the future is when nir gains mediump support, since that can be a bit > perf win on mobile (gles oriented) hw, and it didn't seem like fun to > have to plumb that through tgsi and tgsi_to_nir.. Yes the untyped register file seems to start to look a bit limiting... That said, d3d11.1 also supports "true" lower precision values (half floats were supported in hlsl before but in the low-level assembly that wasn't visible at all, like all the other types which were just using untyped 32bit wide registers). I have no idea though how this is actually represented in the assembly, but this shows it should be possible to do in some hopefully sane way. Roland > > BR, > -R > >> >> Jose >> >> >> >> On 10/06/15 21:07, Rob Clark wrote: >>> >>> that is starting to look more attractive, mostly just because >>> tgsi_transform stuff is so cumbersome.. >>> >>> (I did start thinking about just adding type to decl's in general, >>> since really it would be better to have for type information for IN's >>> and OUT's too.. but then decided I'd probably rather spend my time on >>> support in mesa st to go straight from glsl to nir and bypass tgsi, >>> rather than going down that rabbit hole) >>> >>> BR, >>> -R >>> >>> On Wed, Jun 10, 2015 at 3:51 PM, Marek Olšák <mar...@gmail.com> wrote: >>>> >>>> There is also the option of adding the sampler type to either the SAMP >>>> declaration or texture instructions. This will move us further away >>>> from adopting SVIEW, but I don't see that happening for OpenGL anyway. >>>> >>>> Marek >>>> >>>> On Wed, Jun 10, 2015 at 8:59 PM, Rob Clark <robdcl...@gmail.com> wrote: >>>>> >>>>> So, afaiu, everything that might insert a sampler is using >>>>> tgsi_transform_shader()? There aren't too many of those, and I think >>>>> I can fix them up to not violate that constraint. >>>>> >>>>> (It does occur to me that I may end up needing to fix u_blitter to >>>>> differentiate between blitting float vs int to avoid some regressions >>>>> in freedreno, since I'd no longer be using shader variants based on >>>>> bound samplers.. but I guess that is unrelated and a separate patch) >>>>> >>>>> BR, >>>>> -R >>>>> >>>>> On Wed, Jun 10, 2015 at 2:55 PM, Roland Scheidegger <srol...@vmware.com> >>>>> wrote: >>>>>> >>>>>> My biggest problem with that is the initial case I saw as a problem: >>>>>> draw is going to modify these shaders in some cases (aaline stage for >>>>>> example), adding its own sampler, and it doesn't know anything about >>>>>> distinguishing shaders with sampler views or without. >>>>>> The same goes for any other potential code which may modify shaders >>>>>> similarly - needs to be modified not just to always use sampler views >>>>>> but use them based on if the incoming shader already uses them or not. >>>>>> Which conceptually looks worse to me. But otherwise I agree this should >>>>>> work. >>>>>> >>>>>> Roland >>>>>> >>>>>> >>>>>> Am 10.06.2015 um 20:30 schrieb Rob Clark: >>>>>>> >>>>>>> Hmm, at least tgsi_text_translate() doesn't appear to use tgsi_ureg.. >>>>>>> and there are still a number of users of tgsi_text_translate().. I >>>>>>> guess handling this in tgsi_ureg would avoid fixing all the tgsi_ureg >>>>>>> users, but that still leaves a lot of others. Changing them all still >>>>>>> seems to be too intrusive to me. >>>>>>> >>>>>>> (And also, I have a large collection of saved tgsi shaders that I use >>>>>>> for standalone testing of my shader compiler and don't really like the >>>>>>> idea of fixing up 700 or 800 tgsi shaders by hand :-P) >>>>>>> >>>>>>> That said, looking at the code like llvmpipe where Roland/Jose where >>>>>>> thinking we might have problems.. by making the assumption that we >>>>>>> never mix TEX* and SAMPLE* opc's, I think we can loosen the >>>>>>> restriction to: >>>>>>> >>>>>>> for TEX* instructions, the tgsi must either *not* include SVIEW, or >>>>>>> *must* include a matching SVIEW[idx] for every SAMP[idx] >>>>>>> >>>>>>> Which is a restriction that glsl_to_tgsi follows. >>>>>>> >>>>>>> If you follow this restriction, then for TEX* shaders which have >>>>>>> SVIEW's, file_max[TGSI_FILE_SAMPLER_VIEW] == >>>>>>> file_max[TGSI_FILE_SAMPLER] and file_mask[TGSI_FILE_SAMPLER_VIEW] == >>>>>>> file_mask[TGSI_FILE_SAMPLER].. so code takes a different but >>>>>>> equivalent path. And for TEX* shaders which don't have SVIEW's >>>>>>> everything continues to work as before. >>>>>>> >>>>>>> With this approach, we don't have to fix up everything to create >>>>>>> SVIEW[idx] for every SAMP[idx], as long as glsl_to_tgsi always creates >>>>>>> SVIEW[idx] for each SAMP[idx], and any other tgsi generator that later >>>>>>> adds SVIEW support for TEX* instructions follows the same pattern. >>>>>>> >>>>>>> So, tl;dr: I think really all I need to add to this patch is add blurb >>>>>>> in tgsi.rst to explain this restriction and usage of SVIEW for TEX* >>>>>>> >>>>>>> Thoughts? >>>>>>> >>>>>>> BR, >>>>>>> -R >>>>>>> >>>>>>> On Tue, Jun 9, 2015 at 1:20 PM, Marek Olšák <mar...@gmail.com> wrote: >>>>>>>> >>>>>>>> If you only want to modify TGSI and not all the users, you only have >>>>>>>> to fix tgsi_ureg. tgsi_ureg is a layer that can hide a lot of small >>>>>>>> ugly details if needed, including sampler view declarations when the >>>>>>>> users don't even know about them. >>>>>>>> >>>>>>>> Marek >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On Tue, Jun 9, 2015 at 6:01 PM, Rob Clark <robdcl...@gmail.com> >>>>>>>> wrote: >>>>>>>>> >>>>>>>>> On Tue, Jun 9, 2015 at 9:32 AM, Roland Scheidegger >>>>>>>>> <srol...@vmware.com> wrote: >>>>>>>>>> >>>>>>>>>> Am 09.06.2015 um 15:00 schrieb Rob Clark: >>>>>>>>>>> >>>>>>>>>>> On Tue, Jun 9, 2015 at 5:01 AM, Jose Fonseca <jfons...@vmware.com> >>>>>>>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>> On 09/06/15 04:03, Rob Clark wrote: >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On Mon, Jun 8, 2015 at 10:50 PM, Roland Scheidegger >>>>>>>>>>>>> <srol...@vmware.com> >>>>>>>>>>>>> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Am 09.06.2015 um 04:40 schrieb Rob Clark: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Mon, Jun 8, 2015 at 10:36 PM, Roland Scheidegger >>>>>>>>>>>>>>> <srol...@vmware.com> >>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Am 09.06.2015 um 04:20 schrieb Rob Clark: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On Mon, Jun 8, 2015 at 9:40 PM, Roland Scheidegger >>>>>>>>>>>>>>>>> <srol...@vmware.com> wrote: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Am 09.06.2015 um 03:20 schrieb Rob Clark: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On Mon, Jun 8, 2015 at 8:35 PM, Roland Scheidegger >>>>>>>>>>>>>>>>>>> <srol...@vmware.com> wrote: >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Am 08.06.2015 um 20:15 schrieb Rob Clark: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> From: Rob Clark <robcl...@freedesktop.org> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Freedreno needs sampler type information to deal with >>>>>>>>>>>>>>>>>>>>> int/uint >>>>>>>>>>>>>>>>>>>>> textures. >>>>>>>>>>>>>>>>>>>>> To accomplish this, start creating sampler-view >>>>>>>>>>>>>>>>>>>>> declarations, as >>>>>>>>>>>>>>>>>>>>> suggested here: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> http://lists.freedesktop.org/archives/mesa-dev/2014-November/071583.html >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> create a sampler-view with index matching the sampler, >>>>>>>>>>>>>>>>>>>>> to encode >>>>>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>> texture type (ie. SINT/UINT/FLOAT). Ie: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> DCL SVIEW[n], 2D, UINT >>>>>>>>>>>>>>>>>>>>> DCL SAMP[n] >>>>>>>>>>>>>>>>>>>>> TEX OUT[1], IN[1], SAMP[n] >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> For tgsi texture instructions which do not take an >>>>>>>>>>>>>>>>>>>>> explicit SVIEW >>>>>>>>>>>>>>>>>>>>> argument, the SVIEW index is implied by the SAMP index. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> I wonder if there should be some doc update somewhere. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> yeah, perhaps.. I wasn't quite sure where, tgsi.rst only >>>>>>>>>>>>>>>>>>> talks about >>>>>>>>>>>>>>>>>>> SVIEW in terms of the SAMPLE and related opc's (ie. the >>>>>>>>>>>>>>>>>>> ones which >>>>>>>>>>>>>>>>>>> do >>>>>>>>>>>>>>>>>>> take an SVIEW arg), and the way things are with this >>>>>>>>>>>>>>>>>>> patch, other >>>>>>>>>>>>>>>>>>> drivers simply need to ignore the extra SVIEW decl's and >>>>>>>>>>>>>>>>>>> not >>>>>>>>>>>>>>>>>>> randomly >>>>>>>>>>>>>>>>>>> assert for things to continue working as before.. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> hypothetically if something in tree actually created SVIEW >>>>>>>>>>>>>>>>>>> decl >>>>>>>>>>>>>>>>>>> currently, and the shader also had TEX* style >>>>>>>>>>>>>>>>>>> instructions, it would >>>>>>>>>>>>>>>>>>> have to take care to not have conflicting SVIEW's. But >>>>>>>>>>>>>>>>>>> afaict >>>>>>>>>>>>>>>>>>> nothing >>>>>>>>>>>>>>>>>>> in-tree creates sampler view decl's currently. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Signed-off-by: Rob Clark <robcl...@freedesktop.org> >>>>>>>>>>>>>>>>>>>>> --- >>>>>>>>>>>>>>>>>>>>> src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 29 >>>>>>>>>>>>>>>>>>>>> +++++++++++++++++++++++++++++ >>>>>>>>>>>>>>>>>>>>> 1 file changed, 29 insertions(+) >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp >>>>>>>>>>>>>>>>>>>>> b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp >>>>>>>>>>>>>>>>>>>>> index 0e60d95..ce8f2ea 100644 >>>>>>>>>>>>>>>>>>>>> --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp >>>>>>>>>>>>>>>>>>>>> +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp >>>>>>>>>>>>>>>>>>>>> @@ -239,6 +239,7 @@ public: >>>>>>>>>>>>>>>>>>>>> st_src_reg sampler; /**< sampler register */ >>>>>>>>>>>>>>>>>>>>> int sampler_array_size; /**< 1-based size of >>>>>>>>>>>>>>>>>>>>> sampler array, 1 >>>>>>>>>>>>>>>>>>>>> if not array */ >>>>>>>>>>>>>>>>>>>>> int tex_target; /**< One of TEXTURE_*_INDEX */ >>>>>>>>>>>>>>>>>>>>> + glsl_base_type tex_type; >>>>>>>>>>>>>>>>>>>>> GLboolean tex_shadow; >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> st_src_reg tex_offsets[MAX_GLSL_TEXTURE_OFFSET]; >>>>>>>>>>>>>>>>>>>>> @@ -345,6 +346,8 @@ public: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> int num_address_regs; >>>>>>>>>>>>>>>>>>>>> int samplers_used; >>>>>>>>>>>>>>>>>>>>> + glsl_base_type sampler_types[PIPE_MAX_SAMPLERS]; >>>>>>>>>>>>>>>>>>>>> + int sampler_targets[PIPE_MAX_SAMPLERS]; /**< One >>>>>>>>>>>>>>>>>>>>> of >>>>>>>>>>>>>>>>>>>>> TGSI_TEXTURE_* */ >>>>>>>>>>>>>>>>>>>>> bool indirect_addr_consts; >>>>>>>>>>>>>>>>>>>>> int wpos_transform_const; >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> @@ -3323,6 +3326,8 @@ >>>>>>>>>>>>>>>>>>>>> glsl_to_tgsi_visitor::visit(ir_texture *ir) >>>>>>>>>>>>>>>>>>>>> assert(!"Should not get here."); >>>>>>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> + inst->tex_type = ir->type->base_type; >>>>>>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>>>>>> this->result = result_src; >>>>>>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> @@ -3471,6 +3476,11 @@ >>>>>>>>>>>>>>>>>>>>> count_resources(glsl_to_tgsi_visitor *v, >>>>>>>>>>>>>>>>>>>>> gl_program *prog) >>>>>>>>>>>>>>>>>>>>> for (int i = 0; i < >>>>>>>>>>>>>>>>>>>>> inst->sampler_array_size; i++) { >>>>>>>>>>>>>>>>>>>>> v->samplers_used |= 1 << >>>>>>>>>>>>>>>>>>>>> (inst->sampler.index + i); >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> + debug_assert(i < >>>>>>>>>>>>>>>>>>>>> (int)ARRAY_SIZE(v->sampler_types)); >>>>>>>>>>>>>>>>>>>>> + v->sampler_types[i] = inst->tex_type; >>>>>>>>>>>>>>>>>>>>> + v->sampler_targets[i] = >>>>>>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>>>>>> st_translate_texture_target(inst->tex_target, >>>>>>>>>>>>>>>>>>>>> inst->tex_shadow); >>>>>>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>>>>>> if (inst->tex_shadow) { >>>>>>>>>>>>>>>>>>>>> prog->ShadowSamplers |= 1 << >>>>>>>>>>>>>>>>>>>>> (inst->sampler.index >>>>>>>>>>>>>>>>>>>>> + i); >>>>>>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>>>>>> @@ -5529,7 +5539,26 @@ st_translate_program( >>>>>>>>>>>>>>>>>>>>> /* texture samplers */ >>>>>>>>>>>>>>>>>>>>> for (i = 0; i < >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> ctx->Const.Program[MESA_SHADER_FRAGMENT].MaxTextureImageUnits; >>>>>>>>>>>>>>>>>>>>> i++) { >>>>>>>>>>>>>>>>>>>>> if (program->samplers_used & (1 << i)) { >>>>>>>>>>>>>>>>>>>>> + unsigned type; >>>>>>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>>>>>> t->samplers[i] = ureg_DECL_sampler(ureg, i); >>>>>>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>>>>>> + switch (program->sampler_types[i]) { >>>>>>>>>>>>>>>>>>>>> + case GLSL_TYPE_INT: >>>>>>>>>>>>>>>>>>>>> + type = TGSI_RETURN_TYPE_SINT; >>>>>>>>>>>>>>>>>>>>> + break; >>>>>>>>>>>>>>>>>>>>> + case GLSL_TYPE_UINT: >>>>>>>>>>>>>>>>>>>>> + type = TGSI_RETURN_TYPE_UINT; >>>>>>>>>>>>>>>>>>>>> + break; >>>>>>>>>>>>>>>>>>>>> + case GLSL_TYPE_FLOAT: >>>>>>>>>>>>>>>>>>>>> + type = TGSI_RETURN_TYPE_FLOAT; >>>>>>>>>>>>>>>>>>>>> + break; >>>>>>>>>>>>>>>>>>>>> + default: >>>>>>>>>>>>>>>>>>>>> + unreachable("not reached"); >>>>>>>>>>>>>>>>>>>>> + } >>>>>>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>>>>>> + ureg_DECL_sampler_view( ureg, i, >>>>>>>>>>>>>>>>>>>>> program->sampler_targets[i], >>>>>>>>>>>>>>>>>>>>> + type, type, type, type >>>>>>>>>>>>>>>>>>>>> ); >>>>>>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> This indeed seems like a consistent solution. Hopefully >>>>>>>>>>>>>>>>>>>> drivers >>>>>>>>>>>>>>>>>>>> don't >>>>>>>>>>>>>>>>>>>> assert when they see sampler view dcls... >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> well, tgsi_to_nir did before I fixed it :-( >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> (mostly just because of liberal debug_assert() usage) >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> it's at least not something I'd land right before a >>>>>>>>>>>>>>>>>>> release branch >>>>>>>>>>>>>>>>>>> point. But I guess the fix is easy enough (ie. remove a >>>>>>>>>>>>>>>>>>> bogus >>>>>>>>>>>>>>>>>>> assert), and 10.7 branch point is quite some time from >>>>>>>>>>>>>>>>>>> now.. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> I am also somewhat worried that this only changes the >>>>>>>>>>>>>>>>>>>> glsl-to-tgsi >>>>>>>>>>>>>>>>>>>> path, >>>>>>>>>>>>>>>>>>>> not other paths. llvmpipe and draw rely on either having >>>>>>>>>>>>>>>>>>>> sview >>>>>>>>>>>>>>>>>>>> declarations for all (sample) instructions or for none, >>>>>>>>>>>>>>>>>>>> and with >>>>>>>>>>>>>>>>>>>> draw >>>>>>>>>>>>>>>>>>>> possibly inserting tex opcodes (for aalines etc.) on its >>>>>>>>>>>>>>>>>>>> own this >>>>>>>>>>>>>>>>>>>> could >>>>>>>>>>>>>>>>>>>> probably break (these draw paths aren't used with d3d10 >>>>>>>>>>>>>>>>>>>> sample >>>>>>>>>>>>>>>>>>>> opcodes >>>>>>>>>>>>>>>>>>>> as it's more or less illegal to use tex and sample >>>>>>>>>>>>>>>>>>>> opcodes in the >>>>>>>>>>>>>>>>>>>> same >>>>>>>>>>>>>>>>>>>> shader, but that doesn't really matter here). I think >>>>>>>>>>>>>>>>>>>> using sview >>>>>>>>>>>>>>>>>>>> declarations is the right thing to do but it probably >>>>>>>>>>>>>>>>>>>> should be >>>>>>>>>>>>>>>>>>>> illegal >>>>>>>>>>>>>>>>>>>> then to NOT have them in some places. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> fwiw, the way I implemented it in tgsi_to_nir (and what >>>>>>>>>>>>>>>>>>> I'd >>>>>>>>>>>>>>>>>>> recommend >>>>>>>>>>>>>>>>>>> for other drivers) is that for the "tex style" opcodes, if >>>>>>>>>>>>>>>>>>> there is >>>>>>>>>>>>>>>>>>> not a matching SVIEW decl, assume float. That seemed more >>>>>>>>>>>>>>>>>>> sane than >>>>>>>>>>>>>>>>>>> fixing up *all* the other state trackers.. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Yeah but for llvmpipe/draw it's not just the type. The >>>>>>>>>>>>>>>>>> problem is how >>>>>>>>>>>>>>>>>> they are parsing this stuff - if there's at least one sview >>>>>>>>>>>>>>>>>> decl >>>>>>>>>>>>>>>>>> they'll >>>>>>>>>>>>>>>>>> construct the static texture state based on the defined >>>>>>>>>>>>>>>>>> sampler >>>>>>>>>>>>>>>>>> views, >>>>>>>>>>>>>>>>>> but if there's none (up to now for gl state tracker) >>>>>>>>>>>>>>>>>> they'll do this >>>>>>>>>>>>>>>>>> based on the defined samplers, and for some there simply >>>>>>>>>>>>>>>>>> won't be any >>>>>>>>>>>>>>>>>> such definitions (if draw inserted some stage). llvmpipe >>>>>>>>>>>>>>>>>> can handle >>>>>>>>>>>>>>>>>> tex >>>>>>>>>>>>>>>>>> opcodes without sview dcls, it should also be able to >>>>>>>>>>>>>>>>>> handle them if >>>>>>>>>>>>>>>>>> you >>>>>>>>>>>>>>>>>> have them (but I wouldn't bet on it without testing as >>>>>>>>>>>>>>>>>> that's >>>>>>>>>>>>>>>>>> definitely >>>>>>>>>>>>>>>>>> going to hit some code paths never seen before), but it >>>>>>>>>>>>>>>>>> can't handle >>>>>>>>>>>>>>>>>> having "some" sview dcls (see for example lp_state_fs.c, >>>>>>>>>>>>>>>>>> line 3050 >>>>>>>>>>>>>>>>>> and >>>>>>>>>>>>>>>>>> following why it can't work). This is fixable but only with >>>>>>>>>>>>>>>>>> some >>>>>>>>>>>>>>>>>> awkwardness (since due to d3d10 sample opcodes not having >>>>>>>>>>>>>>>>>> 1:1 mapping >>>>>>>>>>>>>>>>>> between samplers and sampler views). Hence my proposal of >>>>>>>>>>>>>>>>>> making it >>>>>>>>>>>>>>>>>> mandatory to have sview decls instead of having them >>>>>>>>>>>>>>>>>> optionally. That >>>>>>>>>>>>>>>>>> way could also ditch that code which distinguishes having >>>>>>>>>>>>>>>>>> sviews or >>>>>>>>>>>>>>>>>> not. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> hmm, I could squash in something like: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> ---- >>>>>>>>>>>>>>>>> diff --git a/src/gallium/drivers/llvmpipe/lp_state_fs.c >>>>>>>>>>>>>>>>> b/src/gallium/drivers/llvmpipe/lp_state_fs.c >>>>>>>>>>>>>>>>> index b5ce868..2878c49 100644 >>>>>>>>>>>>>>>>> --- a/src/gallium/drivers/llvmpipe/lp_state_fs.c >>>>>>>>>>>>>>>>> +++ b/src/gallium/drivers/llvmpipe/lp_state_fs.c >>>>>>>>>>>>>>>>> @@ -3059,27 +3059,11 @@ make_variant_key(struct >>>>>>>>>>>>>>>>> llvmpipe_context *lp, >>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> - /* >>>>>>>>>>>>>>>>> - * XXX If TGSI_FILE_SAMPLER_VIEW exists assume all >>>>>>>>>>>>>>>>> texture opcodes >>>>>>>>>>>>>>>>> - * are dx10-style? Can't really have mixed opcodes, at >>>>>>>>>>>>>>>>> least not >>>>>>>>>>>>>>>>> - * if we want to skip the holes here (without rescanning >>>>>>>>>>>>>>>>> tgsi). >>>>>>>>>>>>>>>>> - */ >>>>>>>>>>>>>>>>> - if (shader->info.base.file_max[TGSI_FILE_SAMPLER_VIEW] >>>>>>>>>>>>>>>>> != -1) { >>>>>>>>>>>>>>>>> - key->nr_sampler_views = >>>>>>>>>>>>>>>>> shader->info.base.file_max[TGSI_FILE_SAMPLER_VIEW] + 1; >>>>>>>>>>>>>>>>> - for(i = 0; i < key->nr_sampler_views; ++i) { >>>>>>>>>>>>>>>>> - >>>>>>>>>>>>>>>>> if(shader->info.base.file_mask[TGSI_FILE_SAMPLER_VIEW] & (1 >>>>>>>>>>>>>>>>> << i)) { >>>>>>>>>>>>>>>>> - >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> lp_sampler_static_texture_state(&key->state[i].texture_state, >>>>>>>>>>>>>>>>> - >>>>>>>>>>>>>>>>> lp->sampler_views[PIPE_SHADER_FRAGMENT][i]); >>>>>>>>>>>>>>>>> - } >>>>>>>>>>>>>>>>> - } >>>>>>>>>>>>>>>>> - } >>>>>>>>>>>>>>>>> - else { >>>>>>>>>>>>>>>>> - key->nr_sampler_views = key->nr_samplers; >>>>>>>>>>>>>>>>> - for(i = 0; i < key->nr_sampler_views; ++i) { >>>>>>>>>>>>>>>>> - if(shader->info.base.file_mask[TGSI_FILE_SAMPLER] >>>>>>>>>>>>>>>>> & (1 << >>>>>>>>>>>>>>>>> i)) { >>>>>>>>>>>>>>>>> - >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> lp_sampler_static_texture_state(&key->state[i].texture_state, >>>>>>>>>>>>>>>>> - >>>>>>>>>>>>>>>>> lp->sampler_views[PIPE_SHADER_FRAGMENT][i]); >>>>>>>>>>>>>>>>> - } >>>>>>>>>>>>>>>>> + key->nr_sampler_views = key->nr_samplers; >>>>>>>>>>>>>>>>> + for(i = 0; i < key->nr_sampler_views; ++i) { >>>>>>>>>>>>>>>>> + if(shader->info.base.file_mask[TGSI_FILE_SAMPLER] & >>>>>>>>>>>>>>>>> (1 << i)) { >>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> lp_sampler_static_texture_state(&key->state[i].texture_state, >>>>>>>>>>>>>>>>> + >>>>>>>>>>>>>>>>> lp->sampler_views[PIPE_SHADER_FRAGMENT][i]); >>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> No that won't work for d3d10 state trackers, where you >>>>>>>>>>>>>>>> typically have >>>>>>>>>>>>>>>> more >>>>>>>>>>>>>>>> views than samplers (though the opposite is possible as well >>>>>>>>>>>>>>>> and you >>>>>>>>>>>>>>>> can >>>>>>>>>>>>>>>> have holes in both). I think it really would look nicer if >>>>>>>>>>>>>>>> you could >>>>>>>>>>>>>>>> just rely on sampler views being present. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> what d3d10 state tracker? >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Make that "some non-public code which uses the sample >>>>>>>>>>>>>> opcodes"... >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> well, not to be rude, but I can't really be expected to fix / >>>>>>>>>>>>> avoid-breaking code that doesn't exist (in upstream mesa).. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Rob, >>>>>>>>>>>> >>>>>>>>>>>> Of course. Nobody expects or asked you to do such thing. >>>>>>>>>>>> >>>>>>>>>>>> Roland's just explaining why your proposed llvmpipe workaround is >>>>>>>>>>>> not >>>>>>>>>>>> sustainable. Breaking our closed state tracker is OK and happens >>>>>>>>>>>> all the >>>>>>>>>>>> time FYI. What we can't have is irreparable breakage. In short, >>>>>>>>>>>> we just >>>>>>>>>>>> need to workout a solution that doesn't leave us with our hands >>>>>>>>>>>> tied to fix >>>>>>>>>>>> it. >>>>>>>>>>>> >>>>>>>>>>>> And in fact, Roland's suggestion of requiring all TGSI producers >>>>>>>>>>>> including >>>>>>>>>>>> SVIEW dcl does not require fixing any closed-source state tracker >>>>>>>>>>>> but rather >>>>>>>>>>>> just fix TGSI generators in Mesa to put the SVIEW. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Hmm, well there are a *lot* of things in mesa which generate tgsi >>>>>>>>>>> (vl, >>>>>>>>>>> gallium9, etc).. fixing them all, esp. when I don't necessarily >>>>>>>>>>> know >>>>>>>>>>> all the other state trackers well enough to know what they >>>>>>>>>>> *should* be >>>>>>>>>>> doing, didn't seem reasonable. >>>>>>>>>>> >>>>>>>>>>> I was hoping we could just leave SVIEW decl as optional to avoid >>>>>>>>>>> having to fix up everything. It just means that something which >>>>>>>>>>> does >>>>>>>>>>> use SVIEW/SAMPLE* needs to avoid having SVIEW index's which >>>>>>>>>>> conflict >>>>>>>>>>> with SAMP decl indexes. >>>>>>>>>>> >>>>>>>>>>> Although maybe w/ the draw stuff, if it is potentially inserting >>>>>>>>>>> SAMP/TEX* into shaders already using SVIEW/SAMPLE*, maybe it needs >>>>>>>>>>> to >>>>>>>>>>> take care to insert SAMP decl's with #'s that start above max >>>>>>>>>>> SVIEW >>>>>>>>>>> index. I'm not sure, I haven't used the aux/draw stuff yet, so >>>>>>>>>>> I'm >>>>>>>>>>> not terribly familiar with it. >>>>>>>>>>> >>>>>>>>>>> Not sure if it would help, but I could add an shader cap and for >>>>>>>>>>> now >>>>>>>>>>> only insert the SVIEW decl for drivers that want them. Which I >>>>>>>>>>> think, >>>>>>>>>>> at the moment, is just tgsi_to_nir (and currently the two >>>>>>>>>>> tgsi_to_nir >>>>>>>>>>> users, vc4 and freedreno, use any of the aux/draw stuff). >>>>>>>>>>> >>>>>>>>>>>> Anyway I really like your approach, appreciate your perseverance >>>>>>>>>>>> so far, and >>>>>>>>>>>> want to see it through. So please just bear with us a bit more. >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> fair enough.. I was mostly just trying to avoid having to fix >>>>>>>>>>> massive >>>>>>>>>>> amounts of state trackers, etc, which I am not necessarily >>>>>>>>>>> terribly >>>>>>>>>>> familiar with or even have a way to test. >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Roland, would it be possible to swap the >>>>>>>>>>>> `shader->info.base.file_max[TGSI_FILE_SAMPLER_VIEW] != -1` >>>>>>>>>>>> condition with >>>>>>>>>>>> something else? Just temporarily, so that we can decouple >>>>>>>>>>>> things, allowing >>>>>>>>>>>> to commit this as is, then enforce the SVIEW everywhere on a 2nd >>>>>>>>>>>> round. >>>>>>>>>>>> >>>>>>>>>>>> Specially because enforcing this everywhere will require some >>>>>>>>>>>> work, >>>>>>>>>>>> sepcially on u_blit / u_mipmap_gen helpers, which need yet >>>>>>>>>>>> another key. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> For example, what about actually using the TGSI_OPCODE_SAMPLE_* & >>>>>>>>>>>> friends >>>>>>>>>>>> opcode count? In fact, we could go even farther, and compute the >>>>>>>>>>>> opcount >>>>>>>>>>>> for TGSI_OPCODE_TEX & friends and ensure that they are never >>>>>>>>>>>> non-zero >>>>>>>>>>>> simultanouesly. >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> hmm, possibly something I didn't understand properly.. if you >>>>>>>>>>> never >>>>>>>>>>> have both TEX* and SAMPLE* at same time, this sounds much easier.. >>>>>>>>>>> Ie. just use # of opcodes instead of file_max. Could even just >>>>>>>>>>> add a >>>>>>>>>>> trivial thing to tgsi_scan to count up # of opcodes of each >>>>>>>>>>> category >>>>>>>>>>> (ie. TEX* and SAMPLE*) >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> That doesn't tell you how many dcls (either samplers or sampler >>>>>>>>>> views) >>>>>>>>>> you actually have. Even with GL reusing the same texture unit with >>>>>>>>>> a >>>>>>>>>> different instruction is of course possible (and quite frequently >>>>>>>>>> done). >>>>>>>>>> Though you are right that because both sample and tex opcodes can't >>>>>>>>>> be >>>>>>>>>> used in the same shader, scanning the shader could determine if >>>>>>>>>> it's >>>>>>>>>> either tex or sample and then use the appropriate shader file if >>>>>>>>>> we'd >>>>>>>>>> want to keep unused views out of the shader. That may also be ok as >>>>>>>>>> an >>>>>>>>>> interim measure. >>>>>>>>> >>>>>>>>> >>>>>>>>> yeah, that was basically my idea.. not so much caring about the >>>>>>>>> count, >>>>>>>>> as much of which one is non-zero.. >>>>>>>>> >>>>>>>>> BR, >>>>>>>>> -R >>>>>>>>> >>>>>>>>>> Roland >>>>>>>>>> >>>>>>>>>> >>>>>> >> _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev