On Tue, Jun 9, 2015 at 5:01 AM, Jose Fonseca <jfons...@vmware.com> wrote:
On 09/06/15 04:03, Rob Clark wrote:
On Mon, Jun 8, 2015 at 10:50 PM, Roland Scheidegger <srol...@vmware.com>
wrote:
Am 09.06.2015 um 04:40 schrieb Rob Clark:
On Mon, Jun 8, 2015 at 10:36 PM, Roland Scheidegger <srol...@vmware.com>
wrote:
Am 09.06.2015 um 04:20 schrieb Rob Clark:
On Mon, Jun 8, 2015 at 9:40 PM, Roland Scheidegger
<srol...@vmware.com> wrote:
Am 09.06.2015 um 03:20 schrieb Rob Clark:
On Mon, Jun 8, 2015 at 8:35 PM, Roland Scheidegger
<srol...@vmware.com> wrote:
Am 08.06.2015 um 20:15 schrieb Rob Clark:
From: Rob Clark <robcl...@freedesktop.org>
Freedreno needs sampler type information to deal with int/uint
textures.
To accomplish this, start creating sampler-view declarations, as
suggested here:
http://lists.freedesktop.org/archives/mesa-dev/2014-November/071583.html
create a sampler-view with index matching the sampler, to encode
the
texture type (ie. SINT/UINT/FLOAT). Ie:
DCL SVIEW[n], 2D, UINT
DCL SAMP[n]
TEX OUT[1], IN[1], SAMP[n]
For tgsi texture instructions which do not take an explicit SVIEW
argument, the SVIEW index is implied by the SAMP index.
I wonder if there should be some doc update somewhere.
yeah, perhaps.. I wasn't quite sure where, tgsi.rst only talks about
SVIEW in terms of the SAMPLE and related opc's (ie. the ones which
do
take an SVIEW arg), and the way things are with this patch, other
drivers simply need to ignore the extra SVIEW decl's and not
randomly
assert for things to continue working as before..
hypothetically if something in tree actually created SVIEW decl
currently, and the shader also had TEX* style instructions, it would
have to take care to not have conflicting SVIEW's. But afaict
nothing
in-tree creates sampler view decl's currently.
Signed-off-by: Rob Clark <robcl...@freedesktop.org>
---
src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 29
+++++++++++++++++++++++++++++
1 file changed, 29 insertions(+)
diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 0e60d95..ce8f2ea 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -239,6 +239,7 @@ public:
st_src_reg sampler; /**< sampler register */
int sampler_array_size; /**< 1-based size of sampler array, 1
if not array */
int tex_target; /**< One of TEXTURE_*_INDEX */
+ glsl_base_type tex_type;
GLboolean tex_shadow;
st_src_reg tex_offsets[MAX_GLSL_TEXTURE_OFFSET];
@@ -345,6 +346,8 @@ public:
int num_address_regs;
int samplers_used;
+ glsl_base_type sampler_types[PIPE_MAX_SAMPLERS];
+ int sampler_targets[PIPE_MAX_SAMPLERS]; /**< One of
TGSI_TEXTURE_* */
bool indirect_addr_consts;
int wpos_transform_const;
@@ -3323,6 +3326,8 @@ glsl_to_tgsi_visitor::visit(ir_texture *ir)
assert(!"Should not get here.");
}
+ inst->tex_type = ir->type->base_type;
+
this->result = result_src;
}
@@ -3471,6 +3476,11 @@ count_resources(glsl_to_tgsi_visitor *v,
gl_program *prog)
for (int i = 0; i < inst->sampler_array_size; i++) {
v->samplers_used |= 1 << (inst->sampler.index + i);
+ debug_assert(i < (int)ARRAY_SIZE(v->sampler_types));
+ v->sampler_types[i] = inst->tex_type;
+ v->sampler_targets[i] =
+ st_translate_texture_target(inst->tex_target,
inst->tex_shadow);
+
if (inst->tex_shadow) {
prog->ShadowSamplers |= 1 << (inst->sampler.index
+ i);
}
@@ -5529,7 +5539,26 @@ st_translate_program(
/* texture samplers */
for (i = 0; i <
ctx->Const.Program[MESA_SHADER_FRAGMENT].MaxTextureImageUnits; i++) {
if (program->samplers_used & (1 << i)) {
+ unsigned type;
+
t->samplers[i] = ureg_DECL_sampler(ureg, i);
+
+ switch (program->sampler_types[i]) {
+ case GLSL_TYPE_INT:
+ type = TGSI_RETURN_TYPE_SINT;
+ break;
+ case GLSL_TYPE_UINT:
+ type = TGSI_RETURN_TYPE_UINT;
+ break;
+ case GLSL_TYPE_FLOAT:
+ type = TGSI_RETURN_TYPE_FLOAT;
+ break;
+ default:
+ unreachable("not reached");
+ }
+
+ ureg_DECL_sampler_view( ureg, i,
program->sampler_targets[i],
+ type, type, type, type );
}
}
This indeed seems like a consistent solution. Hopefully drivers
don't
assert when they see sampler view dcls...
well, tgsi_to_nir did before I fixed it :-(
(mostly just because of liberal debug_assert() usage)
it's at least not something I'd land right before a release branch
point. But I guess the fix is easy enough (ie. remove a bogus
assert), and 10.7 branch point is quite some time from now..
I am also somewhat worried that this only changes the glsl-to-tgsi
path,
not other paths. llvmpipe and draw rely on either having sview
declarations for all (sample) instructions or for none, and with
draw
possibly inserting tex opcodes (for aalines etc.) on its own this
could
probably break (these draw paths aren't used with d3d10 sample
opcodes
as it's more or less illegal to use tex and sample opcodes in the
same
shader, but that doesn't really matter here). I think using sview
declarations is the right thing to do but it probably should be
illegal
then to NOT have them in some places.
fwiw, the way I implemented it in tgsi_to_nir (and what I'd
recommend
for other drivers) is that for the "tex style" opcodes, if there is
not a matching SVIEW decl, assume float. That seemed more sane than
fixing up *all* the other state trackers..
Yeah but for llvmpipe/draw it's not just the type. The problem is how
they are parsing this stuff - if there's at least one sview decl
they'll
construct the static texture state based on the defined sampler
views,
but if there's none (up to now for gl state tracker) they'll do this
based on the defined samplers, and for some there simply won't be any
such definitions (if draw inserted some stage). llvmpipe can handle
tex
opcodes without sview dcls, it should also be able to handle them if
you
have them (but I wouldn't bet on it without testing as that's
definitely
going to hit some code paths never seen before), but it can't handle
having "some" sview dcls (see for example lp_state_fs.c, line 3050
and
following why it can't work). This is fixable but only with some
awkwardness (since due to d3d10 sample opcodes not having 1:1 mapping
between samplers and sampler views). Hence my proposal of making it
mandatory to have sview decls instead of having them optionally. That
way could also ditch that code which distinguishes having sviews or
not.
hmm, I could squash in something like:
----
diff --git a/src/gallium/drivers/llvmpipe/lp_state_fs.c
b/src/gallium/drivers/llvmpipe/lp_state_fs.c
index b5ce868..2878c49 100644
--- a/src/gallium/drivers/llvmpipe/lp_state_fs.c
+++ b/src/gallium/drivers/llvmpipe/lp_state_fs.c
@@ -3059,27 +3059,11 @@ make_variant_key(struct llvmpipe_context *lp,
}
}
- /*
- * XXX If TGSI_FILE_SAMPLER_VIEW exists assume all texture opcodes
- * are dx10-style? Can't really have mixed opcodes, at least not
- * if we want to skip the holes here (without rescanning tgsi).
- */
- if (shader->info.base.file_max[TGSI_FILE_SAMPLER_VIEW] != -1) {
- key->nr_sampler_views =
shader->info.base.file_max[TGSI_FILE_SAMPLER_VIEW] + 1;
- for(i = 0; i < key->nr_sampler_views; ++i) {
- if(shader->info.base.file_mask[TGSI_FILE_SAMPLER_VIEW] & (1
<< i)) {
-
lp_sampler_static_texture_state(&key->state[i].texture_state,
-
lp->sampler_views[PIPE_SHADER_FRAGMENT][i]);
- }
- }
- }
- else {
- key->nr_sampler_views = key->nr_samplers;
- for(i = 0; i < key->nr_sampler_views; ++i) {
- if(shader->info.base.file_mask[TGSI_FILE_SAMPLER] & (1 <<
i)) {
-
lp_sampler_static_texture_state(&key->state[i].texture_state,
-
lp->sampler_views[PIPE_SHADER_FRAGMENT][i]);
- }
+ key->nr_sampler_views = key->nr_samplers;
+ for(i = 0; i < key->nr_sampler_views; ++i) {
+ if(shader->info.base.file_mask[TGSI_FILE_SAMPLER] & (1 << i)) {
+
lp_sampler_static_texture_state(&key->state[i].texture_state,
+
lp->sampler_views[PIPE_SHADER_FRAGMENT][i]);
}
}
}
No that won't work for d3d10 state trackers, where you typically have
more
views than samplers (though the opposite is possible as well and you
can
have holes in both). I think it really would look nicer if you could
just rely on sampler views being present.
what d3d10 state tracker?
Make that "some non-public code which uses the sample opcodes"...
well, not to be rude, but I can't really be expected to fix /
avoid-breaking code that doesn't exist (in upstream mesa)..
Rob,
Of course. Nobody expects or asked you to do such thing.
Roland's just explaining why your proposed llvmpipe workaround is not
sustainable. Breaking our closed state tracker is OK and happens all the
time FYI. What we can't have is irreparable breakage. In short, we just
need to workout a solution that doesn't leave us with our hands tied to fix
it.
And in fact, Roland's suggestion of requiring all TGSI producers including
SVIEW dcl does not require fixing any closed-source state tracker but rather
just fix TGSI generators in Mesa to put the SVIEW.
Hmm, well there are a *lot* of things in mesa which generate tgsi (vl,
gallium9, etc).. fixing them all, esp. when I don't necessarily know
all the other state trackers well enough to know what they *should* be
doing, didn't seem reasonable.
I was hoping we could just leave SVIEW decl as optional to avoid
having to fix up everything. It just means that something which does
use SVIEW/SAMPLE* needs to avoid having SVIEW index's which conflict
with SAMP decl indexes.
Although maybe w/ the draw stuff, if it is potentially inserting
SAMP/TEX* into shaders already using SVIEW/SAMPLE*, maybe it needs to
take care to insert SAMP decl's with #'s that start above max SVIEW
index. I'm not sure, I haven't used the aux/draw stuff yet, so I'm
not terribly familiar with it.
Not sure if it would help, but I could add an shader cap and for now
only insert the SVIEW decl for drivers that want them. Which I think,
at the moment, is just tgsi_to_nir (and currently the two tgsi_to_nir
users, vc4 and freedreno, use any of the aux/draw stuff).
Anyway I really like your approach, appreciate your perseverance so far, and
want to see it through. So please just bear with us a bit more.
fair enough.. I was mostly just trying to avoid having to fix massive
amounts of state trackers, etc, which I am not necessarily terribly
familiar with or even have a way to test.
Roland, would it be possible to swap the
`shader->info.base.file_max[TGSI_FILE_SAMPLER_VIEW] != -1` condition with
something else? Just temporarily, so that we can decouple things, allowing
to commit this as is, then enforce the SVIEW everywhere on a 2nd round.
Specially because enforcing this everywhere will require some work,
sepcially on u_blit / u_mipmap_gen helpers, which need yet another key.
For example, what about actually using the TGSI_OPCODE_SAMPLE_* & friends
opcode count? In fact, we could go even farther, and compute the opcount
for TGSI_OPCODE_TEX & friends and ensure that they are never non-zero
simultanouesly.
hmm, possibly something I didn't understand properly.. if you never
have both TEX* and SAMPLE* at same time, this sounds much easier..
Ie. just use # of opcodes instead of file_max. Could even just add a
trivial thing to tgsi_scan to count up # of opcodes of each category
(ie. TEX* and SAMPLE*)