On Fri, May 8, 2015 at 11:11 AM, Ian Romanick <i...@freedesktop.org> wrote:
> On 05/08/2015 03:36 AM, Kenneth Graunke wrote:
>> According to Glenn, shifts on R600 have 5x the throughput as multiplies.
>>
>> Intel GPUs have strange integer multiplication restrictions - on most
>> hardware, MUL actually only does a 32-bit x 16-bit multiply. This
>> means the arguments aren't commutative, which can limit our constant
>> propagation options. SHL has no such restrictions.
>>
>> Shifting is probably reasonable on most people's hardware, so let's just
>> do that.
>>
>> i965 shader-db results (using NIR for VS):
>> total instructions in shared programs: 7432587 -> 7388982 (-0.59%)
>> instructions in affected programs:     1360411 -> 1316806 (-3.21%)
>> helped:                                5772
>> HURT:                                  0
>>
>> Signed-off-by: Kenneth Graunke <kenn...@whitecape.org>
>> Cc: matts...@gmail.com
>> Cc: ja...@jlekstrand.net
>> ---
>>  src/glsl/nir/nir_opt_algebraic.py | 5 +++++
>>  1 file changed, 5 insertions(+)
>>
>> So...I found a bizarre issue with this patch.
>>
>>    (('imul', 4, a), ('ishl', a, 2)),
>>
>> totally optimizes things. However...
>>
>>    (('imul', a, 4), ('ishl', a, 2)),
>>
>> doesn't seem to do anything, even though imul is commutative, and nir_search
>> should totally handle that...
>>
>> ▄▄ ▄▄ ▄▄ ▄▄▄▄▄▄▄▄ ▄▄▄▄▄ ▄▄
>> ██ ██ ████ ▀▀▀██▀▀▀ █▀▀▀▀██ ██
>> ▀█▄ ██ ▄█▀ ████ ██ ▄█▀ ██
>> ██ ██ ██ ██ ██ ██ ▄██▀ ██
>> ███▀▀███ ██████ ██ ██ ▀▀
>> ███ ███ ▄██ ██▄ ██ ▄▄ ▄▄
>> ▀▀▀ ▀▀▀ ▀▀ ▀▀ ▀▀ ▀▀ ▀▀
>>
>> If you know why, let me know, otherwise I may have to look into it when more
>> awake.
>
> I've noticed a couple other weird things that I have been unable to
> understand. Shaders like the one below end with fmul/ffma instead of
> flrp, for example. I understand why that happens from GLSL IR
> opt_algebraic, but it seems like nir_opt_algebraic should handle it.

Just a guess, but it's quite possibly due to the commutative
operations bug I just sent a patch for.
--Jason

> [require]
> GLSL >= 1.30
>
> [vertex shader]
> in vec4 v;
> in vec2 tc_in;
>
> out vec2 tc;
>
> void main() {
>     gl_Position = v;
>     tc = tc_in;
> }
>
> [fragment shader]
> in vec2 tc;
>
> out vec4 color;
>
> uniform sampler2D s;
> uniform float a;
> uniform vec3 base_color;
>
> void main() {
>     vec3 tex_color = texture(s, tc).xyz;
>
>     color.xyz = (base_color * a) + (tex_color * (1.0 - a));
>     color.a = 1.0;
> }
>
>
>
>> diff --git a/src/glsl/nir/nir_opt_algebraic.py b/src/glsl/nir/nir_opt_algebraic.py
>> index 400d60e..350471f 100644
>> --- a/src/glsl/nir/nir_opt_algebraic.py
>> +++ b/src/glsl/nir/nir_opt_algebraic.py
>> @@ -247,6 +247,11 @@ late_optimizations = [
>>     (('fge', ('fadd', a, b), 0.0), ('fge', a, ('fneg', b))),
>>     (('feq', ('fadd', a, b), 0.0), ('feq', a, ('fneg', b))),
>>     (('fne', ('fadd', a, b), 0.0), ('fne', a, ('fneg', b))),
>> +
>> +   # Multiplication by 4 comes up fairly often in indirect offset calculations.
>> +   # Some GPUs have weird integer multiplication limitations, but shifts should work
>> +   # equally well everywhere.
>> +   (('imul', 4, a), ('ishl', a, 2)),
>
> This should be conditionalized on whether the platform has native integers.
>
>>  ]
>>
>>  print nir_algebraic.AlgebraicPass("nir_opt_algebraic", optimizations).render()
>>
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
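
To illustrate the commutativity point discussed above, here is a minimal Python sketch of a pattern matcher that tries both operand orders for commutative opcodes, so one rule written as ('imul', 4, a) also fires on ('imul', a, 4). This is a toy, not Mesa's nir_search code: the names COMMUTATIVE, match, substitute and rewrite, and the tuple encoding of expressions, are all assumptions made up for the example.

```python
# Toy illustration only; this is NOT Mesa's nir_search implementation.
# Expressions are tuples (opcode, operand, operand); leaves are ints
# (constants) or strings (named values). All names here are hypothetical.

COMMUTATIVE = {"imul", "iadd"}  # assumed set of commutative opcodes

# The rule from the patch, written once, with the constant first.
RULE = (("imul", 4, "a"), ("ishl", "a", 2))


def match(pattern, expr, bindings):
    """Return True if expr matches pattern, recording variables in bindings."""
    if isinstance(pattern, str):
        # A bare string in a pattern is a variable such as 'a'.
        if pattern in bindings:
            return bindings[pattern] == expr
        bindings[pattern] = expr
        return True
    if isinstance(pattern, int):
        # A constant in the pattern must match the same constant exactly.
        return isinstance(expr, int) and expr == pattern
    # Otherwise the pattern is an (opcode, operand, operand) tuple.
    if (not isinstance(expr, tuple) or len(expr) != len(pattern)
            or expr[0] != pattern[0]):
        return False
    orders = [expr[1:]]
    if pattern[0] in COMMUTATIVE and len(expr) == 3:
        # For commutative opcodes, also try the swapped operand order.
        orders.append((expr[2], expr[1]))
    for operands in orders:
        trial = dict(bindings)  # do not keep bindings from a failed attempt
        if all(match(p, e, trial) for p, e in zip(pattern[1:], operands)):
            bindings.clear()
            bindings.update(trial)
            return True
    return False


def substitute(replacement, bindings):
    """Build the replacement expression using the matched variables."""
    if isinstance(replacement, str):
        return bindings[replacement]
    if isinstance(replacement, tuple):
        return (replacement[0],) + tuple(
            substitute(r, bindings) for r in replacement[1:])
    return replacement


def rewrite(expr):
    """Apply RULE at the root of expr if it matches; otherwise return expr."""
    bindings = {}
    if match(RULE[0], expr, bindings):
        return substitute(RULE[1], bindings)
    return expr
```

With this sketch, rewrite(("imul", 4, "x")) and rewrite(("imul", "x", 4)) both produce the same shift expression, while a multiply by any other constant is left alone. A matcher that only tried the written operand order would show the behavior Kenneth describes, where only one of the two spellings is optimized.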