On Wed, 8 Sep 2010 18:57:46 +0200, Luca Barbieri <l...@luca-barbieri.com> wrote:
> > I'd been wanting to do this. Only, I was thinking that instead of
> > adding an ir_binop_all_equal and ir_binop_any_equal, those would just be
> > expressed as not(any(nequal())) and any(equal()). And I say that as
> > probably one of the few that has a backend that wants to recognize
> > all_equal. What do you think?
>
> I think it makes sense: by the way, ir_to_mesa emits the current
> nequal, or my new any_nequal, exactly as it does emit any(equal()).
>
> Of course, what ir_to_mesa does is not really optimal, because it
> should use predicates/condition codes, which are however badly
> supported everywhere.
>
> In general if(any(nequal(a, b))) should become, in pseudo-code,
> assuming a vector predicate register,
>
> SNE_update_pred NONE, a, b
> IFC pred.xyzw:
>
> and certainly not anything using DP4 for optimal performance on
> hardware that has predicates like nv30/nv40.
>
> Modern/scalar hardware would probably prefer that representation too,
> since unlike DP4 it can be readily scalarized.
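For reference, the decomposition discussed in the quoted text can be checked with a small Python model. This is only a sketch: vectors are modelled as plain Python lists, and the helper names (equal, nequal, all_equal, any_equal) are hypothetical stand-ins for the IR operations, not Mesa code.

```python
from typing import List, Sequence


def equal(a: Sequence[float], b: Sequence[float]) -> List[bool]:
    """Component-wise equality, like GLSL equal()."""
    return [x == y for x, y in zip(a, b)]


def nequal(a: Sequence[float], b: Sequence[float]) -> List[bool]:
    """Component-wise inequality, like GLSL notEqual()."""
    return [x != y for x, y in zip(a, b)]


def all_equal(a: Sequence[float], b: Sequence[float]) -> bool:
    """all_equal expressed as not(any(nequal())), as proposed in the thread."""
    return not any(nequal(a, b))


def any_equal(a: Sequence[float], b: Sequence[float]) -> bool:
    """any_equal expressed as any(equal())."""
    return any(equal(a, b))
```

With this model, all_equal([1.0, 2.0], [1.0, 2.0]) is true and all_equal([1.0, 2.0], [1.0, 3.0]) is false, while any_equal is true for both pairs because the first components match.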
As far as scalar hardware goes, right now in the 965 FS backend for:

        if (any(lessThan(args, vec4(3.0))))
                gl_FragColor = vec4(0.0, 1.0, 0.0, 0.0);
        else
                gl_FragColor = vec4(1.0, 0.0, 0.0, 0.0);

I'm getting:

(expression bool ||
  (swiz w (expression bool < (swiz w (var_ref a...@0x8753010) )(constant float (3.000000)) ) )
  (expression bool ||
    (swiz z (expression bool < (swiz z (var_ref a...@0x8753010) )(constant float (3.000000)) ) )
    (expression bool ||
      (expression bool < (swiz x (var_ref a...@0x8753010) )(constant float (3.000000)) )
      (swiz y (expression bool < (swiz y (var_ref a...@0x8753010) )(constant float (3.000000)) ) )) ) )

mov(8) g19<1>F g3.3<0,1,0>F { align1 };
mov(8) g20<1>F 3F { align1 };
cmp.l(8) g21<1>D g19<8,8,1>F g20<8,8,1>F { align1 };
and(8) g21<1>D g21<8,8,1>D 1D { align1 };
mov(8) g22<1>D g24<8,8,1>D { align1 };
mov(8) g23<1>F g3.2<0,1,0>F { align1 };
mov(8) g24<1>F 3F { align1 };
cmp.l(8) g25<1>D g23<8,8,1>F g24<8,8,1>F { align1 };
and(8) g25<1>D g25<8,8,1>D 1D { align1 };
mov(8) g26<1>D g27<8,8,1>D { align1 };
mov(8) g27<1>F g3<0,1,0>F { align1 };
mov(8) g28<1>F 3F { align1 };
cmp.l(8) g29<1>D g27<8,8,1>F g28<8,8,1>F { align1 };
and(8) g29<1>D g29<8,8,1>D 1D { align1 };
mov(8) g30<1>F g3.1<0,1,0>F { align1 };
mov(8) g31<1>F 3F { align1 };
cmp.l(8) g32<1>D g30<8,8,1>F g31<8,8,1>F { align1 };
and(8) g32<1>D g32<8,8,1>D 1D { align1 };
mov(8) g33<1>D g33<8,8,1>D { align1 };
or(8) g34<1>D g29<8,8,1>D g33<8,8,1>D { align1 };
or(8) g35<1>D g26<8,8,1>D g34<8,8,1>D { align1 };
or(8) g36<1>D g22<8,8,1>D g35<8,8,1>D { align1 };
mov.ne(8) null g36<8,8,1>D { align1 };
(+f0) if(8) ip 18D { align1 switch };
[...]

So the current implementation of if(any()) is looking pretty OK ("or
or or mov.ne if"), though there's a register dependency that could be
eliminated with a bit of juggling, and we could move the cond update
up to the last or. Also, I'm not sure whether to make bools be 0,1
immediately at cmp time, or to do that at b2f/b2i time.
(ignoring gratuitous moves in that code that will get eliminated
later, and obviously register allocation isn't done.)

If you're working on a driver for a scalar chip, you might want to
pull brw_fs_channel_expressions and brw_fs_vector_splitting up and get
them used -- it should make sensible codegen a lot easier for them.

Hmm, that dependency thing might be nice in general. Take the
or(or(or(a,b),c),d) and make or(or(a,b), or(c,d)) when you've got an
associative operator (probably make sure component counts are equal,
or the type changes can be painful).
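To make the rebalancing idea concrete, here is a rough Python sketch of it. It is not Mesa code: the Op class and the flatten/balance/rebalance/depth helpers are hypothetical, operands are modelled as plain strings, and the component-count check mentioned above is left out. It only reassociates (operand order is preserved), so it relies on associativity alone.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Op:
    """A binary expression node; leaves are plain strings (hypothetical model)."""
    name: str
    lhs: "Expr"
    rhs: "Expr"


Expr = Union[str, Op]


def flatten(e: Expr, name: str) -> List[Expr]:
    """Collect the operands of a chain of `name` operators, left to right."""
    if isinstance(e, Op) and e.name == name:
        return flatten(e.lhs, name) + flatten(e.rhs, name)
    return [e]


def balance(operands: List[Expr], name: str) -> Expr:
    """Rebuild the operands as a balanced tree, keeping their order."""
    if len(operands) == 1:
        return operands[0]
    mid = len(operands) // 2
    return Op(name, balance(operands[:mid], name), balance(operands[mid:], name))


def rebalance(e: Expr, name: str = "or") -> Expr:
    """Turn or(or(or(a,b),c),d) into or(or(a,b), or(c,d))."""
    return balance(flatten(e, name), name)


def depth(e: Expr) -> int:
    """Length of the longest dependency chain of operators."""
    if isinstance(e, Op):
        return 1 + max(depth(e.lhs), depth(e.rhs))
    return 0
```

For the four-operand chain in the example, the longest dependency chain drops from three operators to two, and the two inner ors no longer depend on each other.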
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev