On 23 August 2013 13:19, Matt Turner <matts...@gmail.com> wrote: > On Fri, Aug 23, 2013 at 8:55 AM, Paul Berry <stereotype...@gmail.com> > wrote: > > On 22 August 2013 16:08, Matt Turner <matts...@gmail.com> wrote: > >> > >> --- > >> src/glsl/builtins/ir/frexp.ir | 25 > >> +++++++++++++++++++++++++ > >> src/glsl/builtins/ir/ldexp.ir | 25 > >> +++++++++++++++++++++++++ > >> src/glsl/builtins/profiles/ARB_gpu_shader5.glsl | 10 ++++++++++ > >> 3 files changed, 60 insertions(+) > >> create mode 100644 src/glsl/builtins/ir/frexp.ir > >> create mode 100644 src/glsl/builtins/ir/ldexp.ir > >> > >> diff --git a/src/glsl/builtins/ir/frexp.ir b/src/glsl/builtins/ir/ > frexp.ir > >> new file mode 100644 > >> index 0000000..a514994 > >> --- /dev/null > >> +++ b/src/glsl/builtins/ir/frexp.ir > >> @@ -0,0 +1,25 @@ > >> +((function frexp > >> + (signature float > >> + (parameters > >> + (declare (in) float x) > >> + (declare (out) int exp)) > >> + ((return (expression float frexp (var_ref x) (var_ref exp))))) > > > > > > Having an ir_expression that writes to one of its parameters is going to > > break assumptions in a lot of our optimization passes. > > I'm concerned that that may be a problem we have to solve anyway. > > While our hardware doesn't support an frexp instruction (like e.g., > AMD does) and we could probably do what you suggest, we do have > instructions that correspond directly to the uaddCarry() and > usubBorrow() built-ins in this same extension. They return a value and > also have an out parameter. > > genUType uaddCarry(genUType x, genUType y, out genUType carry); > genUType usubBorrow(genUType x, genUType y, out genUType borrow); > > We could probably avoid the problem you describe by lowering them, but > it's feeling increasingly distasteful. > > Your code would make a good piglit test. I'll do some experiments. >
Hmm, interesting. The way LLVM solves this problem, as I understand it, is through so-called "intrinsic functions" (http://llvm.org/docs/LangRef.html#intrinsic-functions). I wonder if we should start doing that in Mesa. Briefly, here is what it would look like, using uaddCarry as an example: 1. First we do an inefficient implementation of uaddCarry in terms of existing GLSL functions, much like you did for frexp in your frexp_to_arith lowering pass, except that we do it in src/glsl/builtins/glsl/uaddCarry.glsl, so it's a little easier to review :). Optimization passes already deal with function "out" parameters properly, and function inlining automatically splices in the proper code during linking. 2. For back-ends that don't have an efficient native way to do uaddCarry, we're done. The uaddCarry function works as is. 3. For back-ends that do have an efficient way to do uaddCarry, we add a mechanism to allow the back-end to tell the linker: "don't inline the definition of this built-in. Just leave it as an ir_call because I have my own special implementation of it"*. 4. In the back-end visitor code, the ir_call visitor looks at the name of the function being called. If it's "uaddCarry", then the back-end visitor just emits the efficient back-end code. Any other ir_calls should have been eliminated by the function inlining. *We'll need to be careful to make sure that the right thing happens if the user overrides uaddCarry with their own user-defined function, of course :) Now that I've actually thought through it, I'm really excited about this idea. It seems way more straightforward than what we are currently doing (e.g. in lower_packing_builtins.cpp), and it works nicely with the other back-ends because if a back-end doesn't advertise an intrinsic definition of a given function, it automtically gets the version declared in src/glsl/builtins without having to do any extra work. What do you think?
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev