On Thu, Jan 9, 2014 at 11:28 AM, Ian Romanick <i...@freedesktop.org> wrote:
> On 01/08/2014 12:43 PM, Matt Turner wrote:
>> +/**
>> + * \file opt_vectorize.cpp
>> + *
>> + * Combines scalar assignments of the same expression (modulo swizzle) to
>> + * multiple channels of the same variable into a single vectorized expression
>> + * and assignment.
>> + *
>> + * Many generated shaders contain scalarized code. That is, they contain
>> + *
>> + *    r1.x = log2(v0.x);
>> + *    r1.y = log2(v0.y);
>> + *    r1.z = log2(v0.z);
>> + *
>> + * rather than
>> + *
>> + *    r1.xyz = log2(v0.xyz);
>> + *
>> + * We look for consecutive assignments of the same expression (modulo swizzle)
>> + * to each channel of the same variable.
>> + *
>> + * For instance, we want to convert these three scalar operations
>> + *
>> + *    (assign (x) (var_ref r1) (expression float log2 (swiz x (var_ref v0))))
>> + *    (assign (y) (var_ref r1) (expression float log2 (swiz y (var_ref v0))))
>> + *    (assign (z) (var_ref r1) (expression float log2 (swiz z (var_ref v0))))
>> + *
>> + * into a single vector operation
>> + *
>> + *    (assign (xyz) (var_ref r1) (expression vec3 log2 (swiz xyz (var_ref v0))))
>
> I think it's worth adding a note that this pass only attempts to combine
> assignments that are sequential.

That comment block already says that:

+ * We look for consecutive assignments of the same expression (modulo swizzle)
+ * to each channel of the same variable.

I'll change the first comment to use the word consecutive.

> The above example gets fully
> vectorized, but this sequence would not:
>
>    (assign (x) (var_ref r1) (expression float log2 (swiz x (var_ref v0))))
>    (assign (x) (var_ref r2) (expression float log2 (swiz y (var_ref v0))))
>    (assign (y) (var_ref r1) (expression float log2 (swiz z (var_ref v0))))
>    (assign (y) (var_ref r2) (expression float log2 (swiz w (var_ref v0))))
>
> I think this will also break on code like
>
>    (assign (x) (var_ref r1) (expression float log2 (swiz w (var_ref r1))))
>    (assign (y) (var_ref r1) (expression float log2 (swiz z (var_ref r1))))
>    # r1.xy have different values now.
>    (assign (z) (var_ref r1) (expression float log2 (swiz y (var_ref r1))))
>    (assign (w) (var_ref r1) (expression float log2 (swiz x (var_ref r1))))
>
> Maybe just skip assignments where the LHS also appears in the RHS for
> now? Or does the check write_mask_matches_swizzle take care of this?

It won't break because the code rejects expressions that contain
swizzles that don't match the LHS's write mask. See the call to
write_mask_matches_swizzle().

The good thing about this is that we can combine expressions that use
the LHS, like

   (assign (x) (var_ref r1) (expression float log2 (swiz x (var_ref r1))))
   (assign (y) (var_ref r1) (expression float log2 (swiz y (var_ref r1))))
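
For anyone following along, the rule can be sketched in a few lines of
standalone C++. This is not the code from opt_vectorize.cpp. The names
ScalarAssign, VectorAssign, channel_bit, mask_matches_swizzle, can_extend
and vectorize are all made up for illustration, and the IR is reduced to
a one-operand expression. It only tries to show why a swizzle that
matches the write mask makes it safe to merge consecutive assignments,
even when the LHS variable also appears on the RHS.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a scalar IR assignment such as
//   (assign (x) (var_ref r1) (expression float log2 (swiz x (var_ref v0))))
struct ScalarAssign {
    std::string dest;       // variable written, e.g. "r1"
    unsigned dest_channel;  // 0 = x, 1 = y, 2 = z, 3 = w
    std::string op;         // e.g. "log2"
    std::string src;        // variable read, e.g. "v0"
    unsigned src_channel;   // channel selected by the operand's swizzle
};

// Result of merging: one bit per written channel (x = 1, y = 2, z = 4, w = 8).
struct VectorAssign {
    std::string dest;
    unsigned write_mask;
    std::string op;
    std::string src;
};

// Write-mask bit for a single channel.
static unsigned channel_bit(unsigned channel)
{
    assert(channel < 4);
    return 1u << channel;
}

// Illustrative version of the check discussed above: the operand's swizzle
// must select the same channel that the assignment writes.
static bool mask_matches_swizzle(const ScalarAssign &a)
{
    return a.dest_channel == a.src_channel;
}

// Can `a` be folded into the vector assignment currently being built?
static bool can_extend(const VectorAssign &v, const ScalarAssign &a)
{
    return mask_matches_swizzle(a) &&
           v.dest == a.dest && v.op == a.op && v.src == a.src &&
           (v.write_mask & channel_bit(a.dest_channel)) == 0;
}

// Merge runs of consecutive, compatible scalar assignments. Anything that
// does not qualify is emitted unchanged as a single-channel assignment.
static std::vector<VectorAssign> vectorize(const std::vector<ScalarAssign> &in)
{
    std::vector<VectorAssign> out;
    bool open = false;  // may the last element of `out` still be extended?
    for (const ScalarAssign &a : in) {
        if (open && can_extend(out.back(), a)) {
            out.back().write_mask |= channel_bit(a.dest_channel);
            continue;
        }
        VectorAssign v{a.dest, channel_bit(a.dest_channel), a.op, a.src};
        out.push_back(v);
        open = mask_matches_swizzle(a);
    }
    return out;
}
```

Under these assumptions, the three log2 assignments from the comment
block collapse into one assignment with write mask xyz, the two
r1.x/r1.y assignments that read their own channel collapse into one with
write mask xy, and the w/z/y/x sequence stays as four separate
assignments because no swizzle matches its write mask.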