Hi, I have been looking into this bug:
Compiling of shader gets stuck in infinite loop
https://bugs.freedesktop.org/show_bug.cgi?id=78468

Although this occurs at link time, once the Intel driver has run some of its specific lowering passes, it looks like the problem could hit other drivers if the right conditions are met, since the actual problem happens inside common optimization passes.

I reproduced the problem with a very simple shader like this:

    uniform sampler2D tex;
    out vec4 FragColor;
    void main()
    {
       vec4 col = texture(tex, vec2(0, 0));
       for (int i=0; i<30; i++)
          col += vec4(0.1, 0.1, 0.1, 0.1);
       col = vec4(col.rgb / 2.0, col.a);
       FragColor = col;
    }

For this shader, I traced the problem down to the fact that do_tree_grafting() is generating instructions like this:

    (assign (x) (var_ref flattening_tmp_y@116) (expression float *
    (swiz x (expression float + (swiz x (expression float +
    (swiz x (expression float + (swiz x (expression float +
    (swiz x (expression float + (swiz x (expression float +
    (swiz x (expression float + (swiz x (expression float +
    (swiz x (expression float + (swiz x (expression float +
    (swiz x (expression float + (swiz x (expression float +
    (swiz x (expression float + (swiz x (expression float +
    (swiz x (expression float + (swiz x (expression float +
    (swiz x (expression float + (swiz x (expression float +
    (swiz x (expression float + (swiz x (expression float +
    (swiz x (expression float + (swiz x (expression float +
    (swiz x (expression float + (swiz x (expression float +
    (swiz x (expression float + (swiz x (expression float +
    (swiz x (expression float + (swiz x (expression float +
    (swiz x (expression float +
    (var_ref col_y) (constant float (0.100000)) ) )
    (constant float (0.100000)) ) )(constant float (0.100000)) ) )
    (constant float (0.100000)) ) )(constant float (0.100000)) ) )
    (constant float (0.100000)) ) )(constant float (0.100000)) ) )
    (constant float (0.100000)) ) )(constant float (0.100000)) ) )
    (constant float (0.100000)) ) )(constant float (0.100000)) ) )
    (constant float (0.100000)) ) )(constant float (0.100000)) ) )
    (constant float (0.100000)) ) )(constant float (0.100000)) ) )
    (constant float (0.100000)) ) )(constant float (0.100000)) ) )
    (constant float (0.100000)) ) )(constant float (0.100000)) ) )
    (constant float (0.100000)) ) )(constant float (0.100000)) ) )
    (constant float (0.100000)) ) )(constant float (0.100000)) ) )
    (constant float (0.100000)) ) )(constant float (0.100000)) ) )
    (constant float (0.100000)) ) )(constant float (0.100000)) ) )
    (constant float (0.100000)) ) )(constant float (0.100000)) ) )
    (constant float (0.500000)) ) )

When we feed these to do_constant_folding(), it takes forever to finish. For this shader in particular, removing the tree grafting pass from do_common_optimization eliminates the problem.

Notice that small, seemingly irrelevant changes to the shader code can prevent this from happening. For example, if we initialize 'col' to something like vec4(0,0,0,0) instead of using the texture function, or if we remove the division by 2.0 in the last assignment to 'col', these instructions are never produced and the shader compiles okay.

The number of iterations in the loop also matters. With too many, we do not unroll the loop and the problem never happens. With too few, rather than generating a huge tree of expressions like the one above, we generate something like this and the problem, again, does not happen (notice how it adds 0.1 nine times to make 0.9, rather than chaining 9 add expressions, for 10 iterations of the loop):

    (assign (x) (var_ref flattening_tmp_y) (expression float *
    (expression float + (constant float (0.900000)) (var_ref col_y) )
    (constant float (0.500000)) ) )

So whether we generate a huge chunk of expressions or not seems to depend on a number of factors, but when the right conditions are met we can generate code that stalls compilation forever.

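In case it helps anyone who wants to probe which conditions trigger this, here is a minimal sketch of a script that emits variants of the test shader with the three knobs mentioned above (loop iteration count, texture vs. constant initialization, and the final division by 2.0). The function name make_shader and its parameters are made up for illustration and are not part of Mesa; the script only generates shader text and does not compile anything.

```python
# Hypothetical helper (not part of Mesa): builds variants of the test
# shader so the trigger conditions can be explored one at a time.
def make_shader(iterations=30, use_texture=True, halve_rgb=True):
    # Either sample the texture (as in the failing shader) or start
    # from a constant, which reportedly avoids the problem.
    init = "texture(tex, vec2(0, 0))" if use_texture else "vec4(0, 0, 0, 0)"
    lines = [
        "uniform sampler2D tex;",
        "out vec4 FragColor;",
        "void main()",
        "{",
        f"   vec4 col = {init};",
        f"   for (int i=0; i<{iterations}; i++)",
        "      col += vec4(0.1, 0.1, 0.1, 0.1);",
    ]
    # The division by 2.0 is optional; dropping it reportedly avoids
    # the problem as well.
    if halve_rgb:
        lines.append("   col = vec4(col.rgb / 2.0, col.a);")
    lines += ["   FragColor = col;", "}"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the failing variant and a small-loop variant for comparison.
    for n in (30, 10):
        print(f"// {n} iterations")
        print(make_shader(iterations=n))
```

With the defaults, make_shader() reproduces the shader above; changing one argument at a time gives the variants that compile okay.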
Reading what tree grafting is supposed to do, this does not seem to be an unexpected result, so I wonder what the right way to fix this would be. It looks like we would want to do whatever we do when the loop has only a few iterations, but I don't know why we generate different code in that case, and I am not familiar enough with all the optimization and lowering passes to assess what would make sense here. So, any suggestions?

Iago