On 07/02/2014 07:29 AM, Brian Paul wrote: > On 07/02/2014 06:49 AM, Brian Paul wrote: >> On 07/02/2014 06:36 AM, Brian Paul wrote: >>> On 07/01/2014 08:43 PM, Michel Dänzer wrote: >>>> On 02.07.2014 00:43, Brian Paul wrote: >>>>> Module: Mesa >>>>> Branch: master >>>>> Commit: f4b0ab7afd83c811329211eae8167c9bf238870c >>>>> URL: >>>>> https://urldefense.proofpoint.com/v1/url?u=http://cgit.freedesktop.org/mesa/mesa/commit/?id%3Df4b0ab7afd83c811329211eae8167c9bf238870c&k=oIvRg1%2BdGAgOoM1BIlLLqw%3D%3D%0A&r=lGQMzzTgII0I7jefp2FHq7WtZ%2BTLs8wadB%2BiIj9xpBY%3D%0A&m=X5dPXUSLXb%2B60RW5%2FHCgcIkPXz083MgfZAbiRJFVOxc%3D%0A&s=c9766b9e3797ccad30888c534fae5891f9a7313418a228c81c0528230412cf1c >>>>> >>>>> >>>>> >>>>> >>>>> Author: Brian Paul <bri...@vmware.com> >>>>> Date: Tue Jul 1 08:17:09 2014 -0600 >>>>> >>>>> st/mesa: fix incorrect size of UBO declarations >>>>> >>>>> UniformBufferSize is in bytes so we need to divide by 16 to get the >>>>> number of constant buffer slots. Also, the ureg_DECL_constant2D() >>>>> function takes first..last parameters so we need to subtract one >>>>> for the last value. >>>> >>>> This change broke the GLSL uniform_buffer fs/vs/gs-struct-pad piglit >>>> tests with radeonsi: >>>> >>>> ../../../../src/gallium/auxiliary/gallivm/lp_bld_tgsi.c:306:lp_build_emit_fetch: >>>> >>>> >>>> Assertion `reg->Register.Index <= >>>> bld_base->info->file_max[reg->Register.File]' failed. >>>> >>>> AFAICT reg->Register.Index is 2, and the new code calls >>>> ureg_DECL_constant2D() with last=1. This is the TGSI code: >>>> >>>> FRAG >>>> PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 >>>> DCL OUT[0], COLOR >>>> DCL CONST[1][0..1] >>>> DCL TEMP[0], LOCAL >>>> IMM[0] UINT32 {0, 16, 32, 0} >>>> 0: MOV TEMP[0].x, CONST[1][0].xxxx >>>> 1: MOV TEMP[0].y, CONST[1][1].xxxx >>>> 2: MOV TEMP[0].z, CONST[1][2].xxxx >>>> 3: MOV TEMP[0].w, CONST[1][2].xxxx >>>> 4: MOV OUT[0], TEMP[0] >>>> 5: END >>>> >>>> which results from this GLSL code: >>>> >>>> #version 140 >>>> >>>> struct S1 { >>>> float r; >>>> }; >>>> >>>> struct S2 { >>>> float g; >>>> float b; >>>> float a; >>>> }; >>>> >>>> struct S { >>>> S1 s1; >>>> S2 s2; >>>> }; >>>> >>>> uniform ubo1 { >>>> S s; >>>> }; >>>> >>>> void main() >>>> { >>>> gl_FragColor = vec4(s.s1.r, s.s2.g, s.s2.b, s.s2.a); >>>> } >>>> >>>> >>>> Something doesn't seem to add up, but I'm not sure where exactly the >>>> problem is or how to fix it. >>> >>> >>> I think there's at least a one GLSL compiler bug exposed here. >>> >>> From the code fragments above, the z and w components of TEMP[0] are >>> getting the same value when they should not. >>> >>> I reverted Mesa to before my changes and modified the piglit test to use >>> different values for the blue/alpha channels and it fails (nvidia >>> passes). We were just getting lucky before my changes. >>> >>> I suspect the code which is laying out the floats in the UBO is doing >>> something wrong, and the size of the UBO is miscalculated as a result. >>> >>> I'll dig a bit more but I think someone more familiar with the compiler >>> will need to help. >> >> The initial IR looks OK: >> >> >> (loop ... >> >> (assign (xyzw) (var_ref gl_Position) (array_ref (var_ref >> vertex_to_gs) (var_ref i) ) ) >> (declare (temporary ) vec4 vec_ctor) >> (assign (x) (var_ref vec_ctor) (record_ref (record_ref >> (var_ref s) s1) r) ) >> (assign (y) (var_ref vec_ctor) (record_ref (record_ref >> (var_ref s) s2) g) ) >> (assign (z) (var_ref vec_ctor) (record_ref (record_ref >> (var_ref s) s2) b) ) >> (assign (w) (var_ref vec_ctor) (record_ref (record_ref >> (var_ref s) s2) a) ) >> (assign (xyzw) (var_ref v) (var_ref vec_ctor) ) >> (call EmitVertex ()) >> >> ... endloop) >> >> >> But later, after linking, the Z and W components of the vector >> constructor seem to get merged. Note: there's only 3 ubo_load >> instructions now instead of four: >> >> ... >> (assign (xyzw) (var_ref gl_Position) (array_ref (var_ref >> vertex_to_gs) (constant int (0)) ) ) >> (declare (temporary ) vec4 vec_ctor) >> (declare () float cse) >> (assign (x) (var_ref cse) (expression float ubo_load (constant >> uint (0)) (constant uint (0)) ) ) >> (assign (x) (var_ref vec_ctor) (var_ref cse) ) >> (declare () float cse@3) >> (assign (x) (var_ref cse@3) (expression float ubo_load >> (constant uint (0)) (constant uint (16)) ) ) >> (assign (y) (var_ref vec_ctor) (var_ref cse@3) ) >> (declare () float cse@4) >> (assign (x) (var_ref cse@4) (expression float ubo_load >> (constant uint (0)) (constant uint (32)) ) ) >> (assign (z) (var_ref vec_ctor) (var_ref cse@4) ) >> (assign (w) (var_ref vec_ctor) (var_ref cse@4) ) >> (assign (xyzw) (var_ref v) (var_ref vec_ctor) ) >> (assign (xyzw) (var_ref v@5) (var_ref v) ) >> (assign (xyzw) (var_ref gl_Position@6) (var_ref gl_Position) ) >> (emit-vertex (constant int (0)) ) >> >> I'm not sure I can debug this much further right now. > > I think the problem is in lower_ubo_reference_visitor::handle_rvalue() > in lower_ubo_reference.cpp > > If I disable the "const_offset = glsl_align(const_offset, > max_field_align);" near line 231 things seem to work. > > Something is wrong in the logic that does struct/field alignment in that > function.
Oof. There are a number of bugs in this area. I've been looking at one for the last few days, and I've just about got it resolved. Basically. uniform U { layout(row_major) mat3x4 m; vec4 v; } u; doesn't layout u.m as row-major. This only occurs if there is an instance name. :( I can dig into this bug once I'm done with the matrix bug. > Eric wrote the original code, maybe he can take a look? > > I've also been reading the GL_ARB_uniform_buffer_object spec to > understand the rules of layout "std140". For a structure it says "the > base alignmentof the structure is <N>, where <N> is the largest base > alignment value of any of its members, and rounded up to the base > alignment of a vec4." > > So it sounds like all structs should start at 16-byte boundaries. So > for the piglit test in question, I'd expect the UBO block to be 32 > bytes. But both Mesa and nvidia compute the UBO size to be 16 bytes. Hmm... do both drivers give 32 for struct S1 { float f; }; struct S2 { vec4 v; }; uniform U { S1 s1; S2 s2; }; Also, the ES3.1 and GL4.4 specs have some changes to the std140 rules to make them more clear. Do those specs agree with the extension spec? > -Brian > > _______________________________________________ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev