On 8 December 2014 at 20:41, Vadim Girlin <vadimgir...@gmail.com> wrote:
> On 12/06/2014 07:13 AM, Vadim Girlin wrote:
>>
>> On 12/04/2014 01:43 AM, Dave Airlie wrote:
>>>
>>> Hi Vadim,
>>>
>>> I've been looking with Glenn's help into a bug in sb for a couple of
>>> weeks now triggered by a change in how GLSL generates switch
>>> statements.
>>>
>>> I understand you probably aren't too interested in r600g but I believe
>>> I'm hitting a design level problem and I would like some advice.
>>>
>>> So it appears that GLSL can create loops that don't repeat for switch
>>> statements, and it appears SB wasn't ready to handle such a thing.
>>
>>
>> Hi, Dave,
>>
>> I suspect we should rather get rid of such loops somehow, i.e. convert
>> to something else, the loop that never repeats is not really a loop
>> anyway. AFAICS "continue" is not supported in switch statements
>> according to GLSL specs, so the loops generated for switch will never be
>> repeated. Am I missing something? Even if repeating is possible somehow,
>> at least we can get rid of the loops that are not repeated.
>>
>> I think loops are less efficient than other control flow instructions on
>> r600g hw (at least because they increase stack usage), and possibly on
>> other hw too.
>>
>> In fact it seems sb basically gets rid of it already in IR, it just
>> doesn't know how to translate resulting control flow to ISA, because so
>> far it only supports specific control flow structure for if-then-else
>> that was previously preserved during optimizations. I think it may be
>> not very hard to implement support for that in finalizer, I'll look into
>> it.
>
>
> In fact handling that control flow in finalizer is not as easy as I hoped,
> probably impossible, at least if we want to make it efficient. I forgot
> about the limitations of R600 ISA.
>
> OTOH it seems I've managed to fix the issues with loops, the patch is
> attached (it's meant to be used instead of 7b0067d2). There are no piglit
> regressions on evergreen, but I didn't test any real apps.
>
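
To make the quoted point about non-repeating loops concrete, here is a minimal sketch of a hypothetical three-way switch (case 0, case 1, default), written once as a loop in which every path ends in a break and once as a plain if/else chain. The function names and the colour layout are assumptions made for illustration; this is a model in Python, not Mesa or sb code.

```python
# Hypothetical model of a switch lowered to a one-shot loop.
# All names here are illustrative; nothing below is taken from Mesa.

def switch_as_loop(selector):
    """Switch lowered to a loop whose every path ends in a break."""
    color = [0.0, 0.0, 0.0, 0.0]
    iterations = 0
    while True:
        iterations += 1
        if selector == 0:      # case 0
            color[0] = 1.0
            break
        if selector == 1:      # case 1
            color[1] = 1.0
            break
        color[2] = 1.0         # default
        break
    return color, iterations


def switch_as_if_chain(selector):
    """The same switch with the loop removed."""
    color = [0.0, 0.0, 0.0, 0.0]
    if selector == 0:
        color[0] = 1.0
    elif selector == 1:
        color[1] = 1.0
    else:
        color[2] = 1.0
    return color


if __name__ == "__main__":
    for selector in (0, 1, 2):
        color, iterations = switch_as_loop(selector)
        print(selector, color, iterations, switch_as_if_chain(selector))
```

Because there is no path back to the top of the loop, the body runs exactly once for any selector, which is why the two forms give the same result in this model.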
This fixes one thing, but the switches are still broken here on cayman at least

tests/spec/glsl-1.30/execution/switch/fs-default_last.shader_test

--------------------------------------------------------------
FRAG
PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1
DCL OUT[0], COLOR
DCL CONST[0]
DCL TEMP[0..2], LOCAL
IMM[0] FLT32 { 0.0000, 1.0000, 0.0000, 0.0000}
IMM[1] UINT32 {0, 4294967295, 0, 0}
IMM[2] INT32 {1, 0, 0, 0}
  0: MOV TEMP[0], IMM[0].xxxx
  1: MOV TEMP[1].x, IMM[1].xxxx
  2: BGNLOOP :0
  3:   UCMP TEMP[1].x, CONST[0].xxxx, TEMP[1].xxxx, IMM[1].yyyy
  4:   UIF TEMP[1].xxxx :0
  5:     MOV TEMP[0].x, IMM[0].yyyy
  6:     BRK
  7:   ENDIF
  8:   USEQ TEMP[2].x, IMM[2].xxxx, CONST[0].xxxx
  9:   UCMP TEMP[1].x, TEMP[2].xxxx, IMM[1].yyyy, TEMP[1].xxxx
 10:   UIF TEMP[1].xxxx :0
 11:     MOV TEMP[0].y, IMM[0].yyyy
 12:     BRK
 13:   ENDIF
 14:   MOV TEMP[1].x, IMM[1].yyyy
 15:   MOV TEMP[0].z, IMM[0].yyyy
 16:   BRK
 17: ENDLOOP :0
 18: MOV OUT[0], TEMP[0]
 19: END

===== SHADER #13 ======================================== PS/CAYMAN/CAYMAN =====
===== 72 dw ===== 6 gprs ===== 2 stack =========================================
0000 00000012 a0100000  ALU 5 @36
0036 000000f8 00200c90     1    x: MOV R1.x, 0
0038 000000f8 20200c90          y: MOV R1.y, 0
0040 000000f8 40200c90          z: MOV R1.z, 0
0042 800000f8 60200c90          w: MOV R1.w, 0
0044 800000f8 00400c90     2    x: MOV R2.x, 0
0002 0000000f 81800000  LOOP_START_DX10 @30
0004 40000017 a4040000  ALU_PUSH_BEFORE 2 @46 KC0[CB0:0-15]
0046 809f6080 0043c002     3    x: CNDGE_INT R2.x, KC0[0].x, -1, R2.x
0048 801f00fe 00a0229c     4 MP x: PRED_SETNE_INT R5.x, PV.x, 0
0006 00000007 82800001  JUMP @14 POP:1
0008 00000019 a0000000  ALU 1 @50
0050 800004f9 00200c90     5    x: MOV R1.x, 1.0
0010 0000000e 82400000  LOOP_BREAK @28
0012 00000007 83800001  POP @14 POP:1
0014 4000001a a4080000  ALU_PUSH_BEFORE 3 @52 KC0[CB0:0-15]
0052 801000fa 00601d10     6    x: SETE_INT R3.x, 1, KC0[0].x
0054 800040fe 0043c4fb     7    x: CNDGE_INT R2.x, PV.x, R2.x, -1
0056 801f00fe 00a0229c     8 MP x: PRED_SETNE_INT R5.x, PV.x, 0
0016 0000000c 82800001  JUMP @24 POP:1
0018 0000001d a0000000  ALU 1 @58
0058 800004f9 20200c90     9    y: MOV R1.y, 1.0
0020 0000000e 82400000  LOOP_BREAK @28
0022 0000000c 83800001  POP @24 POP:1
0024 0000001e a0040000  ALU 2 @60
0060 000004fb 00400c90    10    x: MOV R2.x, -1
0062 800004f9 40200c90          z: MOV R1.z, 1.0
0026 0000000e 82400000  LOOP_BREAK @28
0028 00000002 81400000  LOOP_END @4
0030 00000020 a00c0000  ALU 4 @64
0064 00000001 00000c90    11    x: MOV R0.x, R1.x
0066 00000401 20000c90          y: MOV R0.y, R1.y
0068 00000801 40000c90          z: MOV R0.z, R1.z
0070 80000c01 60000c90          w: MOV R0.w, R1.w
0032 c0000000 95000688  EXPORT_DONE PIXEL 0 R0.xyzw
0034 00000000 88000000  CF_END @0
===== SHADER_END ===============================================================

===== SHADER #13 OPT ==================================== PS/CAYMAN/CAYMAN =====
===== 62 dw ===== 1 gprs ===== 2 stack =========================================
0000 40000011 a0080000  ALU 3 @34 KC0[CB0:0-15]
0034 001000fa 0f801d10     1    x: SETE_INT T0.x, 1, KC0[0].x
0036 801f6080 2003c0f8          y: CNDGE_INT R0.y, KC0[0].x, -1, 0
0038 8080007c 4003c0fb     2    z: CNDGE_INT R0.z, T0.x, R0.y, -1
0002 0000000f 81800000  LOOP_START_DX10 @30
0004 00000014 a4000000  ALU_PUSH_BEFORE 1 @40
0040 801f0400 00002284     3 M  x: PRED_SETNE_INT __.x, R0.y, 0
0006 00000007 82800001  JUMP @14 POP:1
0008 00000015 a0080000  ALU 3 @42
0042 000000f9 00000c90     4    x: MOV R0.x, 1.0
0044 000000f8 20000c90          y: MOV R0.y, 0
0046 800000f8 40000c90          z: MOV R0.z, 0
0010 0000000e 82400000  LOOP_BREAK @28
0012 00000007 83800001  POP @14 POP:1
0014 00000018 a4000000  ALU_PUSH_BEFORE 1 @48
0048 801f0800 00002284     5 M  x: PRED_SETNE_INT __.x, R0.z, 0
0016 0000000c 82800001  JUMP @24 POP:1
0018 00000019 a0080000  ALU 3 @50
0050 000000f8 00000c90     6    x: MOV R0.x, 0
0052 000000f9 20000c90          y: MOV R0.y, 1.0
0054 800000f8 40000c90          z: MOV R0.z, 0
0020 0000000e 82400000  LOOP_BREAK @28
0022 0000000c 83800001  POP @24 POP:1
0024 0000001c a0080000  ALU 3 @56
0056 000000f8 00000c90     7    x: MOV R0.x, 0
0058 000000f8 20000c90          y: MOV R0.y, 0
0060 800000f9 40000c90          z: MOV R0.z, 1.0
0026 0000000e 82400000  LOOP_BREAK @28
0028 00000002 81400000  LOOP_END @4
0030 c0000000 95000888  EXPORT_DONE PIXEL 0 R0.xyz0
0032 00000000 88000000  CF_END @0
===== SHADER_END ===============================================================

Now I suspect it fails here because the stack depth is incorrectly
calculated, though there is a chance this may be a cayman specific issue
and the stack depth is just calculated wrong always.

Dave.
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
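
As a rough way to reason about the stack-depth suspicion, the sketch below walks the control-flow instructions of the optimized listing and tracks how many loop and push frames are open at once. It is a simplified model under stated assumptions, not the driver's calculation: every LOOP_START_DX10 and ALU_PUSH_BEFORE is assumed to open one frame, POP and LOOP_END are assumed to close frames, and the pop count on JUMP is ignored because it is assumed to apply only when the jump is taken. The name `max_nesting_depth` and the `CF_STREAM` table are made up for this example.

```python
# Simplified nesting-depth model (an assumption, not r600g/sb code).
# Each tuple is (CF opcode, pop count), transcribed from the optimized listing.
CF_STREAM = [
    ("ALU", 0),
    ("LOOP_START_DX10", 0),
    ("ALU_PUSH_BEFORE", 0),
    ("JUMP", 1),
    ("ALU", 0),
    ("LOOP_BREAK", 0),
    ("POP", 1),
    ("ALU_PUSH_BEFORE", 0),
    ("JUMP", 1),
    ("ALU", 0),
    ("LOOP_BREAK", 0),
    ("POP", 1),
    ("ALU", 0),
    ("LOOP_BREAK", 0),
    ("LOOP_END", 0),
    ("EXPORT_DONE", 0),
    ("CF_END", 0),
]

OPENS_FRAME = ("LOOP_START_DX10", "ALU_PUSH_BEFORE")


def max_nesting_depth(cf_stream):
    """Return the largest number of simultaneously open frames."""
    depth = 0
    deepest = 0
    for opcode, pop_count in cf_stream:
        if opcode in OPENS_FRAME:
            depth += 1
        elif opcode == "POP":
            depth -= pop_count
        elif opcode == "LOOP_END":
            depth -= 1
        # JUMP and LOOP_BREAK leave the fall-through depth unchanged here.
        if depth < 0:
            raise ValueError("more frames closed than opened at " + opcode)
        deepest = max(deepest, depth)
    if depth != 0:
        raise ValueError("unbalanced control flow, final depth %d" % depth)
    return deepest


if __name__ == "__main__":
    print(max_nesting_depth(CF_STREAM))
```

This counts nesting levels only. The number of hardware stack entries a chip actually needs may differ per generation, so the result should not be read as the value the driver ought to report.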