On Fri, Nov 14, 2014 at 1:38 PM, Henri Verbeet <hverb...@gmail.com> wrote:
> On 14 November 2014 18:50, Ilia Mirkin <imir...@alum.mit.edu> wrote:
>> I can't speak for the radeon guys, but I know I sure would love to see
>> any reports of poor code being generated by nouveau in response to
>> legitimate-seeming TGSI (or GLSL). In some cases, a simple
>> optimization can be added to take care of it, and I'd definitely
>> appreciate the extra pair of eyeballs on driver-generated code :)
>>
>> The report can be as simple as "here is the TGSI snippet, take a look
>> at how crappy the code it generates is". At least for nouveau, I can
>> feed that directly into a compiler that can target any of the relevant
>> backends.
>>
>> [Note, r600g didn't have an optimizer enabled until ~1y ago; not sure
>> if your analysis was with or without sb.]
>>
> It was with sb, but probably before TGSI got FSLT/FSGE/etc.
>
> For reference, what currently happens for r600g is something like this:
>
> D3D:
>     cnd r[0], r[0].w, c[1], c[2]
>
> GLSL:
>     R0.xyzw = (R0.w > 0.5 ? ps_c[1].xyzw : ps_c[2].xyzw);
>
> TGSI:
>     FSLT TEMP[0].x, IMM[0].xxxx, TEMP[0].xxxx
>     UIF TEMP[0].xxxx :0
>       MOV TEMP[0], CONST[1]
>     ELSE :0
>       MOV TEMP[0], CONST[2]
>     ENDIF
>
> R600:
>     SETGE_DX10 T0.x, 0.5, T0.x
>     CNDE_INT R0.x, T0.x, KC0[1].x, KC0[2].x
>     CNDE_INT R0.y, T0.x, KC0[1].y, KC0[2].y
>     CNDE_INT R0.z, T0.x, KC0[1].z, KC0[2].z
>     CNDE_INT R0.w, T0.x, KC0[1].w, KC0[2].w
>
> While ideally that would just be 4 CNDGE's, that's better than what I
> remember. IIRC there used to be a bunch of int/float conversions as
> well.
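As a sanity check on the quoted lowering, here is a rough Python model of the
three stages (D3D cnd, the TGSI FSLT/UIF form, and the r600 SETGE_DX10 +
CNDE_INT form). It is only a sketch: the function names are made up for
illustration, the compare is assumed to read the .w component as in the
D3D/GLSL lines (the TGSI listing swizzles .x), the immediate is assumed to be
0.5, and NaN inputs are not modeled.

```python
ALL_ONES = 0xFFFFFFFF  # DX10-style boolean "true"


def d3d_cnd(r0, c1, c2):
    # cnd r0, r0.w, c1, c2  ->  r0 = (r0.w > 0.5) ? c1 : c2
    return list(c1) if r0[3] > 0.5 else list(c2)


def tgsi_lowering(r0, c1, c2, imm=0.5):
    # FSLT: integer all-ones if imm < src, else 0 (imm assumed to be 0.5)
    cond = ALL_ONES if imm < r0[3] else 0
    # UIF: take the first branch when the integer condition is non-zero
    return list(c1) if cond != 0 else list(c2)


def r600_lowering(r0, c1, c2):
    # SETGE_DX10 T0.x, 0.5, src: all-ones if 0.5 >= src, else 0
    t0 = ALL_ONES if 0.5 >= r0[3] else 0
    # CNDE_INT dst, src0, src1, src2: dst = src1 if src0 == 0 else src2,
    # issued once per channel
    return [c1[i] if t0 == 0 else c2[i] for i in range(4)]


if __name__ == "__main__":
    c1 = [1.0, 2.0, 3.0, 4.0]
    c2 = [5.0, 6.0, 7.0, 8.0]
    for w in (-1.0, 0.0, 0.5, 0.75, 2.0):
        r0 = [0.0, 0.0, 0.0, w]
        same = d3d_cnd(r0, c1, c2) == tgsi_lowering(r0, c1, c2) == r600_lowering(r0, c1, c2)
        print(w, same)
```

The inverted compare (SETGE instead of "greater than") is what lets CNDE_INT,
which selects its first source on zero, pick c[1] when the original condition
holds.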

In the future, a full TGSI program would be preferred, since then it
can just be fed in... for this one (with a few assumptions baked in
about the immediate, where TEMP[0] comes from, etc), targeted to nvc0
(GF100):

00000000: fff01c06 06000000 ld b32 $r0 a[0x0] 0x0 unk39
00000008: fc01dc00 220e0000 set $p0 0x1 gt f32 $r0 0x0
00000010: 43f000c6 14000000 $p0 ld b128 $r0q c0[0x10]
00000018: 83f020c6 14000000 (not $p0) ld b128 $r0q c0[0x20]
00000020: 03f01c66 0a7e0000 st b128 a[0x0] $r0:$r1:$r2:$r3 0x0 unk39

Which seems pretty reasonable.

  -ilia
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev