On Mon, Jan 27, 2014 at 04:43:14PM +0900, Michel Dänzer wrote:
> On Fri, 2014-01-24 at 07:40 -0800, Tom Stellard wrote:
> > On Fri, Jan 24, 2014 at 01:27:00PM +0900, Michel Dänzer wrote:
> > > From: Michel Dänzer <michel.daen...@amd.com>
> > >
> > > Fixes half a dozen piglit tests with radeonsi.
> > >
> > > Signed-off-by: Michel Dänzer <michel.daen...@amd.com>
> > > ---
> > >  lib/Target/R600/SIInstructions.td |  5 +++++
> > >  test/CodeGen/R600/trunc.ll        | 10 ++++++++++
> > >  2 files changed, 15 insertions(+)
> > >
> > > diff --git a/lib/Target/R600/SIInstructions.td
> > > b/lib/Target/R600/SIInstructions.td
> > > index 03e7e32..b7b710f 100644
> > > --- a/lib/Target/R600/SIInstructions.td
> > > +++ b/lib/Target/R600/SIInstructions.td
> > > @@ -2126,6 +2126,11 @@ def : Pat <
> > >    (EXTRACT_SUBREG $a, sub0)
> > >  >;
> > >
> > > +def : Pat <
> > > +  (i1 (trunc i32:$a)),
> > > +  (V_CMP_EQ_I32_e64 (V_AND_B32_e32 (i32 1), $a), 1)
> > > +>;
> > > +
> >
> > I'm guessing you added V_CMP_EQ_I32_e64 in order to make the types match.
>
> Not really. The truncation is used for testing whether the LSB of an i32
> value is set, and storing the resulting boolean as an i1 value. My
> pattern does this for the VGPRs in all threads of a wavefront in
> parallel, storing the resulting boolean bits in a 64-bit SGPR.
>
Ok, I didn't realize this pattern was meant to be used with control flow
instructions. The pattern is fine as is. The patch is:

Reviewed-by: Tom Stellard <thomas.stell...@amd.com>

>
> > Try this pattern instead:
> >
> > def : Pat <
> >   (i1 (trunc i32:$a)),
> >   (COPY_TO_REGCLASS (V_AND_B32_e32 (i32 1), $a), VReg_32)
>
> I don't understand the idea behind your suggestion, can you elaborate?
>

Without the COPY_TO_REGCLASS, LLVM tablegen will complain because the
output register class of V_AND_B32_e32 (VReg_32) does not support i1
types. Adding the COPY_TO_REGCLASS allows tablegen to accept the pattern,
because COPY_TO_REGCLASS is untyped.

> Anyway, it fails for one of the relevant piglit tests:
>
> amd_vertex_shader_layer-layered-2d-texture-render:
> /home/daenzer/src/llvm-git/llvm/lib/Target/R600/SIInstrInfo.cpp:133: virtual
> void llvm::SIInstrInfo::copyPhysReg(llvm::MachineBasicBlock&,
> llvm::MachineBasicBlock::iterator, llvm::DebugLoc, unsigned int, unsigned
> int, bool) const: Assertion `AMDGPU::VReg_64RegClass.contains(SrcReg) ||
> AMDGPU::SReg_64RegClass.contains(SrcReg)' failed.
>
>
> --
> Earthling Michel Dänzer            |                  http://www.amd.com
> Libre software enthusiast          |                Mesa and X developer

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
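
For readers following the thread, here is a rough sketch of what the
accepted pattern computes, as described above: AND each lane's i32 value
with 1, compare the result with 1, and collect one boolean bit per lane
into a 64-bit mask. This is only a Python model of the semantics, not
LLVM or hardware code. The helper names (v_and_b32, v_cmp_eq_i32,
trunc_i32_to_i1) and the broadcast of the constant 1 to every lane are
assumptions made for illustration.

```python
# Illustrative model only: lanes are list entries, the 64-bit SGPR is an int.
WAVEFRONT_SIZE = 64  # one result bit per lane fits a 64-bit scalar register


def v_and_b32(src0, src1):
    """Per-lane 32-bit bitwise AND (hypothetical stand-in for V_AND_B32)."""
    return [(a & b) & 0xFFFFFFFF for a, b in zip(src0, src1)]


def v_cmp_eq_i32(src0, src1):
    """Per-lane equality compare; bit N of the result is set if lane N matched."""
    mask = 0
    for lane, (a, b) in enumerate(zip(src0, src1)):
        if a == b:
            mask |= 1 << lane
    return mask


def trunc_i32_to_i1(vgpr):
    """Model of (i1 (trunc i32:$a)): test the LSB of every lane in parallel."""
    ones = [1] * len(vgpr)  # the constant 1, broadcast to every lane
    return v_cmp_eq_i32(v_and_b32(ones, vgpr), ones)


# Lane N holds the value N, so exactly the odd-numbered lanes have the LSB set.
values = list(range(WAVEFRONT_SIZE))
mask = trunc_i32_to_i1(values)
print(hex(mask))
```

In this model the result is a lane mask rather than a per-lane 0/1 value,
which is the form Michel describes the pattern as storing in a 64-bit SGPR.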