On Jul 3, 2007, at 2:13 PM, Dan Gohman wrote:
>>> We overload ISD::FADD and quite a lot of others. Why not
>>> ISD::ConstantFP too?
>>
>> Fair enough, after pondering on it, I agree with you. The proposed
>> semantics are that a ConstantFP (and also a normal Constant?) produce
>> the splatted immediate value?
>
> Constant sounds good too. And UNDEF, for that matter. And yes, that's
> the semantics I mean.

Ok, makes sense. I think we already use UNDEF for vectors.

>> Please add a dag combine xform from build_vector [c,c,c,c] ->
>> constantfp and friends.
>
> I sketched out some of the code for this. One question that's come up
> so far is whether if the vector has some undef elements but all the
> non-undef elements are equal it should still be folded. My initial
> preference is to still fold it, since that lets things like
> isBuildVectorAllZeros become trivial to unnecessary, but it is a
> pessimization in some obscure cases.

I'm not sure about it. One specific issue is with shuffle masks, which
we want to retain the undef element values for. I don't think there is a
good way to retain shuffle masks but not other build vectors, so we
probably need to keep the individual undef elements in it.

isBuildVectorAllZeros and friends are another issue. To me there are
actually two issues that should be resolved at some point:

1. Vector constant and shuffle mask matching code is crazily complex,
particularly in the x86 backend. For vector constants, this is only
slightly annoying. For vector shuffle masks, the selected shuffles are
currently whatever is best for Yonah, and it's not really possible to
prefer different shuffles on different subtargets. We really want to add
a layer of abstraction in the shuffle/constant matching code, which
would make the undef handling stuff happen implicitly. Making a more
declarative description of the various masks would make it much easier
to maintain, understand, and debug.

2. The x86 backend specifically has a problem with the way it selects
vector constants (I think this is in the readme). In particular, if you
have a 4 x f32 and a 4 x i32 zero vector, you'll get two different pxor
instructions, because they are of different type. There are two
different ways to solve this problem:

The easy answer is to do what the ppc backend does.
It always selects zero (and -1) vectors to 4 x i32 IIRC, and then does a
bitcast to the desired type if needed. This ensures that the constant
vectors always get CSE'd. The tricky part of this is to ensure that the
0/-1 vectors still get folded if you have operations (like ~) that
require one of these as an operand. This ugliness is why we have "vnot"
and "vnot_conv" and have to duplicate patterns.

The better fix is to change the way the select phase produces code. In
particular, the reason these two zero vectors don't get CSE'd after
selection is that they have two different value types, and the autocse
stuff doesn't "know" that the two VTs end up in the same register class.
To solve this, it seems like we can add a new MVT type, where a certain
range of MVTs (128-255?) corresponds to register class IDs. At selection
time, instead of giving the new nodes their old MVTs, they would get new
MVTs that correspond to the regclass of the result (ok, we'd keep
MVT::Other, MVT::Flag and maybe some others).

This makes the scheduler slightly simpler (because it doesn't need to
map MVT -> regclass anymore), and opens up future possibilities. In
particular, it lets us fix a long-standing class of issues where we
can't have fp stack and SSE registers around at the same time, both with
MVT::f32 or f64 type. The current scheduler can only map f32 to one
register class (thus, it can't keep the distinction) but with this
change the select pass can pick any regclass it wants.

Anyway, this is a bit of a crazy tangent, but I think undefs in
buildvector should probably stay :)

-Chris

_______________________________________________
llvm-commits mailing list
llvm-commits@cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
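
To make the undef question from this thread concrete, here is a minimal
standalone C++ sketch of the two folding policies for build_vector. It
is an illustration only and does not use the SelectionDAG API: Elt,
UndefPolicy and trySplatFold are hypothetical names, std::nullopt stands
in for an undef element, and a plain int stands in for the constant.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical stand-in for one build_vector operand (not LLVM API):
// std::nullopt models an undef element, a value models a constant.
using Elt = std::optional<int>;

// The two options discussed in the thread.
enum class UndefPolicy {
  KeepUndefs,        // any undef element blocks the fold
  FoldThroughUndefs  // undef elements are ignored when checking equality
};

// Returns the splat value if the elements may be folded to one splatted
// constant under `policy`; returns std::nullopt if the build_vector
// should be left alone. An all-undef vector also yields std::nullopt
// here, since it would become UNDEF rather than a constant.
std::optional<int> trySplatFold(const std::vector<Elt> &elts,
                                UndefPolicy policy) {
  assert(!elts.empty() && "a build_vector has at least one element");
  std::optional<int> splat;
  for (const Elt &e : elts) {
    if (!e) {
      if (policy == UndefPolicy::KeepUndefs)
        return std::nullopt;  // keep the per-element undef information
      continue;               // otherwise skip the undef element
    }
    if (splat && *splat != *e)
      return std::nullopt;    // two different constants: not a splat
    splat = *e;
  }
  return splat;
}
```

With FoldThroughUndefs, a vector such as [0, undef, 0, 0] folds to a
zero splat, which is what would make isBuildVectorAllZeros trivial. With
KeepUndefs the same vector is left as a build_vector, so a shuffle mask
keeps its undef lanes.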