On Tue, Jun 21, 2016 at 03:14:51PM -0500, Bill Schmidt wrote:
> I discovered recently that, with -mcpu=power9, an attempt to generate a
> vspltish instruction resulted instead in an xxspltib followed by a vupkhsb.
> This is semantically correct but the extra instruction is not optimal. I
> found that there was some logic in xxspltib_constant_p to do special casing
> for const_vector with small constants, but not for vec_duplicate with small
> constants. This patch duplicates that logic so we can generate the single
> instruction when possible.

This part is okay.

> When I did this, I ran into a problem with an existing test case. We end up
> matching the *vsx_splat_v4si_internal pattern instead of falling back to the
> altivec_vspltisw pattern. The constraints don't match for constant input.
> To avoid this, I added a pattern ahead of this one that will match for VMX
> output registers and produce the vspltisw as desired. This corrected the
> failing test and produces the expected code.

Why does the predicate allow constant input, while the constraints do not?

> I've added a test case to demonstrate the code works properly now in the
> usual case.

Thanks :-)


Segher