On 5/10/2022 12:30 AM, Richard Biener wrote:
On Tue, May 10, 2022 at 12:58 AM Jeff Law via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
On 5/5/2022 2:26 AM, Richard Biener via Gcc-patches wrote:
On Thu, May 5, 2022 at 7:04 AM liuhongt <hongtao....@intel.com> wrote:
Optimize
_1 = *srcp_3(D);
_4 = VEC_PERM_EXPR <_1, _1, { 4, 5, 6, 7, 4, 5, 6, 7 }>;
_5 = BIT_FIELD_REF <_4, 128, 0>;
to
_1 = *srcp_3(D);
_5 = BIT_FIELD_REF <_1, 128, 128>;
the upper will finally be optimized to
_5 = BIT_FIELD_REF <*srcp_3(D), 128, 128>;
Bootstrapped and regtested on x86_64-pc-linux-gnu{m32,}.
Ok for trunk?
Hmm, tree-ssa-forwprop.cc:simplify_bitfield_ref should already
handle this in the
if (code == VEC_PERM_EXPR
&& constant_multiple_p (bit_field_offset (op), size, &idx))
{
part of the code - maybe that needs to be enhanced to cover
a contiguous stride in the VEC_PERM_EXPR. I see
we have
size = TREE_INT_CST_LOW (TYPE_SIZE (elem_type));
if (maybe_ne (bit_field_size (op), size))
return false;
where it will currently bail, so adjust that to check for a
constant multiple. I also think we should only handle the
case where the new bit_field_offset alignment is not
worse than the original one.
That said, I'd prefer if you integrate this transform with
simplify_bitfield_ref.
I've got a hack here that tries to do something similar, but it's trying
to catch the case where we CONSTRUCTOR feeds the BIT_FIELD_REF. It
walks the CONSTRUCTOR elements to see if an element has the right
offset/size to satisify the BIT_FIELD_REF. For x264 we're often able to
eliminate the VEC_PERMUTE entirely and just forward operands into the
BIT_FIELD_REF.
I was leaning towards moving those bits into match.pd before submitting,
but if you'd prefer them in tree-ssa-forwprop, that's even easier.
I think when deciding where to put things it's important to look where related
transforms reside. We already do have a (simplify (BIT_FIELD_REF
CONSTRUCTOR@ ...))
pattern which should also handle your case already. So instead of
adding something
new it would be nice to figure why it doesn't handle the case you are
interested in and
eventually just adjust the existing pattern.
I'm aware of that pattern. I've found it painfully inadequate in every
case I've looked at. In general I've found tree-ssa-forwprop is a
reasonable place to prototype a lot of stuff to see how it works in
practice, but I think match.pd is better for most of the transformations
in the long term.
It sounds like you'd prefer this particular case to move into match.pd.
Fine. That's what I'd originally planned to do. It's pretty simple
support code, so doing it in match.pd shouldn't be too hard.
Jeff