On 26/04/2019 15:13, Jeff Law wrote: > On 4/16/19 10:29 AM, Steve Ellcey wrote: >> Re-ping. I know there are discussions about bigger changes to fix the >> various failures listed in PR rtl-optimization/87763 but this patch >> at least fixes one of them (gcc.target/aarch64/lsl_asr_sbfiz.c). >> >> Steve Ellcey >> sell...@marvell.com > So here's my take on fixing the lsl_asr_sbfix.c regression. > > As I mentioned earlier the problem is the aarch64 is using the old way > of describing bitfield extractions (extv, extzv). Specifically it > defined a single expander that operated on the natural word mode (DImode). > > This forces the input operand to be used in DImode as well. So we get > those annoying subregs in the expressions that combine generates. But > there's a better way :-) > > The new way to handle this stuff is to define expanders for supported > modes (SI and DI for aarch64). Interestingly enough the aarch64 port > already does this for bitfield insertions via insv. > > So I made the obvious changes to the extv/extzv expander. That fixes > the lsl_asr_sbfiz regression. But doesn't bootstrap. The availability > of the new expander makes extract_bit_field_using_extv want to extract a > field from an HFmode object and shove it into an SImode. That runs > afoul of checks in validate_subreg (as it should). We shouldn't be > using subregs to change the size of a floating point object, so that > needs to be filtered out in extract_bit_field_using_extv. > > That fixes the bootstrap issue. But regression testing trips over > ashtidisi.c. That can be easily fixed by allowing zero/sign extractions > which change size. ie: > > (set (reg:DI) (sign_extract:DI (reg:SI) ...))) > > Which seems like a reasonable thing to handle. > > So here's what I've got. I've bootstrapped and regression tested on > aarch64. It's also bootstrapped on aarch64_be for good measure. > > OK (from aarch64 maintainers) for the gcc-9 branch and trunk? Or we > could just address on the trunk for gcc-10. I don't have strong > preferences either way. > > Jeff > > ps. Time to return insv regressions and address Segher's comments :-) > > > > P > > * aarch64.md (extv, extzv expander): Generalize so that it works > for both SImode and DImode. > (extv_3264, extzv_3264): New pattern to handle extractions with > size change between the input and output operand. > * expmed.c (extract_bitfield_using_extv): Do not allow changing > sizes of floating point objects using SUBREGs. > > diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md > index 5a1894063a1..13e2bca05a1 100644 > --- a/gcc/config/aarch64/aarch64.md > +++ b/gcc/config/aarch64/aarch64.md > @@ -5390,16 +5390,16 @@ > ;; Bitfields > ;; ------------------------------------------------------------------- > > -(define_expand "<optab>" > - [(set (match_operand:DI 0 "register_operand" "=r") > - (ANY_EXTRACT:DI (match_operand:DI 1 "register_operand") > +(define_expand "<optab><mode>" > + [(set (match_operand:GPI 0 "register_operand" "=r") > + (ANY_EXTRACT:GPI (match_operand:GPI 1 "register_operand") > (match_operand 2 > - "aarch64_simd_shift_imm_offset_di") > - (match_operand 3 "aarch64_simd_shift_imm_di")))] > + "aarch64_simd_shift_imm_offset_<mode>") > + (match_operand 3 "aarch64_simd_shift_imm_<mode>")))] > "" > { > if (!IN_RANGE (INTVAL (operands[2]) + INTVAL (operands[3]), > - 1, GET_MODE_BITSIZE (DImode) - 1)) > + 1, GET_MODE_BITSIZE (<MODE>mode) - 1)) > FAIL; > } > ) > @@ -5418,6 +5418,21 @@ > [(set_attr "type" "bfx")] > ) > > +;; Similar to the previous pattern, but 32->64 extraction > +(define_insn "*<optab>_3264" > + [(set (match_operand:DI 0 "register_operand" "=r") > + (ANY_EXTRACT:DI (match_operand:SI 1 "register_operand" "r")
So is that valid RTL (DI extract of SI value)? Why wouldn't that just use a paradoxical subreg to widen the register being extracted? R. > + (match_operand 2 > + "aarch64_simd_shift_imm_offset_si" "n") > + (match_operand 3 > + "aarch64_simd_shift_imm_si" "n")))] > + "IN_RANGE (INTVAL (operands[2]) + INTVAL (operands[3]), > + 1, GET_MODE_BITSIZE (DImode) - 1)" > + "<su>bfx\\t%x0, %x1, %3, %2" > + [(set_attr "type" "bfx")] > +) > + > + > ;; When the bit position and width add up to 32 we can use a W-reg LSR > ;; instruction taking advantage of the implicit zero-extension of the X-reg. > (define_split > diff --git a/gcc/expmed.c b/gcc/expmed.c > index d7f8e9a5d76..ce8f9595b9a 100644 > --- a/gcc/expmed.c > +++ b/gcc/expmed.c > @@ -1544,7 +1544,14 @@ extract_bit_field_using_extv (const extraction_insn > *extv, rtx op0, > mode. Instead, create a temporary and use convert_move to set > the target. */ > if (REG_P (target) > - && TRULY_NOOP_TRUNCATION_MODES_P (GET_MODE (target), ext_mode)) > + && TRULY_NOOP_TRUNCATION_MODES_P (GET_MODE (target), ext_mode) > + /* We can't change the size of float mode subregs (see > + validate_subreg). So if either mode is a floating point > + mode and the sizes are not equal, then reject doing this > + via gen_lowpart. */ > + && !((FLOAT_MODE_P (GET_MODE (target)) || FLOAT_MODE_P (ext_mode)) > + && maybe_ne (GET_MODE_BITSIZE (GET_MODE (target)), > + GET_MODE_BITSIZE (ext_mode)))) > { > target = gen_lowpart (ext_mode, target); > if (partial_subreg_p (GET_MODE (spec_target), ext_mode)) >