On 26/04/2019 17:08, Jeff Law wrote: > On 4/26/19 8:52 AM, Richard Earnshaw (lists) wrote: >> On 26/04/2019 15:13, Jeff Law wrote: >>> On 4/16/19 10:29 AM, Steve Ellcey wrote: >>>> Re-ping. I know there are discussions about bigger changes to fix the >>>> various failures listed in PR rtl-optimization/87763 but this patch >>>> at least fixes one of them (gcc.target/aarch64/lsl_asr_sbfiz.c). >>>> >>>> Steve Ellcey >>>> sell...@marvell.com >>> So here's my take on fixing the lsl_asr_sbfix.c regression. >>> >>> As I mentioned earlier the problem is the aarch64 is using the old way >>> of describing bitfield extractions (extv, extzv). Specifically it >>> defined a single expander that operated on the natural word mode (DImode). >>> >>> This forces the input operand to be used in DImode as well. So we get >>> those annoying subregs in the expressions that combine generates. But >>> there's a better way :-) >>> >>> The new way to handle this stuff is to define expanders for supported >>> modes (SI and DI for aarch64). Interestingly enough the aarch64 port >>> already does this for bitfield insertions via insv. >>> >>> So I made the obvious changes to the extv/extzv expander. That fixes >>> the lsl_asr_sbfiz regression. But doesn't bootstrap. The availability >>> of the new expander makes extract_bit_field_using_extv want to extract a >>> field from an HFmode object and shove it into an SImode. That runs >>> afoul of checks in validate_subreg (as it should). We shouldn't be >>> using subregs to change the size of a floating point object, so that >>> needs to be filtered out in extract_bit_field_using_extv. >>> >>> That fixes the bootstrap issue. But regression testing trips over >>> ashtidisi.c. That can be easily fixed by allowing zero/sign extractions >>> which change size. ie: >>> >>> (set (reg:DI) (sign_extract:DI (reg:SI) ...))) >>> >>> Which seems like a reasonable thing to handle. >>> >>> So here's what I've got. I've bootstrapped and regression tested on >>> aarch64. It's also bootstrapped on aarch64_be for good measure. >>> >>> OK (from aarch64 maintainers) for the gcc-9 branch and trunk? Or we >>> could just address on the trunk for gcc-10. I don't have strong >>> preferences either way. >>> >>> Jeff >>> >>> ps. Time to return insv regressions and address Segher's comments :-) >>> >>> >>> >>> P >>> >>> * aarch64.md (extv, extzv expander): Generalize so that it works >>> for both SImode and DImode. >>> (extv_3264, extzv_3264): New pattern to handle extractions with >>> size change between the input and output operand. >>> * expmed.c (extract_bitfield_using_extv): Do not allow changing >>> sizes of floating point objects using SUBREGs. >>> >>> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md >>> index 5a1894063a1..13e2bca05a1 100644 >>> --- a/gcc/config/aarch64/aarch64.md >>> +++ b/gcc/config/aarch64/aarch64.md >>> @@ -5390,16 +5390,16 @@ >>> ;; Bitfields >>> ;; ------------------------------------------------------------------- >>> >>> -(define_expand "<optab>" >>> - [(set (match_operand:DI 0 "register_operand" "=r") >>> - (ANY_EXTRACT:DI (match_operand:DI 1 "register_operand") >>> +(define_expand "<optab><mode>" >>> + [(set (match_operand:GPI 0 "register_operand" "=r") >>> + (ANY_EXTRACT:GPI (match_operand:GPI 1 "register_operand") >>> (match_operand 2 >>> - "aarch64_simd_shift_imm_offset_di") >>> - (match_operand 3 "aarch64_simd_shift_imm_di")))] >>> + "aarch64_simd_shift_imm_offset_<mode>") >>> + (match_operand 3 "aarch64_simd_shift_imm_<mode>")))] >>> "" >>> { >>> if (!IN_RANGE (INTVAL (operands[2]) + INTVAL (operands[3]), >>> - 1, GET_MODE_BITSIZE (DImode) - 1)) >>> + 1, GET_MODE_BITSIZE (<MODE>mode) - 1)) >>> FAIL; >>> } >>> ) >>> @@ -5418,6 +5418,21 @@ >>> [(set_attr "type" "bfx")] >>> ) >>> >>> +;; Similar to the previous pattern, but 32->64 extraction >>> +(define_insn "*<optab>_3264" >>> + [(set (match_operand:DI 0 "register_operand" "=r") >>> + (ANY_EXTRACT:DI (match_operand:SI 1 "register_operand" "r") >> >> So is that valid RTL (DI extract of SI value)? Why wouldn't that just >> use a paradoxical subreg to widen the register being extracted? > [ ... ] > And for completeness, the patch also survived regression testing > aarch64be where it fixes the lsl_asr_sbfiz test. > > I think the big question here is whether or not we consider this valid RTL. >
The documentation for sign_extract says: @item (sign_extract:@var{m} @var{loc} @var{size} @var{pos}) ... If @var{loc} is in memory, its mode must be a single-byte integer mode. If @var{loc} is in a register, the mode to use is specified by the operand of the @code{insv} or @code{extv} pattern (@pxref{Standard Names}) and is usually a full-word integer mode, which is the default if none is specified. So it's a little unclear to me whether the mode of loc is ignored for registers, or if it must match the mode of the extract. It would be a bit perverse if the mode on loc were completely ignored. What would it mean if the register was a floating-point or vector mode? If we do allow a mode mismatch, should the extract be wider than the contained register, or can it also be narrower (subject to the range information being compatible with both the source and the destination)? R. > jeff >