On Fri, 23 Jul 2021 at 15:02, Richard Earnshaw <richard.earns...@foss.arm.com> wrote: > > On 23/07/2021 08:04, Prathamesh Kulkarni via Gcc-patches wrote: > > On Thu, 22 Jul 2021 at 20:29, Richard Earnshaw > > <richard.earns...@foss.arm.com> wrote: > >> > >> > >> > >> On 22/07/2021 14:47, Prathamesh Kulkarni via Gcc-patches wrote: > >>> On Thu, 22 Jul 2021 at 17:28, Richard Earnshaw > >>> <richard.earns...@foss.arm.com> wrote: > >>>> > >>>> > >>>> > >>>> On 22/07/2021 12:32, Prathamesh Kulkarni wrote: > >>>>> On Thu, 22 Jul 2021 at 16:03, Richard Earnshaw > >>>>> <richard.earns...@foss.arm.com> wrote: > >>>>>> > >>>>>> > >>>>>> > >>>>>> On 22/07/2021 08:45, Prathamesh Kulkarni via Gcc-patches wrote: > >>>>>>> Hi, > >>>>>>> The attached patch removes calls to builtins from vshl_n intrinsics, > >>>>>>> and replacing them > >>>>>>> with left shift operator. The patch passes bootstrap+test on > >>>>>>> arm-linux-gnueabihf. > >>>>>>> > >>>>>>> Altho, I noticed, that the patch causes 3 extra registers to spill > >>>>>>> using << instead > >>>>>>> of the builtin for vshl_n.c. Could that be perhaps due to inlining of > >>>>>>> intrinsics ? > >>>>>>> Before patch, the shift operation was performed by call to > >>>>>>> __builtin_neon_vshl<type> (__a, __b) > >>>>>>> and now it's inlined to __a << __b, which might result in increased > >>>>>>> register pressure ? > >>>>>>> > >>>>>>> Thanks, > >>>>>>> Prathamesh > >>>>>>> > >>>>>> > >>>>>> > >>>>>> You're missing a ChangeLog for the patch. > >>>>> Sorry, updated in this patch. > >>>>>> > >>>>>> However, I'm not sure about this. The register shift form of VSHL > >>>>>> performs a right shift if the value is negative, which is UB if you > >>>>>> write `<<` instead. > >>>>>> > >>>>>> Have I missed something here? > >>>>> Hi Richard, > >>>>> According to this article: > >>>>> https://developer.arm.com/documentation/den0018/a/NEON-Intrinsics-Reference/Shift/VSHL-N > >>>>> For vshl_n, the shift amount is always in the non-negative range for > >>>>> all types. > >>>>> > >>>>> I tried using vshl_n_s32 (a, -1), and the compiler emitted following > >>>>> diagnostic: > >>>>> foo.c: In function ‘main’: > >>>>> foo.c:17:1: error: constant -1 out of range 0 - 31 > >>>>> 17 | } > >>>>> | ^ > >>>>> > >>>> > >>>> It does do that now, but that's because the intrinsic expansion does > >>>> some bounds checking; when you remove the call into the back-end > >>>> intrinsic that will no-longer happen. > >>>> > >>>> I think with this change various things are likely: > >>>> > >>>> - We'll no-longer reject non-immediate values, so users will be able to > >>>> write > >>>> > >>>> int b = 5; > >>>> vshl_n_s32 (a, b); > >>>> > >>>> which will expand to a vdup followed by the register form. > >>>> > >>>> - we'll rely on the front-end diagnosing out-of range shifts > >>>> > >>>> - code of the form > >>>> > >>>> int b = -1; > >>>> vshl_n_s32 (a, b); > >>>> > >>>> will probably now go through without any errors, especially at low > >>>> optimization levels. It may end up doing what the user wanted, but it's > >>>> definitely a change in behaviour - and perhaps worse, the compiler might > >>>> diagnose the above as UB and silently throw some stuff away. > >>>> > >>>> It might be that we need to insert some form of static assertion that > >>>> the second argument is a __builtin_constant_p(). > >>> Ah right, thanks for the suggestions! > >>> I tried the above example: > >>> int b = -1; > >>> vshl_n_s32 (a, b); > >>> and it compiled without any errors with -O0 after patch. > >>> > >>> Would it be OK to use _Static_assert (__builtin_constant_p (b)) to > >>> guard against non-immediate values ? > >>> > >>> With the following change: > >>> __extension__ extern __inline int32x2_t > >>> __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) > >>> vshl_n_s32 (int32x2_t __a, const int __b) > >>> { > >>> _Static_assert (__builtin_constant_p (__b)); > >>> return __a << __b; > >>> } > >>> > >>> the above example fails at -O0: > >>> ../armhf-build/gcc/include/arm_neon.h: In function ‘vshl_n_s32’: > >>> ../armhf-build/gcc/include/arm_neon.h:4904:3: error: static assertion > >>> failed > >>> 4904 | _Static_assert (__builtin_constant_p (__b)); > >>> | ^~~~~~~~~~~~~~ > >> > >> I've been playing with that but unfortunately it doesn't seem to work in > >> the way we want it to. For a complete test: > >> > >> > >> > >> typedef __simd64_int32_t int32x2_t; > >> > >> __extension__ extern __inline int32x2_t > >> __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) > >> vshl_n_s32 (int32x2_t __a, const int __b) > >> { > >> _Static_assert (__builtin_constant_p (__b), "Second argument must be > >> a litteral constant"); > >> return __a << __b; > >> } > >> > >> int32x2_t f (int32x2_t x, const int b) > >> { > >> return vshl_n_s32 (x, 1); > >> } > >> > >> At -O0 I get: > >> > >> test.c: In function ‘vshl_n_s32’: > >> test.c:7:3: error: static assertion failed: "Second argument must be a > >> litteral constant" > >> 7 | _Static_assert (__builtin_constant_p (__b), "Second argument > >> must be a litteral constant"); > >> | ^~~~~~~~~~~~~~ > >> > >> While at -O1 and above I get: > >> > >> > >> test.c: In function ‘vshl_n_s32’: > >> test.c:7:19: error: expression in static assertion is not constant > >> 7 | _Static_assert (__builtin_constant_p (__b), "Second argument > >> must be a litteral constant"); > >> | ^~~~~~~~~~~~~~~~~~~~~~~~~~ > >> > >> Which indicates that it doesn't consider __builtin_constant_p() to be a > >> constant expression :( > >> > >> So either I'm writing the static assertion incorrectly, or something > >> weird is going on. The most likely issue is that the static assertion > >> is being processed too early, before the function is inlined. > > Ah indeed. I wonder if we should add an attribute to parameter that it > > should be constant, > > and emit an error if the caller passes non-constant value ? > > sth like: > > void foo(int x __attribute__((runtime_constant))); > > and the front-end can then diagnose if the argument is > > __builtin_constant_p while type-checking call to foo. > > It's an interesting idea, it would have to be on the prototype, not on > the function declaration (except where that serves both purposes). We > might also want an optional range check on the value as well. > > I think a better name for the immediate would be literal_constant, which > is more in keeping with the semantics of the language. So: > > void foo(int x __attribute__((literal_constant (min_val, max_val))); Thanks for the suggestions! I will raise a RFC on gcc@ for literal_constant attribute.
Digging a bit into discrepancy in warnings: assertion failed at -O0 vs expression not constant at -O1+: The errors come from following hunk in c-parser.c:c_parser_static_assert_declaration_no_semi: if (TREE_CODE (value) != INTEGER_CST) { error_at (value_loc, "expression in static assertion is not constant"); return; } constant_expression_warning (value); if (integer_zerop (value)) { if (string) error_at (assert_loc, "static assertion failed: %E", string); else error_at (assert_loc, "static assertion failed"); } So at -O0, "value" is literal constant 0, while at -O1+, "value" is CALL_EXPR, which is why it seems to give different warnings at -O0 and -O1+. Thanks, Prathamesh > > R. > > > > > Thanks, > > Prathamesh > >> > >> R. > >> > >>> > >>> Thanks, > >>> Prathamesh > >>>> > >>>> R. > >>>> > >>>>> So, is the attached patch OK ? > >>>> > >>>>> > >>>>> Thanks, > >>>>> Prathamesh > >>>>>> > >>>>>> R. >