from:"Uros Bizjak"

Re: [PATCH] i386: Fix kshift intrinsics [PR93673]

2020-02-11 Thread Uros Bizjak

On Wed, Feb 12, 2020 at 7:33 AM Jakub Jelinek wrote: > > Hi! > > As mentioned in the PR, the intrinsics allow counts from 0 to 255, but > we actually reject values from 128 to 255. That is because QImode > CONST_INTs can be only -128 to 127. Fixed by using const_0_to_255_operand > and adjusting
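The range problem described in this message can be illustrated outside the compiler. What follows is a minimal sketch, not GCC source: `sext8`, `fits_qimode_const` and `fits_0_to_255` are hypothetical helper names. It assumes only what the message states, namely that an 8-bit constant is kept in the range -128 to 127, so a count such as 200 is not representable as-is even though the intrinsics allow 0 to 255.

```c
#include <stdint.h>

/* Sign-extend the low 8 bits of v, modelling how an 8-bit constant
   ends up in the -128..127 range (hypothetical helper).  */
int32_t sext8(int32_t v)
{
    return ((v & 0xff) ^ 0x80) - 0x80;
}

/* A value survives the 8-bit signed form only if it is in -128..127.  */
int fits_qimode_const(int32_t v)
{
    return sext8(v) == v;
}

/* The range the intrinsics are documented to accept.  */
int fits_0_to_255(int32_t v)
{
    return v >= 0 && v <= 255;
}
```

With these helpers, 200 passes `fits_0_to_255` but fails `fits_qimode_const`, which matches the mismatch the patch addresses by switching predicates.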

Re: [PATCH] i386: Fix up vec_extract_lo* patterns [PR93670]

2020-02-12 Thread Uros Bizjak

On Wed, Feb 12, 2020 at 10:27 AM Jakub Jelinek wrote: > > Hi! > > The VEXTRACT* insns have way too many different CPUID feature flags (ATT > syntax) > vextractf128 $imm, %ymm, %xmm/mem AVX > vextracti128 $imm, %ymm, %xmm/mem AVX2 > vextract{f,i}32x4 $imm, %ymm, %xmm/mem

Re: [PATCH] i386: Skip ENDBR32 at nested function entry

2020-02-13 Thread Uros Bizjak

On Wed, Feb 12, 2020 at 1:21 PM H.J. Lu wrote: > > On Mon, Feb 10, 2020 at 12:01 PM Uros Bizjak wrote: > > > > On Mon, Feb 10, 2020 at 8:53 PM H.J. Lu wrote: > > > > > > On Mon, Feb 10, 2020 at 11:40 AM Uros Bizjak wrote: > > > > > &

Re: [PATCH]Several intrinsic macros lack a closing parenthesis[PR93274]

2020-02-13 Thread Uros Bizjak

> Changelog > gcc/ >* config/i386/avx512vbmi2intrin.h >(_mm512_[,mask_,maskz_]shrdi_epi16, >_mm512_[,mask_,maskz_]shrdi_epi32, >_m512_[,mask_,maskz_]shrdi_epi64, >_mm512_[,mask_,maskz_]shldi_epi16, >_mm512_[,mask_,maskz_]shldi_epi32, >_m512_[,

Re: [PATCH]Several intrinsic macros lack a closing parenthesis[PR93274]

2020-02-13 Thread Uros Bizjak

On Thu, Feb 13, 2020 at 9:53 AM Jakub Jelinek wrote: > > On Thu, Feb 13, 2020 at 09:39:05AM +0100, Uros Bizjak wrote: > > > Changelog > > > gcc/ > > >* config/i386/avx512vbmi2intrin.h > > >(_mm512_[,mask_,maskz_]shrdi_epi16, >

Re: [PATCH] i386: Fix up _mm_mask_popcnt_epi [PR93696]

2020-02-13 Thread Uros Bizjak

On Thu, Feb 13, 2020 at 9:47 AM Jakub Jelinek wrote: > > Hi! > > As mentioned in the PR and as > https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=_mask_popcnt_epi > also documents, _mm*_popcnt_epi* intrinsics are consistent with all other > unary AVX512* intrinsics regarding argu

Re: [PATCH] i386: Also skip ENDBR32 at the target function entry

2020-02-13 Thread Uros Bizjak

On Thu, Feb 13, 2020 at 1:06 PM H.J. Lu wrote: > > On Thu, Feb 13, 2020 at 09:29:32AM +0100, Uros Bizjak wrote: > > On Wed, Feb 12, 2020 at 1:21 PM H.J. Lu wrote: > > > > > > On Mon, Feb 10, 2020 at 12:01 PM Uros Bizjak wrote: > > > > > > &g

Re: [PATCH] i386: Also skip ENDBR32 at the target function entry

2020-02-13 Thread Uros Bizjak

On Thu, Feb 13, 2020 at 1:42 PM H.J. Lu wrote: > > On Thu, Feb 13, 2020 at 01:28:43PM +0100, Uros Bizjak wrote: > > On Thu, Feb 13, 2020 at 1:06 PM H.J. Lu wrote: > > > > > > On Thu, Feb 13, 2020 at 09:29:32AM +0100, Uros Bizjak wrote: > > > > On W

Re: [PATCH]Several intrinsic macros lack a closing parenthesis[PR93274]

2020-02-13 Thread Uros Bizjak

On Fri, Feb 14, 2020 at 7:03 AM Hongtao Liu wrote: > > On Thu, Feb 13, 2020 at 5:31 PM Hongtao Liu wrote: > > > > On Thu, Feb 13, 2020 at 5:12 PM Uros Bizjak wrote: > > > > > > On Thu, Feb 13, 2020 at 9:53 AM Jakub Jelinek wrote: > > > > > &

Re: [PATCH]Several intrinsic macros lack a closing parenthesis[PR93274]

2020-02-14 Thread Uros Bizjak

On Fri, Feb 14, 2020 at 8:06 AM Uros Bizjak wrote: > > On Fri, Feb 14, 2020 at 7:03 AM Hongtao Liu wrote: > > > > On Thu, Feb 13, 2020 at 5:31 PM Hongtao Liu wrote: > > > > > > On Thu, Feb 13, 2020 at 5:12 PM Uros Bizjak wrote: > > > > >

[committed] i386: Fix atan2l argument order [PR93743]

2020-02-16 Thread Uros Bizjak

i386: Fix atan2l argument order [PR93743] Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}. 2020-02-16 Uroš Bizjak PR target/93743 * config/i386/i386.md (atan2xf3): Swap operands 1 and 2. (atan23): Update operand order in the call to gen_atan2xf3. testsuite/ChangeLo

Re: [PATCH] Fix ICE with movstrictqi (PR target/92791)

2019-12-05 Thread Uros Bizjak

On Thu, Dec 5, 2019 at 9:21 AM Jakub Jelinek wrote: > > Hi! > > The huge LTO testcase in the PR ICEs, because in a function > where optimize_function_for_speed_p (cfun) and when targeting > -march=i686 optab_handler (movstrict_optab, E_QImode) is > CODE_FOR_movstrictqi, but when the *movstrictqi

Re: [PATCH] Use OPTION_MASK_ISA2_$target_[SET, UNSET, ] to indicate those for x_ix86_isa_flags2

2019-12-09 Thread Uros Bizjak

On Mon, Dec 9, 2019 at 11:25 AM Hongtao Liu wrote: > > Hi Uros: > This patch is about to rename OPTION_MASK_ISA_$target_[SET,UNSET, ] > to OPTION_MASK_ISA2_$target_[SET,UNSET, ] for those targets setting > x_ix86_isa_flags2. > target list as below: > - > 188static struct ix86_target_opts

Re: [PATCH] Add abs pattern to handle {si,di} mode abs to avoid pmax/cmove conversion (PR92651)

2019-12-16 Thread Uros Bizjak

On Wed, Dec 11, 2019 at 4:24 AM 玩还有 wrote: > > Hi: > Currently smax/smin pattern added by r274481 cause some regression > in 525.x264_r by 8% with -O2 -march=corei7. The reason is some IA > backends (contain TARGET_SSE4_1) will do transform for simple abs > (using rshift, xor and sub) to pmax/pm
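For readers unfamiliar with the "rshift, xor and sub" form of abs mentioned above, here is a minimal sketch in C. It is an illustration only, not the pattern from the patch: `abs_shift_xor_sub` is a hypothetical name, and the arithmetic is done in unsigned types so that the most negative input does not trigger signed overflow.

```c
#include <stdint.h>

/* Branchless absolute value: build an all-ones mask for negative
   inputs (what an arithmetic right shift by 31 would produce),
   then xor and subtract.  Returns the magnitude as unsigned.  */
uint32_t abs_shift_xor_sub(int32_t x)
{
    uint32_t ux = (uint32_t)x;
    uint32_t mask = 0u - (ux >> 31);   /* 0 or 0xffffffff */
    return (ux ^ mask) - mask;
}
```

For a negative input the xor flips all bits and subtracting the mask adds one, which is two's complement negation; for a non-negative input both steps are no-ops.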

Re: [PATCH] i386: Use add for a = a + b and a = b + a when possible

2019-12-16 Thread Uros Bizjak

> Since except for Bonnell, > > 01 fb add %edi,%ebx > > is faster and shorter than > > 8d 1c 1f lea (%rdi,%rbx,1),%ebx > > we should use add for a = a + b and a = b + a when possible if not > optimizing for Bonnell. > > Tested on x86-64. > > gcc/ > > PR target/92807

Re: [PATCH] Some x86 AMD -march= docs fixes + formatting fixes (PR target/92962)

2019-12-17 Thread Uros Bizjak

On Tue, Dec 17, 2019 at 10:09 AM Jakub Jelinek wrote: > > Hi! > > The bug report complained just about missing RDPID and WBNOINVD in znver2 > description and double comma before CLWB, but reading the docs I found > various other nits and when trying to compare it with what the compiler > actually

Re: [PATCH] Optimize stack_protect_set_1_ followed by a move to the same register (PR target/92841)

2019-12-17 Thread Uros Bizjak

On Tue, Dec 10, 2019 at 10:57 AM Jakub Jelinek wrote: > > Hi! > > The stack_protect_set_1_ pattern intentionally clears the register it > used as a temporary to read the canary from the register and push it back > on the stack for security reasons, to make sure the stack canary isn't > spilled som

Re: Patch ping (was Re: [PATCH] Optimize stack_protect_set_1_ followed by a move to the same register (PR target/92841))

2019-12-19 Thread Uros Bizjak

On Fri, Dec 20, 2019 at 12:26 AM Jakub Jelinek wrote: > > On Thu, Dec 19, 2019 at 06:23:59PM +0100, Jakub Jelinek wrote: > > On Thu, Dec 19, 2019 at 04:50:40PM +0100, Jan Hubicka wrote: > > > Outputting the move as RIP relative movq would work. > > > LC12 is string "s" and has nothing to do with s

Re: [PATCH] Optimize decl %eax; cmpl $-1, %eax; jne .Lxx into subl $1, %eax; jnc .Lxx using peephole2 (PR target/93002)

2019-12-19 Thread Uros Bizjak

On Fri, Dec 20, 2019 at 12:29 AM Jakub Jelinek wrote: > > Hi! > > The following patch optimizes > decl %eax; cmpl $-1, %eax; jne .Lxx; > into shorter and even possible to be fused: > subl $1, %eax; jnc .Lxx; > > Bootstrapped/regtested on x86_64-linux and i686-linux, during which > this peephole2 t
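The equivalence behind this peephole can be checked with a small model. The sketch below is not taken from the patch; the function names are hypothetical and the flags are modelled by hand. It relies on the fact that `decl` leaves the carry flag untouched (hence the separate `cmpl` in the original sequence), whereas `subl $1` sets carry on an unsigned borrow, which happens only when the register was 0.

```c
#include <stdint.h>

/* decl %eax; cmpl $-1, %eax; jne .Lxx
   Branch is taken when the decremented value is not 0xffffffff.  */
int taken_decl_cmp_jne(uint32_t x)
{
    uint32_t r = x - 1u;
    return r != 0xffffffffu;
}

/* subl $1, %eax; jnc .Lxx
   The subtraction borrows (sets carry) only when x == 0.  */
int taken_sub_jnc(uint32_t x)
{
    int carry = x < 1u;
    return !carry;
}
```

Both sequences leave the same value in the register and take the branch for exactly the same inputs, namely every x except 0.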

Re: [PATCH] Allow {nearby,r}int{,f} vectorization on x86 with sse4.1 and later (PR target/93078)

2019-12-28 Thread Uros Bizjak

On Sat, Dec 28, 2019 at 10:33 AM Jakub Jelinek wrote: > > Hi! > > In i386.md, we have nearbyint2 and rint2 patterns that expand > SF/DF/XF mode patterns to rounding instructions. For pre-sse4.1 that is > done using XFmode and so inappropriate for vectorization, but for sse4.1 > and later we can j

Re: [PATCH] Allow {nearby,r}int{,f} vectorization on x86 with sse4.1 and later (PR target/93078)

2019-12-28 Thread Uros Bizjak

On Sat, Dec 28, 2019 at 12:02 PM Jakub Jelinek wrote: > > On Sat, Dec 28, 2019 at 11:48:12AM +0100, Uros Bizjak wrote: > > On Sat, Dec 28, 2019 at 10:33 AM Jakub Jelinek wrote: > > > > > > Hi! > > > > > > In i386.md, we have nearbyint2 and rint2 p

Re: [PATCH] Fix x86 abs2 expander for ia32 (PR target/93110)

2020-01-03 Thread Uros Bizjak

On Fri, Jan 3, 2020 at 9:23 AM Jakub Jelinek wrote: > > Hi! > > The newly added absdi2 expander doesn't work well on ia32, because it > requires a xordi3 pattern, which is available even for !TARGET_64BIT, > but only if TARGET_STV && TARGET_SSE2. > > The following patch just uses expand_simple_bin

Re: [PATCH] Allow prefer-vector-width= in target attribute (PR target/93089)

2020-01-03 Thread Uros Bizjak

On Fri, Jan 3, 2020 at 9:31 AM Jakub Jelinek wrote: > > Hi! > > For a patch I'm going to post next I need to be able to tweak > prefer_vector_width= for simd clones (the thing is, in the declare simd > clones it makes no sense to restrict to a subset of vector sizes the > selected ISA is capable o

Re: [PATCH] Improve __builtin_add_overflow on x86 for double-word types (PR target/93141)

2020-01-04 Thread Uros Bizjak

On Sat, Jan 4, 2020 at 12:50 AM Jakub Jelinek wrote: > > Hi! > > As the following testcase shows, we generate quite bad code for > double-word __builtin_add_overflow on x86, we have add[dt]i3_doubleword > pattern and emit add[ql]; adc[ql];, but then we could just use setc/seto > or adc etc., but w
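As background for the add/adc sequence discussed here, the following sketch models a signed double-word addition with overflow detection on two 64-bit halves. It is an illustration under stated assumptions, not the code GCC generates: the `dw` type and `dw_add_overflow` are hypothetical, and the high half is stored as an unsigned word whose top bit is read as the sign.

```c
#include <stdint.h>

/* A double-word value as two 64-bit halves (hypothetical type).  */
typedef struct { uint64_t lo; uint64_t hi; } dw;

/* Add a and b, store the wrapped sum in *r, and return 1 if the
   signed double-word addition overflowed.  */
int dw_add_overflow(dw a, dw b, dw *r)
{
    uint64_t lo = a.lo + b.lo;          /* add: low halves */
    uint64_t carry = lo < a.lo;         /* carry out of the low half */
    uint64_t hi = a.hi + b.hi + carry;  /* adc: high halves plus carry */

    r->lo = lo;
    r->hi = hi;

    /* Signed overflow: operands share a sign, result sign differs.  */
    return (int)((~(a.hi ^ b.hi) & (a.hi ^ hi)) >> 63);
}
```

The returned value corresponds to what the overflow flag would hold after the `adc` on the high halves, so no separate comparison of the result is needed.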

Re: [PATCH] Improve __builtin_add_overflow on x86 for double-word types (PR target/93141)

2020-01-05 Thread Uros Bizjak

On Sat, Jan 4, 2020 at 12:39 PM Jakub Jelinek wrote: > > On Sat, Jan 04, 2020 at 12:13:50PM +0100, Uros Bizjak wrote: > > LGTM, but I wonder if *addcarry_1 gets overmacroized, the insn > > condition is really hard to comprehend. Perhaps it should be written > > as a sep

Re: [PATCH] Fix ia32 ICE while compiling glibc (PR target/93174)

2020-01-08 Thread Uros Bizjak

On Wed, Jan 8, 2020 at 8:48 AM Jakub Jelinek wrote: > > Hi! > > Joseph reported ia32 glibc build ICEs, because the > *adddi3_doubleword_cc_overflow_1 pattern allows a memory output and matching > input, but addcarry* to which it splits doesn't, for some strange > reason it only allows register out

Re: [PATCH] Fix x86 ICE when peepholing2 @stack_protect_set_1_ with *lea (PR target/93187)

2020-01-08 Thread Uros Bizjak

On Wed, Jan 8, 2020 at 8:58 AM Jakub Jelinek wrote: > > Hi! > > On the following testcase, the peephole2s merge @stack_protect_set_1_ > with not the expected *mov{si,di}_internal, but *lea instead - > which looks like a mov, but uses address_no_seg_operand predicate/Ts > constraint. The peephole2

Re: [PATCH] Improve __builtin_sub_overflow with signed double-word operands (PR target/93141)

2020-01-08 Thread Uros Bizjak

On Wed, Jan 8, 2020 at 9:09 AM Jakub Jelinek wrote: > > Hi! > > This is very similar to the previous PR93141 addv4 half and > improves signed __builtin_sub_overflow on double-words rather than > __builtin_add_overflow. > > I have left out the uaddv4 double-word stuff, because I ran into > issues w

[committed] alpha: Introduce UMUL_HIGHPART rtx_code [PR113720]

2024-03-03 Thread Uros Bizjak

umuldi3_highpart expander does: if (REG_P (operands[2])) operands[2] = gen_rtx_ZERO_EXTEND (TImode, operands[2]); on register_operand predicate, which also allows SUBREG RTX. So, subregs were emitted without ZERO_EXTEND RTX. But nowadays we have UMUL_HIGHPART that allows us to fix this i

Re: [PATCH] i386: Fix ICEs with SUBREGs from vector etc. constants to XFmode [PR114184]

2024-03-04 Thread Uros Bizjak

On Mon, Mar 4, 2024 at 9:25 AM Jakub Jelinek wrote: > > Hi! > > The Intel extended format has the various weird number categories, > pseudo denormals, pseudo infinities, pseudo NaNs and unnormals. > Those are not representable in the GCC real_value and so neither > GIMPLE nor RTX VIEW_CONVERT_EXPR

Re: [PATCH] i386: Fix ICEs with SUBREGs from vector etc. constants to XFmode [PR114184]

2024-03-04 Thread Uros Bizjak

On Mon, Mar 4, 2024 at 9:41 AM Jakub Jelinek wrote: > > On Mon, Mar 04, 2024 at 09:34:30AM +0100, Uros Bizjak wrote: > > > --- gcc/config/i386/i386-expand.cc.jj 2024-03-01 14:56:34.120925989 > > > +0100 > > > +++ gcc/config/i386/i386-expand.cc 2024-03-0

Re: [PATCH] i386: Fix up the vzeroupper REG_DEAD/REG_UNUSED note workaround [PR114190]

2024-03-06 Thread Uros Bizjak

On Wed, Mar 6, 2024 at 9:10 AM Jakub Jelinek wrote: > > Hi! > > When writing the rest_of_handle_insert_vzeroupper workaround to manually > remove all the REG_DEAD/REG_UNUSED notes from the IL, I've missed that > there is a df_analyze () call right after it and that the problems added > earlier in

[committed] i386: Eliminate common code from x86_32 TARGET_MACHO part in ix86_expand_move

2024-03-06 Thread Uros Bizjak

Eliminate common code from x86_32 TARGET_MACHO part in ix86_expand_move and use generic code instead. No functional changes. gcc/ChangeLog: * config/i386/i386-expand.cc (ix86_expand_move) [TARGET_MACHO]: Eliminate common code and use generic code instead. Bootstrapped and regression tes

[committed] i386: Fix and improve insn constraint for V2QI arithmetic/shift insns

2024-03-06 Thread Uros Bizjak

optimize_function_for_size_p predicate is not stable during optab selection, because it also depends on node->count/node->frequency of the current function, which are updated during IPA, so they may change between early opts and late opts. Use optimize_size instead - optimize_size implies optimize

[PATCH v2] combine: Fix ICE in try_combine on pr112494.c [PR112560]

2024-03-07 Thread Uros Bizjak

The compiler, configured with --enable-checking=yes,rtl,extra ICEs with: internal compiler error: RTL check: expected elt 0 type 'e' or 'u', have 'E' (rtx unspec) in try_combine, at combine.cc:3237 This is 3236 /* Just replace the CC reg with a new mode. */ 3237 SUBST

Re: [PATCH v2] combine: Fix ICE in try_combine on pr112494.c [PR112560]

2024-03-07 Thread Uros Bizjak

On Thu, Mar 7, 2024 at 10:56 AM Richard Biener wrote: > > On Thu, 7 Mar 2024, Uros Bizjak wrote: > > > The compiler, configured with --enable-checking=yes,rtl,extra ICEs with: > > > > internal compiler error: RTL check: expected elt 0 type 'e' or 'u'

Re: [PATCH v2] combine: Fix ICE in try_combine on pr112494.c [PR112560]

2024-03-07 Thread Uros Bizjak

On Thu, Mar 7, 2024 at 11:37 AM Jakub Jelinek wrote: > > On Thu, Mar 07, 2024 at 11:11:35AM +0100, Uros Bizjak wrote: > > > Since you CCed me - looking at the code I wonder why we fatally fail. > > > The following might also fix the issue and preserve more of the >

Re: [PATCH v2] combine: Fix ICE in try_combine on pr112494.c [PR112560]

2024-03-07 Thread Uros Bizjak

On Thu, Mar 7, 2024 at 12:11 PM Richard Biener wrote: > > On Thu, 7 Mar 2024, Jakub Jelinek wrote: > > > On Thu, Mar 07, 2024 at 11:11:35AM +0100, Uros Bizjak wrote: > > > > Since you CCed me - looking at the code I wonder why we fatally fail. > > > > The

Re: [PATCH v2] combine: Fix ICE in try_combine on pr112494.c [PR112560]

2024-03-07 Thread Uros Bizjak

On Thu, Mar 7, 2024 at 6:39 PM Segher Boessenkool wrote: > > On Thu, Mar 07, 2024 at 10:55:12AM +0100, Richard Biener wrote: > > On Thu, 7 Mar 2024, Uros Bizjak wrote: > > > This is > > > > > > 3236 /* Just replace the CC reg with a new mode.

Re: [PATCH v2] combine: Fix ICE in try_combine on pr112494.c [PR112560]

2024-03-07 Thread Uros Bizjak

On Thu, Mar 7, 2024 at 10:04 PM Uros Bizjak wrote: > The source code that deals with the *user* of the CC register assumes > the former form, so it blindly tries to update the mode of the CC > register inside LT comparison RTX (some other nearby source code even > checks for (cons

Re: [PATCH v2] combine: Fix ICE in try_combine on pr112494.c [PR112560]

2024-03-07 Thread Uros Bizjak

On Thu, Mar 7, 2024 at 10:37 PM Segher Boessenkool wrote: > > On Thu, Mar 07, 2024 at 10:04:32PM +0100, Uros Bizjak wrote: > > [snip] > > > The part we want to fix deals with the *user* of the CC register. It > > is not true that this is always COMPARISON_P, so EQ, NE,

Re: [PATCH v2] combine: Fix ICE in try_combine on pr112494.c [PR112560]

2024-03-07 Thread Uros Bizjak

On Thu, Mar 7, 2024 at 11:07 PM Uros Bizjak wrote: > > On Thu, Mar 7, 2024 at 10:37 PM Segher Boessenkool > wrote: > > > > On Thu, Mar 07, 2024 at 10:04:32PM +0100, Uros Bizjak wrote: > > > > [snip] > > > > > The part we want to fix deals with the

Re: [PATCH v2] combine: Fix ICE in try_combine on pr112494.c [PR112560]

2024-03-07 Thread Uros Bizjak

On Thu, Mar 7, 2024 at 11:29 PM Segher Boessenkool wrote: > > On Thu, Mar 07, 2024 at 11:07:18PM +0100, Uros Bizjak wrote: > > On Thu, Mar 7, 2024 at 10:37 PM Segher Boessenkool > > wrote: > > > > but can be something else, such as the above not

Fwd: [PATCH v3] combine: Fix ICE in try_combine on pr112494.c [PR112560]

2024-03-12 Thread Uros Bizjak

Forgot to CC gcc-patches@ ML... sorry for the duplicate... The compiler, configured with --enable-checking=yes,rtl,extra ICEs with: internal compiler error: RTL check: expected elt 0 type 'e' or 'u', have 'E' (rtx unspec) in try_combine, at combine.cc:3237 This is 3236 /* Just repl

Re: [PATCH] i386[stv]: Handle REG_EH_REGION note

2024-03-14 Thread Uros Bizjak

On Thu, Mar 14, 2024 at 2:33 AM liuhongt wrote: > > When we split > (insn 37 36 38 10 (set (reg:DI 104 [ _18 ]) > (mem:DI (reg/f:SI 98 [ CallNative_nclosure.0_1 ]) [6 MEM[(struct > SQRefCounted *)CallNative_nclosure.0_1]._uiRef+0 S8 A32])) "test.C":22:42 84 > {*movdi_internal} > (ex

Re: [PATCH] i386[stv]: Handle REG_EH_REGION note

2024-03-14 Thread Uros Bizjak

On Thu, Mar 14, 2024 at 8:32 AM Hongtao Liu wrote: > > On Thu, Mar 14, 2024 at 3:22 PM Uros Bizjak wrote: > > > > On Thu, Mar 14, 2024 at 2:33 AM liuhongt wrote: > > > > > > When we split > > > (insn 37 36 38 10 (set (reg:DI 104 [ _18 ]) > > &

Re: [PATCH] i386[stv]: Handle REG_EH_REGION note

2024-03-14 Thread Uros Bizjak

On Thu, Mar 14, 2024 at 8:42 AM Uros Bizjak wrote: > > On Thu, Mar 14, 2024 at 8:32 AM Hongtao Liu wrote: > > > > On Thu, Mar 14, 2024 at 3:22 PM Uros Bizjak wrote: > > > > > > On Thu, Mar 14, 2024 at 2:33 AM liuhongt wrote: > > > > > > &g

Re: [PATCH] i386: Fix a pasto in ix86_expand_int_sse_cmp [PR114339]

2024-03-15 Thread Uros Bizjak

On Fri, Mar 15, 2024 at 9:50 AM Jakub Jelinek wrote: > > Hi! > > In r13-3803-gfa271afb58 I've added an optimization for LE/LEU/GE/GEU > comparison against CONST_VECTOR. As the comments say: > /* x <= cst can be handled as x < cst + 1 unless there is > wrap around in cst + 1.
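The comment quoted above states the rewrite rule for `x <= cst`. A scalar, signed-only sketch of that rule is given below; it is an assumption-laden illustration, not the vector code in ix86_expand_int_sse_cmp, and `can_rewrite_le` and `le_via_lt` are hypothetical names.

```c
#include <stdint.h>

/* x <= cst can become x < cst + 1 only if cst + 1 does not wrap.  */
int can_rewrite_le(int32_t cst)
{
    return cst != INT32_MAX;
}

/* Evaluate x <= cst through the rewritten form where that is valid.  */
int le_via_lt(int32_t x, int32_t cst)
{
    if (can_rewrite_le(cst))
        return x < cst + 1;
    return 1;   /* x <= INT32_MAX holds for every x */
}
```

The wrap-around check is the essential part: applying `cst + 1` blindly at the maximum value would turn an always-true comparison into an always-false one.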

Re: [PATCH] i386 [stv]: Handle REG_EH_REGION note [pr111822].

2024-03-18 Thread Uros Bizjak

On Mon, Mar 18, 2024 at 11:52 AM liuhongt wrote: > > Commit r14-9459-g618e34d56cc38e only handles > general_scalar_chain::convert_op. The patch also handles > timode_scalar_chain::convert_op to avoid potential similar bug. > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. > Ok for trun

Re: [PATCH v2] combine: Fix ICE in try_combine on pr112494.c [PR112560]

2024-03-18 Thread Uros Bizjak

On Mon, Mar 18, 2024 at 3:46 PM Segher Boessenkool wrote: > > On Thu, Mar 07, 2024 at 11:27:28PM +0100, Uros Bizjak wrote: > > On Thu, Mar 7, 2024 at 11:07 PM Uros Bizjak wrote: > > > > > (unspec:DI [ > > > > > (reg:CC 17 flags) > > >

Re: [PATCH v2] combine: Fix ICE in try_combine on pr112494.c [PR112560]

2024-03-18 Thread Uros Bizjak

On Mon, Mar 18, 2024 at 3:51 PM Segher Boessenkool wrote: > > On Thu, Mar 07, 2024 at 11:46:54PM +0100, Uros Bizjak wrote: > > > Can't you just describe the dataflow then, without an unspec? An unspec > > > by definition does some (unspecified) operation on the dat

[PATCH] i386: Unify {general, timode}_scalar_chain::convert_op [PR111822]

2024-03-18 Thread Uros Bizjak

Recent PR111822 fix implemented REG_EH_REGION note copying to a STV converted preload instruction in general_scalar_chain::convert_op. However, the same issue remains in timode_scalar_chain::convert_op. Instead of copying the newly introduced code to timode_scalar_chain::convert_op, the patch uni

Re: [PATCH] testsuite: i386: Skip gcc.target/i386/avx512cd-vpbroadcastmb2q-2.c etc. with Solaris as [PR114150]

2024-03-21 Thread Uros Bizjak

On Thu, Mar 21, 2024 at 10:26 AM Rainer Orth wrote: > > Two avx512cd tests FAIL to assemble with the Solaris/x86 assembler: > > FAIL: gcc.target/i386/avx512cd-vpbroadcastmb2q-2.c (test for excess errors) > UNRESOLVED: gcc.target/i386/avx512cd-vpbroadcastmb2q-2.c compilation failed > to produce ex

Re: [PATCH] testsuite: Fix up ext-floating{3,12}.C on i686-linux

2024-03-27 Thread Uros Bizjak

On Wed, Mar 27, 2024 at 11:48 AM Jakub Jelinek wrote: > > Hi! > > These tests FAIL for quite a while on i686-linux since July last year, > likely r14-2628. Since that patch gcc claims _Float16 and __bf16 > support even without -msse2 because some functions could be using > target attribute. > La

Combine patch ping

2024-04-01 Thread Uros Bizjak

Hello! I'd like to ping the https://gcc.gnu.org/pipermail/gcc-patches/2024-March/647634.html PR112560 P1 patch. Thanks, Uros.

Re: [PATCH] x86: Define __APX_F__ for -mapxf

2024-04-04 Thread Uros Bizjak

On Thu, Apr 4, 2024 at 5:08 PM H.J. Lu wrote: > > Define __APX_F__ when APX is enabled. > > gcc/ > > PR target/114587 > * config/i386/i386-c.cc (ix86_target_macros_internal): Define > __APX_F__ when APX is enabled. > > gcc/testsuite/ > > PR target/114587 > *

Re: [PATCH] x86: Use explicit shift count in double-precision shifts

2024-04-06 Thread Uros Bizjak

On Fri, Apr 5, 2024 at 5:56 PM H.J. Lu wrote: > > Don't use implicit shift count in double-precision shifts in AT&T syntax > since they aren't in Intel SDM. Keep the 's' modifier for backward > compatibility with inline asm statements. > > PR target/114590 > * config/i386/i386.md

Re: Combine patch ping

2024-04-06 Thread Uros Bizjak

On Mon, Apr 1, 2024 at 9:28 PM Uros Bizjak wrote: > I'd like to ping the > https://gcc.gnu.org/pipermail/gcc-patches/2024-March/647634.html > PR112560 P1 patch. If there are no further comments, I plan to commit the referred patch to the mainline on Wednesday. The latest

Re: Combine patch ping

2024-04-10 Thread Uros Bizjak

On Wed, Apr 10, 2024 at 7:56 PM Segher Boessenkool wrote: > > On Sun, Apr 07, 2024 at 08:31:38AM +0200, Uros Bizjak wrote: > > If there are no further comments, I plan to commit the referred patch > > to the mainline on Wednesday. The latest version can be considered an >

Re: Combine patch ping

2024-04-11 Thread Uros Bizjak

On Thu, Apr 11, 2024 at 4:02 PM Segher Boessenkool wrote: > > On Wed, Apr 10, 2024 at 08:32:39PM +0200, Uros Bizjak wrote: > > On Wed, Apr 10, 2024 at 7:56 PM Segher Boessenkool > > wrote: > > > This is never okay. You cannot commit a patch without approval, *eve

Re: [PATCH 1/2] target/113255 - avoid REG_POINTER on a pointer difference

2024-02-01 Thread Uros Bizjak

On Thu, Feb 1, 2024 at 3:18 PM Richard Biener wrote: > > The following avoids re-using a register holding a pointer (and > thus might be REG_POINTER) for the result of a pointer difference > computation. That might confuse heuristics in (broken) RTL alias > analysis which relies on REG_POINTER in

[committed] i386: Improve *cmp_doubleword splitter [PR113701]

2024-02-01 Thread Uros Bizjak

The fix for PR70321 introduced a splitter that split a doubleword comparison into a pair of XORs followed by an IOR to set the (zero) flags register. To help the reload, splitter forced SUBREG pieces of double-word input values to a pseudo, but this regressed gcc.target/i386/pr82580.c int f0 (U x
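The XOR/IOR form of a doubleword equality test mentioned above can be written out in C as follows. This is a sketch for illustration, with a hypothetical function name and the two halves passed separately; it does not reproduce the splitter itself.

```c
#include <stdint.h>

/* Compare two double-word values for equality: XOR the matching
   halves, OR the results, and test the combined word for zero
   (the zero flag after the final OR).  */
int dw_equal(uint64_t a_lo, uint64_t a_hi, uint64_t b_lo, uint64_t b_hi)
{
    uint64_t t_lo = a_lo ^ b_lo;
    uint64_t t_hi = a_hi ^ b_hi;
    return (t_lo | t_hi) == 0;
}
```

Each XOR is zero exactly when its two halves match, so the OR is zero only when both halves match.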

Re: [PATCH] testsuite: i386: Fix gcc.target/i386/pr71321.c on Solaris/x86

2024-02-02 Thread Uros Bizjak

On Fri, Feb 2, 2024 at 9:59 AM Rainer Orth wrote: > > gcc.target/i386/pr71321.c FAILs on 64-bit Solaris/x86 with the native > assembler: > > FAIL: gcc.target/i386/pr71321.c scan-assembler-not lea.*0 > > The problem is that /bin/as doesn't fully support cfi directives, so the > .eh_frame section i

Re: [x86_64 PATCH] PR target/113690: Fix-up MULT REG_EQUAL notes in STV.

2024-02-05 Thread Uros Bizjak

On Mon, Feb 5, 2024 at 1:24 AM Roger Sayle wrote: > > > This patch fixes PR target/113690, an ICE-on-valid regression on x86_64 > that exhibits with a specific combination of command line options. The > cause is that x86's scalar-to-vector pass converts a chain of instructions > from TImode to V1

Re: [PATCH] i386: Clear REG_UNUSED and REG_DEAD notes from the IL at the end of vzeroupper pass [PR113059]

2024-02-05 Thread Uros Bizjak

On Wed, Jan 31, 2024 at 9:23 AM Jakub Jelinek wrote: > > Hi! > > The move of the vzeroupper pass from after reload pass to after > postreload_cse helped only partially, CSE-like passes can still invalidate > those notes (especially REG_UNUSED) if they use some earlier register > holding some value

Re: [x86_64 PATCH] PR target/113690: Fix-up MULT REG_EQUAL notes in STV.

2024-02-05 Thread Uros Bizjak

On Mon, Feb 5, 2024 at 9:06 AM Uros Bizjak wrote: > > On Mon, Feb 5, 2024 at 1:24 AM Roger Sayle wrote: > > > > > > This patch fixes PR target/113690, an ICE-on-valid regression on x86_64 > > that exhibits with a specific combination of command line options. The

Re: [PATCH v5] x86-64: Find a scratch register for large model profiling

2024-02-05 Thread Uros Bizjak

On Fri, Feb 2, 2024 at 11:47 PM H.J. Lu wrote: > > Changes in v5: > > 1. Add pr113689-3.c. > 2. Use %r10 if ix86_profile_before_prologue () return true. > 3. Try a callee-saved register which has been saved on stack in the > prologue. > > Changes in v4: > > 1. Remove pr113689-3.c. > 2. Use df_get_

Re: [PATCH v6] x86-64: Find a scratch register for large model profiling

2024-02-05 Thread Uros Bizjak

On Mon, Feb 5, 2024 at 5:43 PM H.J. Lu wrote: > > Changes in v6: > > 1. Use ix86_save_reg and accessible_reg_set in > x86_64_select_profile_regnum. > 2. Construct a complete reg name in x86_function_profiler. > > Changes in v5: > > 1. Add pr113689-3.c. > 2. Use %r10 if ix86_profile_before_prologue

[committed] i386: psrlq is not used for PERM [PR113871]

2024-02-14 Thread Uros Bizjak

Introduce vec_shl_ and vec_shr_ expanders to improve '*a = __builtin_shufflevector(*a, (vect64){0}, 1, 2, 3, 4);' and '*a = __builtin_shufflevector((vect64){0}, *a, 3, 4, 5, 6);' shuffles. The generated code improves from: movzwl 6(%rdi), %eax movzwl 4(%rdi), %edx salq

[committed] testsuite: Fix a couple of x86 issues in gcc.dg/vect testsuite

2024-02-14 Thread Uros Bizjak

A compile-time test can use -march=skylake-avx512 for all x86 targets, but a runtime test needs to check avx512f effective target if the instructions can be assembled. The runtime test also needs to check if the target machine supports instruction set we have been compiled for. The testsuite uses

Re: PING: [PATCH] x86-64: Check R_X86_64_CODE_6_GOTTPOFF support

2024-02-23 Thread Uros Bizjak

On Fri, Feb 23, 2024 at 3:45 AM H.J. Lu wrote: > > On Thu, Feb 22, 2024 at 6:39 PM Hongtao Liu wrote: > > > > On Thu, Feb 22, 2024 at 10:33 PM H.J. Lu wrote: > > > > > > On Sun, Feb 18, 2024 at 8:02 AM H.J. Lu wrote: > > > > > > > > If assembler and linker supports > > > > > > > > add %reg1, na

Re: [PATCH] x86: Check interrupt instead of noreturn attribute

2024-02-25 Thread Uros Bizjak

On Sun, Feb 25, 2024 at 5:01 PM H.J. Lu wrote: > > ix86_set_func_type checks noreturn attribute to avoid incompatible > attribute error in LTO1 on interrupt functions. Since TREE_THIS_VOLATILE > is set also for _Noreturn without noreturn attribute, check interrupt > attribute for interrupt functi

Re: [PATCH v2] x86: Check interrupt instead of noreturn attribute

2024-02-26 Thread Uros Bizjak

On Sun, Feb 25, 2024 at 10:14 PM H.J. Lu wrote: > > ix86_set_func_type checks noreturn attribute to avoid incompatible > attribute error in LTO1 on interrupt functions. Since TREE_THIS_VOLATILE > is set also for _Noreturn without noreturn attribute, check interrupt > attribute for interrupt funct

Re: Patch ping^2

2024-02-26 Thread Uros Bizjak

On Mon, Feb 26, 2024 at 10:33 AM Jakub Jelinek wrote: > > Hi! > > I'd like to ping 2 patches: > https://gcc.gnu.org/pipermail/gcc-patches/2024-February/645326.html > i386: Enable _BitInt support on ia32 > > all the FAILs mentioned in that mail have been fixed by now. LGTM, based on HJ's advice.

[committed] i386: psrlq is not used for PERM [PR113871]

2024-02-27 Thread Uros Bizjak

Also handle V2BF mode. PR target/113871 gcc/ChangeLog: * config/i386/mmx.md (V248FI): Add V2BF mode. (V24FI_32): Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/pr113871-5a.c: New test. * gcc.target/i386/pr113871-5b.c: New test. Bootstrapped and regression tested on x86_

Re: [PATCH] i386: Add "z" constraint for symbolic address/label reference [PR105576]

2024-01-10 Thread Uros Bizjak

On Thu, Jan 11, 2024 at 4:44 AM Fangrui Song wrote: > > Printing the raw symbol is useful in inline asm (e.g. in C++ to get the > mangled name). Similar constraints are available in other targets (e.g. > "S" for aarch64/riscv, "Cs" for m68k). > > There isn't a good way for x86 yet, e.g. "i" doesn

Re: [PATCH] i386: Add "z" constraint for symbolic address/label reference [PR105576]

2024-01-11 Thread Uros Bizjak

On Thu, Jan 11, 2024 at 9:33 AM Fangrui Song wrote: > > On 2024-01-11, Uros Bizjak wrote: > >On Thu, Jan 11, 2024 at 4:44 AM Fangrui Song wrote: > >> > >> Printing the raw symbol is useful in inline asm (e.g. in C++ to get the > >> mangled name). Sim

Re: [PATCH] i386: Add "Ws" constraint for symbolic address/label reference [PR105576]

2024-01-16 Thread Uros Bizjak

On Thu, Jan 11, 2024 at 7:24 PM Fangrui Song wrote: > > Printing the raw symbol is useful in inline asm (e.g. in C++ to get the > mangled name). Similar constraints are available in other targets (e.g. > "S" for aarch64/riscv, "Cs" for m68k). > > There isn't a good way for x86 yet, e.g. "i" doesn

Re: [PATCH] i386: Add -masm=intel profiling support [PR113122]

2024-01-18 Thread Uros Bizjak

On Thu, Jan 18, 2024 at 8:31 AM Jakub Jelinek wrote: > > Hi! > > x86_function_profiler emits assembly directly into file and only emits > AT&T syntax. The following patch adjusts it to emit MASM syntax > if -masm=intel. > As it doesn't use asm_fprintf, I can't use {|} syntax for the dialects. > >

Re: [middle-end PATCH] Prefer PLUS over IOR in RTL expansion of multi-word shifts/rotates.

2024-01-20 Thread Uros Bizjak

On Fri, Jan 19, 2024 at 5:50 PM Jeff Law wrote: > > > > On 1/19/24 09:05, Georg-Johann Lay wrote: > > > > > > Am 18.01.24 um 20:54 schrieb Roger Sayle: > >> > >> This patch tweaks RTL expansion of multi-word shifts and rotates to use > >> PLUS rather than IOR for disjunctive operations. During ex

Re: [PATCH] testsuite: i386: Fix gcc.target/i386/pr80833-1.c on 32-bit Solaris/x86

2024-01-24 Thread Uros Bizjak

On Wed, Jan 24, 2024 at 10:07 AM Rainer Orth wrote: > > gcc.target/i386/pr80833-1.c FAILs on 32-bit Solaris/x86 since 20220609: > > FAIL: gcc.target/i386/pr80833-1.c scan-assembler pextrd > > Unlike e.g. Linux/i686, 32-bit Solaris/x86 defaults to -mstackrealign, > so this patch overrides that to m

Re: Unreviewed patches

2024-01-31 Thread Uros Bizjak

On Wed, Jan 31, 2024 at 3:04 PM Rainer Orth wrote: > > Three patches have remained unreviewed for a week or more: > > c++: Fix g++.dg/ext/attr-section2.C etc. with Solaris/SPARC as > https://gcc.gnu.org/pipermail/gcc-patches/2024-January/643434.html > > This one may even be obviou

Re: [PATCH] testsuite: i386: Fix gcc.target/i386/pr38534-1.c etc. on Solaris/x86

2024-01-31 Thread Uros Bizjak

On Wed, Jan 31, 2024 at 2:02 PM Rainer Orth wrote: > > The gcc.target/i386/pr38534-1.c etc. tests FAIL on 32 and 64-bit > Solaris/x86: > > FAIL: gcc.target/i386/pr38534-1.c scan-assembler-not push > FAIL: gcc.target/i386/pr38534-2.c scan-assembler-not push > FAIL: gcc.target/i386/pr38534-3.c scan

Re: [PATCH] testsuite: i386: Fix gcc.target/i386/no-callee-saved-1.c etc. on Solaris/x86

2024-01-31 Thread Uros Bizjak

On Wed, Jan 31, 2024 at 1:57 PM Rainer Orth wrote: > > The gcc.target/i386/no-callee-saved-[12].c tests FAIL on Solaris/x86: > > FAIL: gcc.target/i386/no-callee-saved-1.c scan-assembler-not push > FAIL: gcc.target/i386/no-callee-saved-2.c scan-assembler-not push > > In both cases, the test expect

[committed] i386: Eliminate redundant compare between set{z, nz} and j{z, nz}

2023-12-18 Thread Uros Bizjak

Eliminate redundant compare between set{z,nz} and j{z,nz}: setz %al; test %al,%al; jz <...> -> setz %al; jnz <...> and setnz %al; test %al,%al; jz <...> -> setnz %al; jz <...>. We can use the original Zero-flag value instead of setting the temporary register and testing it for zero. gcc/ChangeLog
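The equivalence behind this transformation can be checked with a small C model of the Zero flag. This is only a sketch for illustration: the function names are hypothetical, and the model tracks nothing but ZF and the value written to %al.

```c
#include <assert.h>
#include <stdbool.h>

/* Model of: setz %al; test %al,%al; jz L
   Returns true when the jump is taken. */
bool jump_taken_setz_before(bool zf)
{
    bool al = zf;          /* setz %al: %al = 1 when ZF is set */
    bool zf_after = !al;   /* test %al,%al: ZF set when %al is zero */
    return zf_after;       /* jz: taken when ZF is set */
}

/* Model of: setz %al; jnz L (jump uses the original ZF). */
bool jump_taken_setz_after(bool zf)
{
    return !zf;            /* jnz: taken when ZF is clear */
}

/* Model of: setnz %al; test %al,%al; jz L */
bool jump_taken_setnz_before(bool zf)
{
    bool al = !zf;         /* setnz %al: %al = 1 when ZF is clear */
    bool zf_after = !al;   /* test %al,%al: ZF set when %al is zero */
    return zf_after;       /* jz: taken when ZF is set */
}

/* Model of: setnz %al; jz L (jump uses the original ZF). */
bool jump_taken_setnz_after(bool zf)
{
    return zf;             /* jz: taken when ZF is set */
}
```

For both possible values of the original flag, each "before" model agrees with its "after" counterpart, which is why the intermediate test is redundant.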

Re: [x86 PATCH] Improved TImode (128-bit) integer constants on x86_64.

2023-12-19 Thread Uros Bizjak

On Mon, Dec 18, 2023 at 6:18 PM Roger Sayle wrote: > > > This patch fixes two issues with the handling of 128-bit TImode integer > constants in the x86_64 backend. The main issue is that GCC always > tries to load 128-bit integer constants via broadcasts to vector SSE > registers, even if the res

Re: [PATCH] i386: Fix mmx.md signbit expanders [PR112816]

2023-12-19 Thread Uros Bizjak

On Tue, Dec 19, 2023 at 10:00 AM Jakub Jelinek wrote: > > Hi! > > Apparently when looking for "signbit2" vector expanders, I've only > looked at sse.md and forgot mmx.md, which has two further ones and the > following patch still ICEd. > > Fixed thusly, bootstrapped/regtested on x86_64-linux and i

Re: [PATCH] i386: Make most MD builtins nothrow, leaf [PR112962]

2023-12-20 Thread Uros Bizjak

On Wed, Dec 13, 2023 at 10:21 AM Jakub Jelinek wrote: > > Hi! > > The following patch makes most of x86 MD builtins nothrow,leaf > (like most middle-end builtins are). For -fnon-call-exceptions it > doesn't nothrow, better might be to still add it if the builtins > don't read or write memory and

[committed] i386: Fix shifts with high register input operand [PR113044]

2023-12-21 Thread Uros Bizjak

The move to the output operand should use high register input operand. PR target/113044 gcc/ChangeLog: * config/i386/i386.md (*ashlqi_ext_1): Move from the high register of the input operand. (*qi_ext_1): Ditto. gcc/testsuite/ChangeLog:

Re: [x86_PATCH] peephole2 to resolve failure of gcc.target/i386/pr43644-2.c

2023-12-28 Thread Uros Bizjak

On Fri, Dec 22, 2023 at 11:14 AM Roger Sayle wrote: > > > This patch resolves the failure of pr43644-2.c in the testsuite, a code > quality test I added back in July, that started failing as the code GCC > generates for 128-bit values (and their parameter passing) has been in > flux. After a few

[committed] i386: Cleanup ix86_expand_{unary|binary}_operator issues

2023-12-28 Thread Uros Bizjak

Move ix86_expand_unary_operator from i386.cc to i386-expand.cc, re-arrange prototypes and do some cosmetic changes with the usage of TARGET_APX_NDD. No functional changes. gcc/ChangeLog: * config/i386/i386.cc (ix86_unary_operator_ok): Move from here... * config/i386/i386-expand.cc (ix86_

[committed] i386: Fix TARGET_USE_VECTOR_FP_CONVERTS SF->DF float_extend splitter [PR113133]

2023-12-29 Thread Uros Bizjak

The post-reload splitter currently allows xmm16+ registers with TARGET_EVEX512. The splitter changes SFmode of the output operand to V4SFmode, but the vector mode is currently unsupported in xmm16+ without TARGET_AVX512VL. lowpart_subreg returns NULL_RTX in this case and the compilation fails with

Re: [x86_PATCH] peephole2 to resolve failure of gcc.target/i386/pr43644-2.c

2023-12-31 Thread Uros Bizjak

On Sun, Dec 31, 2023 at 4:56 PM Roger Sayle wrote: > > > Hi Uros, > > > From: Uros Bizjak > > Sent: 28 December 2023 10:33 > > On Fri, Dec 22, 2023 at 11:14 AM Roger Sayle > > wrote: > > > > > > This patch resolves the failure of pr43644-2

Patch ping: Fix for PR 112560

2024-01-02 Thread Uros Bizjak

Hello! I have sent an explanation on ICE in try_combine on pr112494.c [1], and an argument that explains why we can safely ignore non-COMPARISON_P mode changes [2]. Can we proceed with the proposed solution? [1] https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638726.html [2] https://gcc.g

Re: [x86 PATCH] PR target/113231: Improved costs in Scalar-To-Vector (STV) pass.

2024-01-07 Thread Uros Bizjak

On Sat, Jan 6, 2024 at 2:30 PM Roger Sayle wrote: > > > This patch improves the cost/gain calculation used during the i386 backend's > SImode/DImode scalar-to-vector (STV) conversion pass. The current code > handles loads and stores, but doesn't consider that converting other > scalar operations

Re: [PATCH] Clarify -mmovbe documentation

2024-01-08 Thread Uros Bizjak

On Mon, Jan 8, 2024 at 10:56 AM Richard Biener wrote: > > It was noticed that -mmovbe doesn't use movbe for __builtin_bswap{32,64} > when not optimizing. The following adjusts the documentation to > say it will be used for optimizing and applies to all byte swaps, > not just those carried out via

[PATCH] match.pd: Convert {I, X}OR of two values ANDed with alien CSTs to PLUS [PR108477]

2024-01-08 Thread Uros Bizjak

Instead of converting XOR or PLUS of two values, ANDed with two constants that have no bits in common, to IOR expression, convert IOR or XOR of said two ANDed values to PLUS expression. If we consider the following testcase: --cut here-- unsigned int foo (unsigned int a, unsigned int b) { unsig
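The identity this relies on can be sketched in C as follows. This is not the testcase from the message, which is cut off above; the masks and function names are hypothetical and chosen only so that the two constants share no set bits. Under that condition IOR, XOR and PLUS of the two masked values all produce the same result.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constants for illustration; they have no bits in common. */
#define MASK_A 0xff00u
#define MASK_B 0x00ffu

/* The three ways of combining the two masked values. */
uint32_t combine_ior(uint32_t a, uint32_t b)
{
    return (a & MASK_A) | (b & MASK_B);
}

uint32_t combine_xor(uint32_t a, uint32_t b)
{
    return (a & MASK_A) ^ (b & MASK_B);
}

uint32_t combine_plus(uint32_t a, uint32_t b)
{
    /* No bit position is set in both operands, so no carry occurs. */
    return (a & MASK_A) + (b & MASK_B);
}

/* Returns 1 when all three forms agree for the given inputs. */
int forms_agree(uint32_t a, uint32_t b)
{
    assert((MASK_A & MASK_B) == 0);   /* precondition of the identity */
    uint32_t r = combine_ior(a, b);
    return r == combine_xor(a, b) && r == combine_plus(a, b);
}
```

If the masks overlapped, the PLUS form could carry into a higher bit and the three results would no longer be guaranteed to match.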

Re: [PATCH] match.pd: Convert {I, X}OR of two values ANDed with alien CSTs to PLUS [PR108477]

2024-01-08 Thread Uros Bizjak

On Mon, Jan 8, 2024 at 5:57 PM Andrew Pinski wrote: > > On Mon, Jan 8, 2024 at 6:44 AM Uros Bizjak wrote: > > > > Instead of converting XOR or PLUS of two values, ANDed with two constants > > that > > have no bits in common, to IOR expression, convert IOR or XOR o

Re: [PATCH] match.pd: Convert {I, X}OR of two values ANDed with alien CSTs to PLUS [PR108477]

2024-01-09 Thread Uros Bizjak

On Tue, Jan 9, 2024 at 9:58 AM Richard Biener wrote: > > On Mon, 8 Jan 2024, Uros Bizjak wrote: > > > On Mon, Jan 8, 2024 at 5:57 PM Andrew Pinski wrote: > > > > > > On Mon, Jan 8, 2024 at 6:44 AM Uros Bizjak wrote: > > > > > > > > Instea

Re: [PATCH] match.pd: Convert {I, X}OR of two values ANDed with alien CSTs to PLUS [PR108477]

2024-01-09 Thread Uros Bizjak

On Tue, Jan 9, 2024 at 10:44 AM Richard Biener wrote: > > On Tue, 9 Jan 2024, Uros Bizjak wrote: > > > On Tue, Jan 9, 2024 at 9:58 AM Richard Biener wrote: > > > > > > On Mon, 8 Jan 2024, Uros Bizjak wrote: > > > > > > > On Mon, Jan 8, 2024 a
