[PATCH, testsuite]: Use -O0 for gcc.dg/plugin/poly-int-07_plugin.c ...

2018-08-01 Thread Uros Bizjak
... as is the case with all other gcc.dg/plugin/poly-int-0{1,2,3,4,5,6}_plugin.c testcases. This lowers testcase wall time from 4 min 45 sec to 1 min 17 sec on a slow target. 2018-08-01 Uros Bizjak * gcc.dg/plugin/poly-int-07_plugin.c (dg-options): Use -O0. Tested on alphaev68-linux-gnu

Re: [PATCH] i386: Always set cfun->machine->max_used_stack_alignment

2018-08-04 Thread Uros Bizjak
On Fri, Aug 3, 2018 at 12:55 AM, H.J. Lu wrote: > We should always set cfun->machine->max_used_stack_alignment if the > maximum stack slot alignment may be greater than 64 bits. > > Tested on i686 and x86-64. OK for master and backport for GCC 8? Can you explain why 64 bits, and what this value

Re: [PATCH] i386: Always set cfun->machine->max_used_stack_alignment

2018-08-04 Thread Uros Bizjak
On Sat, Aug 4, 2018 at 3:59 PM, H.J. Lu wrote: > On Sat, Aug 4, 2018 at 3:42 AM, Uros Bizjak wrote: >> On Fri, Aug 3, 2018 at 12:55 AM, H.J. Lu wrote: >>> We should always set cfun->machine->max_used_stack_alignment if the >>> maximum stack slot alig

Re: [PATCH] i386: Always set cfun->machine->max_used_stack_alignment

2018-08-04 Thread Uros Bizjak
On Sat, Aug 4, 2018 at 9:49 PM, H.J. Lu wrote: > On Sat, Aug 4, 2018 at 12:09 PM, Uros Bizjak wrote: >> On Sat, Aug 4, 2018 at 3:59 PM, H.J. Lu wrote: >>> On Sat, Aug 4, 2018 at 3:42 AM, Uros Bizjak wrote: >>>> On Fri, Aug 3, 2018 at 12:55 AM, H.J. Lu wrote:

Re: [PATCH] i386: Always set cfun->machine->max_used_stack_alignment

2018-08-05 Thread Uros Bizjak
On Sun, Aug 5, 2018 at 12:48 AM, H.J. Lu wrote: > On Sat, Aug 04, 2018 at 11:48:15PM +0200, Uros Bizjak wrote: >> On Sat, Aug 4, 2018 at 9:49 PM, H.J. Lu wrote: >> > On Sat, Aug 4, 2018 at 12:09 PM, Uros Bizjak wrote: >> >> On Sat, Aug 4, 2018 at 3:59 PM, H.J. L

[PATCH, testsuite]: Fix g++.dg/torture/pr86763.C link failure for glibc < 2.17

2018-08-06 Thread Uros Bizjak
2018-08-06 Uros Bizjak * g++.dg/torture/pr86763.C (dg-additional-options): Add -lrt. Tested on CentOS 5.10 and Fedora 28. OK for mainline? Uros. Index: g++.dg/torture/pr86763.C === --- g++.dg/torture/pr86763.C(revision

Re: [PATCH, testsuite]: Fix g++.dg/torture/pr86763.C link failure for glibc < 2.17

2018-08-06 Thread Uros Bizjak
On Mon, Aug 6, 2018 at 5:23 PM, Jeff Law wrote: > On 08/06/2018 09:10 AM, Uros Bizjak wrote: >> 2018-08-06 Uros Bizjak >> >> * g++.dg/torture/pr86763.C (dg-additional-options): Add -lrt. >> >> Tested on CentOS 5.10 and Fedora 28. >> >> OK f

Re: [PATCH, testsuite]: Fix g++.dg/torture/pr86763.C link failure for glibc < 2.17

2018-08-06 Thread Uros Bizjak
On Mon, Aug 6, 2018 at 5:44 PM, Jeff Law wrote: > On 08/06/2018 09:33 AM, Uros Bizjak wrote: >> On Mon, Aug 6, 2018 at 5:23 PM, Jeff Law wrote: >>> On 08/06/2018 09:10 AM, Uros Bizjak wrote: >>>> 2018-08-06 Uros Bizjak >>>> >>>> * g++.

Re: [PATCH] i386: do not use SImode mul-highpart on 64-bit

2018-08-09 Thread Uros Bizjak
On Thu, Aug 9, 2018 at 5:00 PM, Alexander Monakov wrote: > Hello, > > on x86-64, 32-bit division by constants uses mulsi3_highpart pattern that > turns into 'mull ' instruction with source implicitly in eax and > result in edx:eax. However, using 64-bit multiplication with zero-extended > source
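
As background for the message above, here is a minimal sketch of how an unsigned 32-bit division by a constant can be turned into a multiplication. It is an illustration of the general technique only, not code from the patch: the function names are hypothetical, the divisor 3 is an arbitrary choice, and 0xAAAAAAAB is the usual magic constant ceil(2^33 / 3).

```c
#include <stdint.h>

/* Divide by 3 using one 64-bit multiplication of the zero-extended
   input, followed by a single right shift by 33. */
uint32_t div3_mul64(uint32_t x)
{
    return (uint32_t)(((uint64_t)x * 0xAAAAAAABu) >> 33);
}

/* The same result written as "take the high 32 bits of the 32x32
   product, then shift right by 1", which mirrors what a
   multiply-highpart pattern computes. */
uint32_t div3_highpart(uint32_t x)
{
    uint32_t hi = (uint32_t)(((uint64_t)x * 0xAAAAAAABu) >> 32);
    return hi >> 1;
}
```

Both helpers return the same quotient for every 32-bit input; they differ only in whether the high part is extracted explicitly or folded into one wider shift.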

[RFC PATCH, i386]: Deprecate -mmitigate-rop

2018-08-10 Thread Uros Bizjak
This option is fairly ineffective, and in the light of CET, nobody seems interested to improve it. Deprecate the option, so it won't lure developers to the land of false security. 2018-08-10 Uros Bizjak * config/i386/i386.opt (mmitigate-rop): Mark as deprecated. * doc/invoke

Re: [PATCH][x86] Match movss and movsd "blend" instructions

2018-08-12 Thread Uros Bizjak
On Sat, Aug 11, 2018 at 11:54 AM, Allan Sandfeld Jensen wrote: > On Samstag, 11. August 2018 11:18:39 CEST Jakub Jelinek wrote: >> On Sat, Aug 11, 2018 at 10:59:26AM +0200, Allan Sandfeld Jensen wrote: >> > +/* A subroutine of ix86_expand_vec_perm_builtin_1. Try to implement D >> > + using movs

Re: [PATCH][x86] Match movss and movsd "blend" instructions

2018-08-15 Thread Uros Bizjak
>>>> + for (i = 1; i < nelt; ++i) { >>>> +{ >>>> + if (d->perm[i] != i + nelt - d->perm[0]) >>>> +return false; >>>> +} >>>> + } >>> >>> Extraneous {}s (both pairs, the

Re: [RFC PATCH, i386]: Deprecate -mmitigate-rop

2018-08-15 Thread Uros Bizjak
On Wed, Aug 15, 2018 at 5:56 AM, Jeff Law wrote: > On 08/10/2018 05:42 AM, Uros Bizjak wrote: >> This option is fairly ineffective, and in the light of CET, nobody >> seems interested to improve it. Deprecate the option, so it won't lure >> developers to the land of fal

[PATCH, testsuite]: Loosen scan-assembler strings in gcc.target/i386/avx{,2}-cvt-2.c

2018-08-16 Thread Uros Bizjak
Hello! These instructions can take memory operands and current scan-assembler strings were too tight to accept them. 2018-08-16 Uros Bizjak PR testsuite/86745 * gcc.target/i386/avx-cvt-2.c: Loosen scan-assembler strings. * gcc.target/i386/avx2-cvt-2.c: Ditto. Tested on x86_64

Re: [RFC][PATCH][mid-end] Optimize immediate choice in comparisons.

2018-08-18 Thread Uros Bizjak
Hello! >> gcc/testsuite/ >> Changelog for gcc/testsuite/Changelog >> 2018-08-14 Vlad Lazar >> >> * gcc.target/aarch64/imm_choice_comparison.c: New. >> >> gcc/ >> Changelog for gcc/Changelog >> 2018-08-14 Vlad Lazar >> * expmed.h (canonicalize_comparison): New declaration. >> * ex

[PATCH, i386]: Fix PR 86994, gcc.target/i386/20040112-1.c FAILs

2018-08-19 Thread Uros Bizjak
hat can be stuffed into immediate field of an insn gets cost 0, and everything else gets cost 1. This is not entirely correct, considering how return of 0 is treated, but it is a minimum change that gets the job done and doesn't regress the testsuite. If needed, we'll eventually refine it

Re: [PATCH] x86: Always update EH return address in word_mode

2018-08-20 Thread Uros Bizjak
On Mon, Aug 20, 2018 at 8:19 PM, H.J. Lu wrote: > On x86, return address is always popped in word_mode. eh_return needs > to put EH return address in word_mode on stack. > > Tested on x86-64 with x32. OK for trunk and release branches? OK. Perhaps the testcase should go into g++.dg/torture, sinc

[PATCH]: Remove remaining traces of MPX bounded pointers

2018-08-24 Thread Uros Bizjak
2018-08-23 Uros Bizjak * emit-rtl.c (init_emit_once): Do not emit MODE_POINTER_BOUNDS RTXes. * emit-rtl.h (rtl_data): Remove return_bnd. * explow.c (trunc_int_for_mode): Do not handle POINTER_BOUNDS_MODE_P. * function.c (diddle_return_value): Do not handle crtl->return_

Re: [PATCH 1/1] Move AESNI generation to Skylake and Goldmont

2018-08-29 Thread Uros Bizjak
On Thu, Aug 30, 2018 at 7:14 AM, Thiago Macieira wrote: > The instruction set first appeared with Westmere, but not all processors > in that and the next few generations have the instructions. According to > Wikipedia[1], the first generation in which all SKUs have AES > instructions are Skylake a

Re: [PATCH] Fix up -mno-xsave handling (PR target/87198)

2018-09-04 Thread Uros Bizjak
On Tue, Sep 4, 2018 at 4:28 PM, Jakub Jelinek wrote: > Hi! > > The -mxsave{opt,s,c} options turn on automatically -mxsave option and > the patterns rely on TARGET_XSAVE{OPT,S,C} implying TARGET_XSAVE, > but if somebody uses e.g. -mxsave{opt,s,c} -mno-xsave (or something that > implies > -mno-xsav

[PATCH, i386]: Fix PR81015, Bad codegen for __builtin_clz(unsigned short)

2017-06-08 Thread Uros Bizjak
Hello! Attached patch removes invalid substitution of zero-extended HImode operands with HImode operation. CLZ returns different value when operating on SImode value vs. HImode value. 2017-06-08 Uros Bizjak PR target/81015 Revert: 2016-12-14 Uros Bizjak PR target/59874
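
The point about CLZ and operand width can be illustrated with a small sketch. It assumes a GCC-compatible compiler with a 32-bit `unsigned int`; the helper names are hypothetical and are not taken from the patch or the PR testcase.

```c
#include <stdint.h>

/* Leading-zero count of a 16-bit value after zero extension to 32 bits.
   x must be nonzero, because __builtin_clz(0) is undefined. */
int clz32_of_zext(uint16_t x)
{
    return __builtin_clz((uint32_t)x);
}

/* Leading-zero count within the 16-bit field itself: the zero-extended
   operand contributes 16 extra leading zeros, which are subtracted. */
int clz16(uint16_t x)
{
    return __builtin_clz((uint32_t)x) - 16;
}
```

For x = 1 the 32-bit count is 31 while the 16-bit count is 15, so a 16-bit operation cannot simply stand in for the 32-bit one on a zero-extended operand.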

Re: [PATCH][X86] Fix rounding pattern similar to PR73350

2017-06-14 Thread Uros Bizjak
On Tue, Jun 13, 2017 at 1:37 PM, Koval, Julia wrote: > Thank you for your help. I fixed the test similar to existing sigaction tests. > > gcc/ > * config/i386/i386.c: Fix rounding expand for new pattern. > * config/i386/subst.md: Fix pattern (parallel -> unspec). > gcc/testsuite/ >

Re: [PATCH][X86] Fix rounding pattern similar to PR73350

2017-06-16 Thread Uros Bizjak
On Fri, Jun 16, 2017 at 8:46 AM, Koval, Julia wrote: > Hi, > > This test hangs on avx512er, maybe that's why: >> According to POSIX, the behavior of a process is undefined after it ignores >> a SIGFPE, SIGILL, or SIGSEGV signal that was not generated by kill(2) or >> raise(3). > > And volatile m

Re: [PATCH 1/2] i386: Consider Kaby Lake to be equivalent to Skylake

2017-06-18 Thread Uros Bizjak
On Fri, Jun 16, 2017 at 11:42 PM, Matt Turner wrote: > Currently -march=native selects -march=broadwell on Kaby Lake systems, > since its model numbers are missing from the switch statement. It falls > back to the default case and chooses -march=broadwell because of the > presence of the ADX instr

Re: [PATCH 2/2] i386: Assume Skylake for unknown models with clflushopt

2017-06-18 Thread Uros Bizjak
On Fri, Jun 16, 2017 at 11:42 PM, Matt Turner wrote: > gcc/ > * config/i386/driver-i386.c (host_detect_local_cpu): Assume > skylake for unknown models with clflushopt. Also OK. Thanks, Uros. > --- > gcc/config/i386/driver-i386.c | 3 +++ > 1 file changed, 3 insertions(+) > > di

Re: [PATCH] Fix x86 ICE with -mtune=amdfam10 -mno-sse2 (PR target/81121)

2017-06-19 Thread Uros Bizjak
On Mon, Jun 19, 2017 at 5:37 PM, Jakub Jelinek wrote: > Hi! > > This testcase started to ICE when PR70873 fix changed the splitter: > @@ -5153,11 +5147,11 @@ > ;; slots when !TARGET_INTER_UNIT_MOVES_TO_VEC disables the general_regs > ;; alternative in sse2_loadld. > (define_split > - [(set (ma

Re: RFC: stack/heap collision vulnerability and mitigation with GCC

2017-06-20 Thread Uros Bizjak
On Mon, Jun 19, 2017 at 7:51 PM, Jakub Jelinek wrote: > On Mon, Jun 19, 2017 at 11:45:13AM -0600, Jeff Law wrote: >> On 06/19/2017 11:29 AM, Jakub Jelinek wrote: >> > >> > Also, on i?86 orq $0, (%rsp) or orl $0, (%esp) is used to probe stack, >> > while it is shorter, is it actually faster or as s

Re: RFC: stack/heap collision vulnerability and mitigation with GCC

2017-06-20 Thread Uros Bizjak
On Tue, Jun 20, 2017 at 12:18 PM, Richard Biener wrote: > On Tue, Jun 20, 2017 at 10:03 AM, Uros Bizjak wrote: >> On Mon, Jun 19, 2017 at 7:51 PM, Jakub Jelinek wrote: >>> On Mon, Jun 19, 2017 at 11:45:13AM -0600, Jeff Law wrote: >>>> On 06/19/2017

Re: RFC: stack/heap collision vulnerability and mitigation with GCC

2017-06-20 Thread Uros Bizjak
On Tue, Jun 20, 2017 at 2:13 PM, Florian Weimer wrote: > On 06/20/2017 01:10 PM, Uros Bizjak wrote: > >> 74,99% a.out a.out [.] test_or >> 12,50% a.out a.out [.] test_movb >> 12,50% a.out a.out [.] test_movl > > Could

Re: RFC: stack/heap collision vulnerability and mitigation with GCC

2017-06-20 Thread Uros Bizjak
On Tue, Jun 20, 2017 at 2:17 PM, Uros Bizjak wrote: > On Tue, Jun 20, 2017 at 2:13 PM, Florian Weimer wrote: >> On 06/20/2017 01:10 PM, Uros Bizjak wrote: >> >>> 74,99% a.out a.out [.] test_or >>> 12,50% a.out a.out [.] test_

[PATCH, alpha]: Update libstdc++ baseline_symbols.txt

2017-06-20 Thread Uros Bizjak
2017-06-20 Uros Bizjak * config/abi/post/alpha-linux-gnu/baseline_symbols.txt: Update. Tested on alphaev68-linux-gnu and committed to mainline SVN. Uros. Index: config/abi/post/alpha-linux-gnu/baseline_symbols.txt

[PATCH, testsuite]: Fix gcc.target/i386/pr80732.c execution test failure

2017-06-20 Thread Uros Bizjak
2017-06-20 Uros Bizjak * gcc.target/i386/pr80732.c: Include fma4-check.h. (main): Renamed to ... (fma4_test): ... this. Tested on x86_64-linux-gnu and committed to mainline SVN. Uros. Index: gcc.target/i386/pr80732.c

[PATCH, alpha, go]: Introduce applyRelocationsALPHA

2017-06-20 Thread Uros Bizjak
This patch introduces applyRelocationsALPHA to solve: FAIL: TestCgoConsistentResults FAIL: TestCgoPkgConfig FAIL: TestCgoHandlesWlORIGIN gotools errors. Bootstrapped and regression tested on alphaev68-linux-gnu. Uros. Index: go/debug/elf/file.go ==

Re: [i386] __builtin_ia32_stmxcsr could be pure

2017-06-21 Thread Uros Bizjak
Hello! > glibc marks fegetround as a pure function. On x86, people tend to use > _MM_GET_ROUNDING_MODE instead, which could benefit from the same. I think it > is safe, but > a second opinion would be welcome. I could have handled just this builtin, > but it seemed better to > provide def_builti

Re: [PATCH] Fix -Wmaybe-uninitialized warning on sse.md (PR target/81151)

2017-06-21 Thread Uros Bizjak
On Wed, Jun 21, 2017 at 8:27 PM, Jakub Jelinek wrote: > Hi! > > This expander has a gap in between the operands and match_dup indexes, > which results in genemit generating: > operand2 = operands[2]; > (void) operand2; > where operands[2] has not been initialized. > > Fixed thusly, bootstr

Re: [PATCH, alpha, go]: Introduce applyRelocationsALPHA

2017-06-22 Thread Uros Bizjak
On Thu, Jun 22, 2017 at 12:39 AM, Ian Lance Taylor wrote: > On Tue, Jun 20, 2017 at 12:46 PM, Uros Bizjak wrote: >> This patch introduces applyRelocationsALPHA to solve: >> >> FAIL: TestCgoConsistentResults >> FAIL: TestCgoPkgConfig >> FAIL: TestCgoHan

Re: [PATCH] Fix PR81175, make gather builtins pure

2017-06-26 Thread Uros Bizjak
On Fri, Jun 23, 2017 at 3:22 PM, Richard Biener wrote: > On Fri, 23 Jun 2017, Marc Glisse wrote: > >> On Fri, 23 Jun 2017, Richard Biener wrote: >> >> > The vectorizer is confused about the spurious VDEFs that are caused >> > by gather vectorization so the following avoids them by making the >> >

[PATCH, alpha, go]: Remove PtraceRegs definition to restore bootstrap

2017-06-26 Thread Uros Bizjak
Hello! libgo is now able to automatically determine PtraceRegs. Attached patch removes duplicate manual definition from system dependent source. Bootstrapped and regression tested on alphaev68-linux-gnu. Uros. Index: go/syscall/syscall_linux_alpha.go =

Re: [PATCH] Fix PR81175, make gather builtins pure

2017-06-27 Thread Uros Bizjak
On Tue, Jun 27, 2017 at 12:02 PM, Jakub Jelinek wrote: > On Fri, Jun 23, 2017 at 02:54:35PM +0200, Richard Biener wrote: >> 2017-06-23 Richard Biener >> >> PR target/81175 >> * config/i386/i386.c (struct builtin_isa): Add pure_p member. >> (def_builtin2): Initialize pure_p. >>

Re: [PATCH] Fold (a > 0 ? 1.0 : -1.0) into copysign (1.0, a) and a * copysign (1.0, a) into abs(a)

2017-06-28 Thread Uros Bizjak
On Wed, Jun 28, 2017 at 9:37 AM, Jakub Jelinek wrote: > On Tue, Jun 27, 2017 at 10:52:47AM -0700, Andrew Pinski wrote: >> On Tue, Jun 27, 2017 at 7:56 AM, Richard Biener >> wrote: >> > On June 27, 2017 4:52:28 PM GMT+02:00, Tamar Christina >> > wrote: >> >>> >> +(for cmp (gt ge lt le) >> >>> >>
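
The two folds named in the subject line can be checked with a short sketch. The helper names are hypothetical, and `__builtin_copysign` is used as the GCC builtin form of `copysign`. The select form and the copysign form agree for nonzero finite inputs but differ at zero (and for NaN), so such a fold is presumably only valid when the compiler is allowed to ignore those cases.

```c
/* Select form: 1.0 for positive inputs, -1.0 otherwise. */
double sign_select(double a)
{
    return a > 0 ? 1.0 : -1.0;
}

/* copysign form: 1.0 carrying the sign bit of a. */
double sign_copysign(double a)
{
    return __builtin_copysign(1.0, a);
}

/* a * copysign(1.0, a) flips negative inputs and keeps positive ones,
   which matches fabs(a) for finite nonzero a. */
double abs_via_copysign(double a)
{
    return a * __builtin_copysign(1.0, a);
}
```

Note that `sign_select(0.0)` yields -1.0 while `sign_copysign(0.0)` yields 1.0, which is the kind of corner case a compiler has to rule out before applying the transformation.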

Re: [PATCH][x86] Add permutex[var]_epi[32,64] intrinsics

2017-06-28 Thread Uros Bizjak
On Wed, Jun 28, 2017 at 12:01 PM, Peryt, Sebastian wrote: > Hi, > > This patch adds missing intrinsics: > - _mm256_permutexvar_epi32 > - _mm256_permutex_epi64 > - _mm256_permutexvar_epi64 > > gcc/ > * config/i386/avx512vlintrin.h (_mm256_permutexvar_epi64, > _mm256

Re: gotools patch committed: Test runtime, misc/cgo/{test,testcarchive}

2017-06-29 Thread Uros Bizjak
Hello! > This patch to the gotools Makefile adds tests to `make check`. We now > test the runtime package using the newly built go tool, and test that > cgo works by running the misc/cgo/test and misc/cgo/testcarchive > tests. Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu. > Committed

Re: Patch ping (Re: [PATCH] Fix PR81175, make gather builtins pure)

2017-07-04 Thread Uros Bizjak
On Tue, Jul 4, 2017 at 10:35 AM, Jakub Jelinek wrote: > Hi! > > On Tue, Jun 27, 2017 at 12:27:25PM +0200, Jakub Jelinek wrote: >> Fixed thusly, ok for trunk? Perhaps we should add another testcase to check >> similarly gatherpf builtin without the lhs, but we'd need different options. > > I'd lik

[PATCH, i386] Fix PR 81294, _subborrow_u64 argument order inconsistent with intrinsic reference

2017-07-04 Thread Uros Bizjak
Hello! Apparently, Intel changed operand order with the new intrinsic reference release version. Attached patch updates gcc intrinsic headers accordingly. 2017-07-04 Uros Bizjak PR target/81294 * config/i386/adxintrin.h (_subborrow_u32): Swap _X and _Y arguments in the call to
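
Operand order matters for subtract-with-borrow in a way it does not for add-with-carry. The following portable sketch models the operation; `sub_with_borrow` is a hypothetical helper written for illustration, not the intrinsic itself, and it assumes `borrow_in` is 0 or 1.

```c
#include <stdint.h>

/* Computes a - b - borrow_in, stores the low 64 bits in *diff and
   returns the borrow out (0 or 1). */
unsigned char sub_with_borrow(unsigned char borrow_in,
                              uint64_t a, uint64_t b, uint64_t *diff)
{
    uint64_t t = a - b;                      /* wraps modulo 2^64 */
    unsigned char borrow1 = a < b;           /* borrow from a - b */
    uint64_t r = t - borrow_in;
    unsigned char borrow2 = t < borrow_in;   /* borrow from the carry-in */
    *diff = r;
    return (unsigned char)(borrow1 | borrow2);
}
```

Swapping `a` and `b` changes both the difference and the borrow flag, which is why a header that passes the arguments in the wrong order produces visibly different results.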

[PATCH, i386]: Fix PR 81300, -fpeephole2 breaks __builtin_ia32_sbb_u64, _subborrow_u64 on AMD64

2017-07-04 Thread Uros Bizjak
Hello! Attached patch tightens peephole2 condition to prevent unwanted flags_reg clobbering by insn patterns, emitted by ix86_expand_clear. 2017-07-04 Uros Bizjak PR target/81300 * config/i386/i386.md (setcc + movzbl/and to xor + setcc peepholes): Require dead FLAGS_REG at the

[PATCH, i386]: Better fix for PR 80425

2017-11-07 Thread Uros Bizjak
Hello! New register allocator alternative decorations allow us to not penalize alternatives *unless* reload is required. The '$' is described as: '$' This constraint is analogous to '!' but it disparages severely the alternative only if the operand with the '$' needs a reload. and fit

Re: [PATCH] Fix up predicates for commutative vector comparison (PR target/82855)

2017-11-08 Thread Uros Bizjak
> The issues fixed by the previous patch together with this one result > in the testcase from the PR with -mtune=intel (for some reason with > generic tuning we decide to perform the 256-bit load as 2 128-bit loads and > don't merge that into 256-bit comparison operand, shall we change that?) > to

Re: [x86][patch] Add -march=cannonlake.

2017-11-08 Thread Uros Bizjak
On Wed, Nov 8, 2017 at 9:02 AM, Koval, Julia wrote: > Attachment got lost. > >> -Original Message- >> From: Koval, Julia >> Sent: Wednesday, November 08, 2017 9:01 AM >> To: 'GCC Patches' >> Cc: 'Uros Bizjak' ; 'Kirill Yu

Re: [PATCH] Add option to force indirect calls for x86

2017-11-08 Thread Uros Bizjak
> gcc/: > 2017-11-08 Andi Kleen > > * config/i386/i386.opt: Add -mforce-indirect-call. > * config/i386/predicates.md: Check for flag_force_indirect_call. > * doc/invoke.texi: Document -mforce-indirect-call > > gcc/testsuite/: > 2017-11-08 Andi Kleen > > * gcc.target/i386/force-indirect-call-1

[PATCH, testsuite]: Fix gcc.target/i386/force-indirect-call-?.c fallout

2017-11-10 Thread Uros Bizjak
2017-11-10 Uros Bizjak * gcc.target/i386/force-indirect-call-1.c: Merge scan strings. * gcc.target/i386/force-indirect-call-2.c: Ditto. Require fpic effective target. * gcc.target/i386/force-indirect-call-3.c: Ditto. Require lp64 effective target. Tested on x86_64-linux

Re: [x86][patch] Add -march=cannonlake.

2017-11-12 Thread Uros Bizjak
On Sat, Nov 11, 2017 at 10:10 PM, Koval, Julia wrote: > Hi Uros, > I fixed comments. > Btw, I haven't found skylake-avx512 in driver-i386.c at all. Is it intended > or should I add it? It looks like an oversight to me. If there is no "skylake-avx512" model, then the driver goes through "This is

Re: [patch][x86] -march=icelake

2017-11-12 Thread Uros Bizjak
On Sun, Nov 12, 2017 at 1:04 AM, Koval, Julia wrote: > Hi, this patch adds new option -march=icelake. Isasets defined in: > https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf > I didn't add arch code to driver-i386.c, bec

Re: [x86][patch] Add -march=cannonlake.

2017-11-13 Thread Uros Bizjak
On Mon, Nov 13, 2017 at 11:29 AM, Koval, Julia wrote: > Hi, here is followup patch to add skylake-avx512. > gcc/ > * config/i386/driver-i386.c (host_detect_local_cpu): Detect > skylake-avx512. OK. Thanks, Uros.

Re: [PATCH, i386] Refactor -mprefer-avx[128|256] options into common -mprefer-vector-width=[none|128|256|512]

2017-11-13 Thread Uros Bizjak
On Mon, Nov 13, 2017 at 6:25 PM, Shalnov, Sergey wrote: > Hi, > Modern architectures provide wider and wider vector registers. This patch > implements > common (in i386 arch) option to prefer vector register width for the > vectorizer. > Currently, GCC has "-mprefer-avx128" and "-mprefer-avx256

Re: [PATCH, i386] Refactor -mprefer-avx[128|256] options into common -mprefer-vector-width=[none|128|256|512]

2017-11-13 Thread Uros Bizjak
On Mon, Nov 13, 2017 at 9:13 PM, Uros Bizjak wrote: > On Mon, Nov 13, 2017 at 6:25 PM, Shalnov, Sergey > wrote: >> Hi, >> Modern architectures provide wider and wider vector registers. This patch >> implements >> common (in i386 arch) option to prefer

Re: [PATCH, i386] Refactor -mprefer-avx[128|256] options into common -mprefer-vector-width=[none|128|256|512]

2017-11-13 Thread Uros Bizjak
On Tue, Nov 14, 2017 at 12:14 AM, Joseph Myers wrote: > On Mon, 13 Nov 2017, Uros Bizjak wrote: > >> [BTW: --mprefer-avx128 should be marked RejectNegative from the >> beginning; let's just assume nobody uses it in its (somehow weird) >> negative "-mno-prefer-av

Re: [PATCH][i386] PR82941/PR82942 - Adding vzeroupper generation for SKX

2017-11-14 Thread Uros Bizjak
; Cc: Jakub Jelinek ; gcc-patches@gcc.gnu.org; Uros Bizjak >> ; Kirill Yukhin ; Lu, Hongjiu >> >> Subject: Re: [PATCH][i386] PR82941/PR82942 - Adding vzeroupper generation >> for SKX >> >> On Tue, Nov 14, 2017 at 3:18 AM, Peryt, Sebastian >> wrote: >

Re: [PATCH] i386: Update the default -mzeroupper setting

2017-11-15 Thread Uros Bizjak
On Wed, Nov 15, 2017 at 2:37 PM, H.J. Lu wrote: > -mzeroupper is specified to generate vzeroupper instruction. If it > isn't used, the default should depend on !TARGET_AVX512ER. Users can > always use -mzeroupper or -mno-zeroupper to override it. > > Sebastian, can you run the full test with it?

Re: [PATCH] i386: Update the default -mzeroupper setting

2017-11-15 Thread Uros Bizjak
On Wed, Nov 15, 2017 at 5:59 PM, H.J. Lu wrote: > On Wed, Nov 15, 2017 at 8:09 AM, Uros Bizjak wrote: >> On Wed, Nov 15, 2017 at 2:37 PM, H.J. Lu wrote: >>> -mzeroupper is specified to generate vzeroupper instruction. If it >>> isn't used, the default should de

Re: [PATCH] fix -mnop-mcount generate 5byte nop in 32bit.

2017-11-15 Thread Uros Bizjak
"1:\tnopl 0x01(%%eax,%%eax,1)\n"); /* 5 byte nop. */ Even the above change is not correct, since it will be assembled in a different way on 32 bit and 64 bit targets (size prefix will be added on 64 bit targets). Attached patch fixes this issue by emitting a stream of

Re: [patch][x86] skylake costs

2017-11-17 Thread Uros Bizjak
On Fri, Nov 17, 2017 at 10:18 AM, Koval, Julia wrote: > Hi, this patch introduces separate cost model for skylake-avx512. Ok for > trunk? > > gcc/ > * config/i386/i386.c (processor_target_table): Add skylake_cost for > skylake-avx512. > * config/i386/x86-tune-costs.h (skyl

[PATCH, i386]: Add bswaphi2 insn pattern

2017-11-20 Thread Uros Bizjak
Hello! Attached patch introduces bswaphi2 named insn pattern that results in movbe instruction. Without the patch, the following testcase:

Re: [PATCH, i386]: Add bswaphi2 insn pattern

2017-11-20 Thread Uros Bizjak
movw %si, (%rdi) and with patched compiler: movbe %si, (%rdi) 2017-11-20 Uros Bizjak * config/i386/i386.md (bswaphi2): New expander. (*bswaphi2_movbe): New insn pattern. (bswaphi -> rorhi peephole2): New peephole pattern. testsuite/ChangeLog: 2017-11-20 Ur

Re: Adjust empty class parameter passing ABI (PR c++/60336)

2017-11-21 Thread Uros Bizjak
On Mon, Nov 20, 2017 at 4:51 PM, Marek Polacek wrote: > On Thu, Nov 16, 2017 at 02:20:59PM -0500, Jason Merrill wrote: >> On Thu, Nov 16, 2017 at 12:41 PM, Marek Polacek wrote: >> > On Tue, Nov 14, 2017 at 07:34:54AM +0100, Richard Biener wrote: >> >> On November 14, 2017 6:21:41 AM GMT+01:00, Ja

Re: [PATCH, i386] Refactor -mprefer-avx[128|256] options into common -mprefer-vector-width=[none|128|256|512]

2017-11-21 Thread Uros Bizjak
On Tue, Nov 21, 2017 at 4:50 PM, Shalnov, Sergey wrote: > Uros, > I did new patch with all comments addressed as proposed. > 1. old option -mprefer-avx128 is Alias(mprefer-vector-width=, 128, none) > 2. Simplified default initialization (as Bernhard proposed) > 3. Fixed documentation (proposed by

[PATCH, i386]: Improve movbe insn a bit

2017-11-21 Thread Uros Bizjak
2017-11-21 Uros Bizjak * config/i386/i386.md (*bswap2_movbe): Add integer suffix to movbe mnemonic. (*bswaphi2_movbe): Ditto. (bswaphi_lowpart): Merge with *bswaphi_lowpart_1. testsuite/ChangeLog: 2017-11-21 Uros Bizjak * gcc.target/i386/movbe-1.c: Update scan string

Re: [PATCH, i386] Refactor -mprefer-avx[128|256] options into common -mprefer-vector-width=[none|128|256|512]

2017-11-21 Thread Uros Bizjak
I have committed the attached patch. Uros. On Tue, Nov 21, 2017 at 6:18 PM, Shalnov, Sergey wrote: > Uros, > Yes, please. Thank you for your proposals and comments. > Please commit as you proposed. > Sergey > > -Original Message- > From: Uros Bizjak [mailto:ubiz

Re: [PATCH, i386] Fix behavior for -mprefer-vector-width= option

2017-11-22 Thread Uros Bizjak
On Wed, Nov 22, 2017 at 3:58 PM, Shalnov, Sergey wrote: > Hi, > This patch makes the -mprefer-vector-width= option inclusive. This means that > if we use -mprefer-vector-width=128 it should switch TARGET_PREFER_AVX128=ON > and TARGET_PREFER_AVX256=ON also. > It is a minor change to generate “xmm” with

[PATCH, testsuite]: Remove bswap16, bswap32 and bswap64 effective targets

2017-11-22 Thread Uros Bizjak
These are the same as bswap effective target. Also, all listed targets are capable of int32plus. 2017-11-22 Uros Bizjak * lib/target-supports.exp (check_effective_target_bswap16): Remove (check_effective_target_bswap32): Ditto. (check_effective_target_bswap64): Ditto. * gcc.dg

Re: [i386] PR83109 [CET] improper code generation for builtin_longjmp with -fcf-protection -mcet

2017-11-27 Thread Uros Bizjak
On Sun, Nov 26, 2017 at 10:56 PM, Tsimbalist, Igor V wrote: > According to the description of inssp instruction from Intel CET it adjusts > the shadow stack pointer (ssp) only by value in the range of [0..255]. As a > number of adjustment could be greater than 255 there should be a loop > generate
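
The need for a loop follows directly from the [0..255] limit quoted above. A minimal model in plain C, assuming nothing about the actual patch or the generated instruction sequence: `adjust_in_steps` is a hypothetical helper that splits a total adjustment into chunks of at most 255 and reports how many steps were needed.

```c
/* Apply a total adjustment in steps of at most 255.  *applied
   accumulates what has been applied so far; the return value is the
   number of individual adjustments issued. */
unsigned adjust_in_steps(unsigned long total, unsigned long *applied)
{
    unsigned steps = 0;

    *applied = 0;
    while (total > 0) {
        unsigned long step = total > 255 ? 255 : total;
        *applied += step;   /* stands in for one bounded adjustment */
        total -= step;
        steps++;
    }
    return steps;
}
```

An adjustment of 1000, for example, takes three full steps of 255 and one final step of 235.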

Re: [PATCH] Fix vec_concatv2di pattern for SSE4 (PR target/80819)

2017-11-29 Thread Uros Bizjak
On Wed, Nov 29, 2017 at 9:24 AM, Jakub Jelinek wrote: > Hi! > > Before r218303 we had just (=x,0,rm) alternative for SSE4 (no AVX), > that change turned it into (=Yr,0,*rm) and (=*x,0,rm) alternatives, > so that we avoid too many prefixes if possible. > The latter alternative is fine, we want the

Re: [Patch][x86, backport] Backport to GCC-7 vzeroupper patches

2017-11-29 Thread Uros Bizjak
On Wed, Nov 29, 2017 at 10:39 AM, Peryt, Sebastian wrote: > Hi, > > I'd like to ask for backporting to GCC-7 branch vzeroupper generation patches > from trunk, > that are resolving 3 PRs: > PR target/82941 > PR target/82942 > PR target/82990 > > Two patches were combined into one and rebased. Boo

Re: [Patch][x86, backport] Backport to GCC-6 vzeroupper patches

2017-11-29 Thread Uros Bizjak
On Wed, Nov 29, 2017 at 10:46 AM, Peryt, Sebastian wrote: > Hi, > > I'd like to ask for backporting to GCC-6 branch vzeroupper generation patches > from trunk, > that are resolving 3 PRs: > PR target/82941 > PR target/82942 > PR target/82990 > > Two patches were combined into one and rebased. Boo

Re: [PATCH, i386] Fix movdi_internal to return MODE_TI with AVX512

2017-11-29 Thread Uros Bizjak
On Wed, Nov 29, 2017 at 12:05 PM, Shalnov, Sergey wrote: > Hi, > I found a wrong MODE_XI used in movdi_internal that causes zmm > generation with "-march=skylake-avx512 -mprefer-vector-width=128" > options set. This patch fixes the mode and register type but keeps using > the AVX512 instruction set. IMO,

Re: [PATCH, i386] Fix movdi_internal to return MODE_TI with AVX512

2017-11-29 Thread Uros Bizjak
On Wed, Nov 29, 2017 at 1:10 PM, Uros Bizjak wrote: > On Wed, Nov 29, 2017 at 12:05 PM, Shalnov, Sergey > wrote: >> Hi, >> I found a wrong MODE_XI used in movdi_internal that causes zmm >> generation with "-march=skylake-avx512 -mprefer-vector-width=128" >>

Re: [PATCH] Fix i?86/x86_64 pre-SSE4.1 rint expansion (PR target/81906)

2017-12-07 Thread Uros Bizjak
On Thu, Dec 7, 2017 at 5:48 PM, Jakub Jelinek wrote: > Hi! > > As mentioned in the PR, the code emitted by ix86_expand_rint > doesn't work with rounding to +/- infinity. > This patch adjusts it if flag_rounding_math to do something that works > well even for that case (should be just one insn long

Re: [x86][patch] Fix clwb for skylake

2017-12-11 Thread Uros Bizjak
On Mon, Dec 11, 2017 at 9:34 AM, Koval, Julia wrote: > Hi Uros, Kirill, > According to isa-extensions doc CLWB appeared first in Skylake-avx512, but it > isn't in the PTA. This patch fixes it. Ok for trunk? Please also include ChangeLog entry in your patch submission. Uros.

Re: [x86][patch] Fix clwb for skylake

2017-12-12 Thread Uros Bizjak
On Tue, Dec 12, 2017 at 10:08 AM, Koval, Julia wrote: > Sorry, > > gcc/ > * config/i386/i386.c (PTA_SKYLAKE_AVX512): Add PTA_CLWB. > (PTA_CANNONLAKE): Remove PTA_CLWB. Approved and committed to mainline SVN. Thanks, Uros.

[PATCH, testsuite]: Avoid -Wformat warnings in guality.h

2017-12-17 Thread Uros Bizjak
ALITY_TEST ": %s is %lli\n", name, value); 2017-12-17 Uros Bizjak * gcc.dg/guality/guality.h (guality_check): Cast %lli arguments in fprintf statements to long long int. Tested on x86_64-linux-gnu {,-m32} and committed to mainline SVN. Uros. Index: gcc.

Re: [patch][x86] -march=icelake

2017-12-19 Thread Uros Bizjak
On Mon, Dec 18, 2017 at 2:42 PM, Koval, Julia wrote: > Hi, I tried to replace 2 flags variable with c++ bitset(in patch attached). > What do you think? Hm, I'm not a c++ person, but I wonder about overhead and performance impact of this change. Maybe [] operator could be used instead of a dynami

Re: [PATCH] Fix some x86 OPTION_MASK_ISA* issues (PR target/83488)

2017-12-20 Thread Uros Bizjak
On Wed, Dec 20, 2017 at 8:37 PM, Jakub Jelinek wrote: > Hi! > > As you know, we ran out of ix86_isa_flags bitmask bits some time ago. > The testcase from the PR's #c0 (which I'm not adding into testsuite, because > it will be useless any time *.opt is modified with any of the > OPTION_MASK_ISA* bi

[PATCH, i386]: Fix PR83467, ICE: in assign_by_spills

2017-12-21 Thread Uros Bizjak
Hello! Attached patch fixes non-BMI2 shift define-and-split instructions that remove unnecessary masking of count operand by adding a register constraints that allows only CX hard register. 2017-12-21 Uros Bizjak PR target/83467 * config/i386/i386.md (*ashl3_mask): Add operand

Re: [PATCH/x86] Move mavx512vnni option from ix86_isa_flags2 to ix86_isa_flags.

2017-12-22 Thread Uros Bizjak
On Fri, Dec 22, 2017 at 11:35 AM, Tsimbalist, Igor V wrote: > This is a follow up patch for pr83488 to fix an error in setting > > OPTION_MASK_ISA_AVX512VNNI_SET and OPTION_MASK_ISA_AVX512F_SET bits. > > There were both set in ix86_isa_flags2 while being defined in > > different ISA sets. Addition

Re: [PATCH] Misc sse.md formatting fixes

2017-12-30 Thread Uros Bizjak
On Thu, Dec 28, 2017 at 10:06 AM, Jakub Jelinek wrote: > Hi! > > I've noticed various formatting issues in the recently added ISA support > patterns. No functional changes, bootstrapped/regtested on x86_64-linux and > i686-linux, ok for trunk? > > OT, wonder why we have any of the maskz and maskz

Re: [PATCH] Fix gcc.target/i386/avx512vpopcntdqvl-vpopcnt*-1.c FAILs

2017-12-30 Thread Uros Bizjak
On Thu, Dec 28, 2017 at 4:18 PM, Jakub Jelinek wrote: > Hi! > > Binutils had vpopcnt[dq] support since ~ January, but only for the 512-bit > instructions, only in ~ October further support for the AVX512VPOPCNTDQ | > AVX512VL instructions has been added. So, if one is using gas in between > those

Re: [PATCH] Fix gcc.target/i386/avx512vpopcntdqvl-vpopcnt*-1.c FAILs

2017-12-30 Thread Uros Bizjak
On Thu, Dec 28, 2017 at 4:18 PM, Jakub Jelinek wrote: > Hi! > > Binutils had vpopcnt[dq] support since ~ January, but only for the 512-bit > instructions, only in ~ October further support for the AVX512VPOPCNTDQ | > AVX512VL instructions has been added. So, if one is using gas in between > those

[PATCH, alpha]: Fix PR 83628, performance regression when accessing arrays on alpha

2018-01-04 Thread Uros Bizjak
n in the generic part of the compiler, since gcc-5 was able to simplify this combination. [1] https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01841.html 2018-01-04 Uros Bizjak PR target/83628 * config/alpha/alpha.md (*sadd): Use ASHIFT instead of MULT rtx. Update all corresponding split

Re: [PATCH] Fix ICE with -mmitigate-rop (PR target/83554)

2018-01-04 Thread Uros Bizjak
recognized because the second alternative is > disabled. > > Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for > trunk? > > 2018-01-04 Jakub Jelinek > Uros Bizjak > > PR target/83554 > * config/i386/i386.md

[PATCH, combine]: Use correct mode for ASHIFT in force_int_to_mode

2018-01-08 Thread Uros Bizjak
runcate to op_mode again, so it all looks like a typo to me. 2018-01-08 Uros Bizjak PR target/83628 * combine.c (force_int_to_mode) : Use mode instead of op_mode in the force_to_mode call. Together with a follow-up target patch, the patch fixes gcc.target/alpha/pr83628-2.c scan-as

[PATCH, alpha]: Improve fix for PR83628

2018-05-25 Thread Uros Bizjak
Hello! Attached patch improves fix for PR83628 by providing ashlsi3 pattern. This allows combiner to remove subregs of inner DImode ashift. 2018-05-25 Uros Bizjak PR target/83628 * config/alpha/alpha.md (ashlsi3): New insn pattern. (*ashlsi_se): Rename from *ashldi_se. Define as

Re: [PATCH] Rename ufloat to floatuns and ufix_trunc to fixuns_trunc in a few patterns (PR target/85918)

2018-05-26 Thread Uros Bizjak
On Fri, May 25, 2018 at 11:09 PM, Jakub Jelinek wrote: > Hi! > > The optab is looking for floatuns2 and > fixuns_trunc2, but some of the patterns are instead called > ufloat2 or ufix_trunc2 > and thus are only used from intrinsics. > > We can't change all spots, in two spots we have intentionally

Re: [PATCH] Introduce VEC_UNPACK_FIX_TRUNC_{LO,HI}_EXPR and VEC_PACK_FLOAT_EXPR, use it in x86 vectorization (PR target/85918)

2018-05-28 Thread Uros Bizjak
On Mon, May 28, 2018 at 11:58 AM, Jakub Jelinek wrote: > Hi! > > AVX512DQ and AVX512DQ/AVX512VL has instructions for vector float <-> > {,unsigned} long long conversions. The following patch adds the missing > tree codes, optabs and expanders to make this possible. > > Bootstrapped/regtested on x

[PATCH, i386]: Fix PR85950, Unsafe-math-optimizations regresses optimization using SSE4.1 roundss

2018-05-29 Thread Uros Bizjak
Hello! Attached patch enables l2 for TARGET_SSE4.1, and while there, also corrects operand 1 predicate of rounds{s,d} instruction. 2018-05-29 Uros Bizjak PR target/85950 * config/i386/i386.md (l2): Enable for TARGET_SSE4_1 and generate rounds{s,d} and cvtts{s,d}2si{,q

Re: [PATCH][x86] Remove duplicated headers includes

2018-05-30 Thread Uros Bizjak
On Wed, May 30, 2018 at 2:44 PM, Peryt, Sebastian wrote: > Hi, > > I have done some cleanup to remove redundant includes of some of > the headers in x86intrin.h. > The removed headers were included in both x86intrin.h and immintrin.h, which is > itself included in x86intrin.h. > > Is it ok for t

[PATCH, i386]: Remove concat_tg_mode mode attribute.

2018-05-31 Thread Uros Bizjak
No functional changes. 2018-05-31 Uros Bizjak * config/i386/sse.md (avx_vec_concat): Substitute concat_tg_mode mode attribute with xtg_mode. (avx512dq_broadcast_1): Ditto. (concat_tg_mode): Remove mode attribute. Bootstrapped and regression tested on x86_64-linux-gnu

[PATCH, i386]: __builtin_cpu_is() is not detecting bdver2 with Model = 0x02

2018-05-31 Thread Uros Bizjak
Hello! As reported in the PR, AMDFAM15H model 0x2 should return AMDFAM15H_BDVER2 subtype. 2018-05-31 Uros Bizjak PR target/85591 * config/i386/cpuinfo.c (get_amd_cpu): Return AMDFAM15H_BDVER2 for AMDFAM15H model 0x2. Bootstrapped and regression tested on x86_64-linux-gnu {,-m32

Re: [PATCH] Optimize AVX512 vpcmpeq* against 0 into vptestnm* rather than vptestm* (PR target/85832, PR target/86036)

2018-06-04 Thread Uros Bizjak
On Mon, Jun 4, 2018 at 3:08 PM, Jakub Jelinek wrote: > Hi! > > On Wed, May 23, 2018 at 08:45:19AM +0200, Jakub Jelinek wrote: >> As mentioned in the PR, vptestm* instructions with the same input operand >> used >> twice perform the same comparison as vpcmpeq* against zero vector, with the >> adva
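
A scalar C model may help show why the subject line names vptestnm* rather than vptestm*. As documented for AVX512, vptestm sets a mask bit when the AND of the two elements is non-zero, while vptestnm sets it when the AND is zero; with the same operand used twice, only the latter matches a compare-equal against zero. The helper names below are hypothetical and this is a sketch of the semantics for up to 32 elements, not the patch itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of vpcmpeq against a zero vector: bit i is set when v[i] == 0. */
unsigned cmpeq_zero_mask(const uint32_t *v, size_t n)
{
    unsigned mask = 0;
    for (size_t i = 0; i < n; i++)
        if (v[i] == 0)
            mask |= 1u << i;
    return mask;
}

/* Model of vptestm: bit i is set when (a[i] & b[i]) != 0. */
unsigned testm_mask(const uint32_t *a, const uint32_t *b, size_t n)
{
    unsigned mask = 0;
    for (size_t i = 0; i < n; i++)
        if ((a[i] & b[i]) != 0)
            mask |= 1u << i;
    return mask;
}

/* Model of vptestnm: bit i is set when (a[i] & b[i]) == 0. */
unsigned testnm_mask(const uint32_t *a, const uint32_t *b, size_t n)
{
    unsigned mask = 0;
    for (size_t i = 0; i < n; i++)
        if ((a[i] & b[i]) == 0)
            mask |= 1u << i;
    return mask;
}

/* Hypothetical self-check on one 4-element vector. */
void vptest_model_check(void)
{
    const uint32_t v[4] = { 0, 5, 0, 7 };
    assert(testnm_mask(v, v, 4) == cmpeq_zero_mask(v, 4));
    assert(testm_mask(v, v, 4) != cmpeq_zero_mask(v, 4));
}
```

With the same vector passed as both operands, `testnm_mask` reproduces `cmpeq_zero_mask` and `testm_mask` yields its complement over the active elements.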

Re: [patch][i386] Tremont -march/-mtune options

2018-06-05 Thread Uros Bizjak
On Mon, Jun 4, 2018 at 1:48 PM, Makhotina, Olga wrote: > > Hi, > > This patch implements Tremont -march/-mtune. > > 2018-06-04 Olga Makhotina > > gcc/ > > * config.gcc: Support "tremont". > * config/i386/driver-i386.c (host_detect_local_cpu): Detect "tremont". > * config

[PATCH, i386]: Fix several "operand missing mode" build warnings in i386.md

2018-06-05 Thread Uros Bizjak
No functional changes. 2018-06-05 Uros Bizjak * config/i386/i386.md (simple_return_indirect_internal): New expander. (*simple_return_indirect_internal): Rename from simple_return_indirect_internal. Use W mode iterator. (rstorssp): New expander. (*rstorssp): Rename from

Re: [PATCH] Fix x86 setcc + movzbl peephole2s (PR target/86314)

2018-06-26 Thread Uros Bizjak
On Tue, Jun 26, 2018 at 12:52 PM, Jakub Jelinek wrote: > Hi! > > These peephole2s assume that when matching insns like: > [(parallel [(set (reg FLAGS_REG) (match_operand 0)) > (match_operand 4)]) > that operands[4] must be a set of a register with operands[0] > as SET_SRC, but that
