... as is the case with all other
gcc.dg/plugin/poly-int-0{1,2,3,4,5,6}_plugin.c testcases. This lowers
testcase wall time from 4 min 45 sec to 1 min 17 sec on a slow target.
2018-08-01 Uros Bizjak
* gcc.dg/plugin/poly-int-07_plugin.c (dg-options): Use -O0.
Tested on alphaev68-linux-gnu
On Fri, Aug 3, 2018 at 12:55 AM, H.J. Lu wrote:
> We should always set cfun->machine->max_used_stack_alignment if the
> maximum stack slot alignment may be greater than 64 bits.
>
> Tested on i686 and x86-64. OK for master and backport for GCC 8?
Can you explain why 64 bits, and what this value
On Sat, Aug 4, 2018 at 3:59 PM, H.J. Lu wrote:
> On Sat, Aug 4, 2018 at 3:42 AM, Uros Bizjak wrote:
>> On Fri, Aug 3, 2018 at 12:55 AM, H.J. Lu wrote:
>>> We should always set cfun->machine->max_used_stack_alignment if the
>>> maximum stack slot alig
On Sat, Aug 4, 2018 at 9:49 PM, H.J. Lu wrote:
> On Sat, Aug 4, 2018 at 12:09 PM, Uros Bizjak wrote:
>> On Sat, Aug 4, 2018 at 3:59 PM, H.J. Lu wrote:
>>> On Sat, Aug 4, 2018 at 3:42 AM, Uros Bizjak wrote:
>>>> On Fri, Aug 3, 2018 at 12:55 AM, H.J. Lu wrote:
On Sun, Aug 5, 2018 at 12:48 AM, H.J. Lu wrote:
> On Sat, Aug 04, 2018 at 11:48:15PM +0200, Uros Bizjak wrote:
>> On Sat, Aug 4, 2018 at 9:49 PM, H.J. Lu wrote:
>> > On Sat, Aug 4, 2018 at 12:09 PM, Uros Bizjak wrote:
>> >> On Sat, Aug 4, 2018 at 3:59 PM, H.J. L
2018-08-06 Uros Bizjak
* g++.dg/torture/pr86763.C (dg-additional-options): Add -lrt.
Tested on CentOS 5.10 and Fedora 28.
OK for mainline?
Uros.
Index: g++.dg/torture/pr86763.C
===
--- g++.dg/torture/pr86763.C(revision
On Mon, Aug 6, 2018 at 5:23 PM, Jeff Law wrote:
> On 08/06/2018 09:10 AM, Uros Bizjak wrote:
>> 2018-08-06 Uros Bizjak
>>
>> * g++.dg/torture/pr86763.C (dg-additional-options): Add -lrt.
>>
>> Tested on CentOS 5.10 and Fedora 28.
>>
>> OK f
On Mon, Aug 6, 2018 at 5:44 PM, Jeff Law wrote:
> On 08/06/2018 09:33 AM, Uros Bizjak wrote:
>> On Mon, Aug 6, 2018 at 5:23 PM, Jeff Law wrote:
>>> On 08/06/2018 09:10 AM, Uros Bizjak wrote:
>>>> 2018-08-06 Uros Bizjak
>>>>
>>>> * g++.
On Thu, Aug 9, 2018 at 5:00 PM, Alexander Monakov wrote:
> Hello,
>
> on x86-64, 32-bit division by constants uses mulsi3_highpart pattern that
> turns into 'mull ' instruction with source implicitly in eax and
> result in edx:eax. However, using 64-bit multiplication with zero-extended
> source
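For readers unfamiliar with the pattern being discussed, unsigned 32-bit division by a constant is commonly done as a widening multiply by a precomputed reciprocal followed by a shift. The following is only a sketch and is not taken from the patch: the function name div3_highpart and the choice of divisor 3 are assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical illustration: x / 3 via a zero-extended 64-bit multiply.
   0xAAAAAAAB is ceil(2^33 / 3), so (x * 0xAAAAAAAB) >> 33 equals x / 3
   for every 32-bit unsigned x.  */
static uint32_t
div3_highpart (uint32_t x)
{
  uint64_t prod = (uint64_t) x * 0xAAAAAAABu;  /* 32x32->64 multiply */
  return (uint32_t) (prod >> 33);              /* high part plus one extra shift */
}
```

Whether the compiler picks a one-operand 'mull' or a 64-bit multiply for code like this depends on the target and tuning.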
This option is fairly ineffective, and in the light of CET, nobody
seems interested to improve it. Deprecate the option, so it won't lure
developers to the land of false security.
2018-08-10 Uros Bizjak
* config/i386/i386.opt (mmitigate-rop): Mark as deprecated.
* doc/invoke
On Sat, Aug 11, 2018 at 11:54 AM, Allan Sandfeld Jensen
wrote:
> On Samstag, 11. August 2018 11:18:39 CEST Jakub Jelinek wrote:
>> On Sat, Aug 11, 2018 at 10:59:26AM +0200, Allan Sandfeld Jensen wrote:
>> > +/* A subroutine of ix86_expand_vec_perm_builtin_1. Try to implement D
>> > + using movs
>>>> + for (i = 1; i < nelt; ++i) {
>>>> +{
>>>> + if (d->perm[i] != i + nelt - d->perm[0])
>>>> +return false;
>>>> +}
>>>> + }
>>>
>>> Extraneous {}s (both pairs, the
On Wed, Aug 15, 2018 at 5:56 AM, Jeff Law wrote:
> On 08/10/2018 05:42 AM, Uros Bizjak wrote:
>> This option is fairly ineffective, and in the light of CET, nobody
>> seems interested to improve it. Deprecate the option, so it won't lure
>> developers to the land of fal
Hello!
These instructions can take memory operands and current scan-assembler
strings were too tight to accept them.
2018-08-16 Uros Bizjak
PR testsuite/86745
* gcc.target/i386/avx-cvt-2.c: Loosen scan-assembler strings.
* gcc.target/i386/avx2-cvt-2.c: Ditto.
Tested on x86_64
Hello!
>> gcc/testsuite/
>> Changelog for gcc/testsuite/Changelog
>> 2018-08-14 Vlad Lazar
>>
>> * gcc.target/aarch64/imm_choice_comparison.c: New.
>>
>> gcc/
>> Changelog for gcc/Changelog
>> 2018-08-14 Vlad Lazar
>> * expmed.h (canonicalize_comparison): New declaration.
>> * ex
hat
can be stuffed into immediate field of an insn gets cost 0, and
everything else gets cost 1. This is not entirely correct, considering
how return of 0 is treated, but it is a minimum change that gets the
job done and doesn't regress the testsuite. If needed, we'll
eventually refine it
On Mon, Aug 20, 2018 at 8:19 PM, H.J. Lu wrote:
> On x86, return address is always popped in word_mode. eh_return needs
> to put EH return address in word_mode on stack.
>
> Tested on x86-64 with x32. OK for trunk and release branches?
OK. Perhaps the testcase should go into g++.dg/torture, sinc
2018-08-23 Uros Bizjak
* emit-rtl.c (init_emit_once): Do not emit MODE_POINTER_BOUNDS RTXes.
* emit-rtl.h (rtl_data): Remove return_bnd.
* explow.c (trunc_int_for_mode): Do not handle POINTER_BOUNDS_MODE_P.
* function.c (diddle_return_value): Do not handle crtl->return_
On Thu, Aug 30, 2018 at 7:14 AM, Thiago Macieira
wrote:
> The instruction set first appeared with Westmere, but not all processors
> in that and the next few generations have the instructions. According to
> Wikipedia[1], the first generation in which all SKUs have AES
> instructions are Skylake a
On Tue, Sep 4, 2018 at 4:28 PM, Jakub Jelinek wrote:
> Hi!
>
> The -mxsave{opt,s,c} options turn on automatically -mxsave option and
> the patterns rely on TARGET_XSAVE{OPT,S,C} implying TARGET_XSAVE,
> but if somebody uses e.g. -mxsave{opt,s,c} -mno-xsave (or something that
> implies
> -mno-xsav
Hello!
Attached patch removes invalid substitution of zero-extended HImode
operands with HImode operation. CLZ returns different value when
operating on SImode value vs. HImode value.
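A small sketch of why the substitution is invalid, assuming a 32-bit unsigned int and using GCC's __builtin_clz (the helper names clz16 and clz32_of_zext are made up for this example): the leading-zero count of a zero-extended 16-bit value in 32 bits is 16 larger than the count taken in 16 bits.

```c
#include <stdint.h>

/* Leading zeros of a nonzero 16-bit value, counted in 16 bits (HImode view).
   Assumes unsigned int is 32 bits wide.  */
static int
clz16 (uint16_t x)
{
  return __builtin_clz ((unsigned int) x) - 16;
}

/* Leading zeros of the same value after zero extension to 32 bits
   (SImode view).  The result is undefined for x == 0.  */
static int
clz32_of_zext (uint16_t x)
{
  return __builtin_clz ((uint32_t) x);
}
```

For x == 1 the two helpers return 15 and 31 respectively, so one operation cannot stand in for the other.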
2017-06-08 Uros Bizjak
PR target/81015
Revert:
2016-12-14 Uros Bizjak
PR target/59874
On Tue, Jun 13, 2017 at 1:37 PM, Koval, Julia wrote:
> Thank you for your help. I fixed the test similar to existing sigaction tests.
>
> gcc/
> * config/i386/i386.c: Fix rounding expand for new pattern.
> * config/i386/subst.md: Fix pattern (parallel -> unspec).
> gcc/testsuite/
>
On Fri, Jun 16, 2017 at 8:46 AM, Koval, Julia wrote:
> Hi,
>
> This test hangs on avx512er, maybe that's why:
>> According to POSIX, the behavior of a process is undefined after it ignores
>> a SIGFPE, SIGILL, or SIGSEGV signal that was not generated by kill(2) or
>> raise(3).
>
> And volatile m
On Fri, Jun 16, 2017 at 11:42 PM, Matt Turner wrote:
> Currently -march=native selects -march=broadwell on Kaby Lake systems,
> since its model numbers are missing from the switch statement. It falls
> back to the default case and chooses -march=broadwell because of the
> presence of the ADX instr
On Fri, Jun 16, 2017 at 11:42 PM, Matt Turner wrote:
> gcc/
> * config/i386/driver-i386.c (host_detect_local_cpu): Assume
> skylake for unknown models with clflushopt.
Also OK.
Thanks,
Uros.
> ---
> gcc/config/i386/driver-i386.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> di
On Mon, Jun 19, 2017 at 5:37 PM, Jakub Jelinek wrote:
> Hi!
>
> This testcase started to ICE when PR70873 fix changed the splitter:
> @@ -5153,11 +5147,11 @@
> ;; slots when !TARGET_INTER_UNIT_MOVES_TO_VEC disables the general_regs
> ;; alternative in sse2_loadld.
> (define_split
> - [(set (ma
On Mon, Jun 19, 2017 at 7:51 PM, Jakub Jelinek wrote:
> On Mon, Jun 19, 2017 at 11:45:13AM -0600, Jeff Law wrote:
>> On 06/19/2017 11:29 AM, Jakub Jelinek wrote:
>> >
>> > Also, on i?86 orq $0, (%rsp) or orl $0, (%esp) is used to probe stack,
>> > while it is shorter, is it actually faster or as s
On Tue, Jun 20, 2017 at 12:18 PM, Richard Biener
wrote:
> On Tue, Jun 20, 2017 at 10:03 AM, Uros Bizjak wrote:
>> On Mon, Jun 19, 2017 at 7:51 PM, Jakub Jelinek wrote:
>>> On Mon, Jun 19, 2017 at 11:45:13AM -0600, Jeff Law wrote:
>>>> On 06/19/2017
On Tue, Jun 20, 2017 at 2:13 PM, Florian Weimer wrote:
> On 06/20/2017 01:10 PM, Uros Bizjak wrote:
>
>> 74,99%  a.out  a.out  [.] test_or
>> 12,50%  a.out  a.out  [.] test_movb
>> 12,50%  a.out  a.out  [.] test_movl
>
> Could
On Tue, Jun 20, 2017 at 2:17 PM, Uros Bizjak wrote:
> On Tue, Jun 20, 2017 at 2:13 PM, Florian Weimer wrote:
>> On 06/20/2017 01:10 PM, Uros Bizjak wrote:
>>
>>> 74,99%  a.out  a.out  [.] test_or
>>> 12,50%  a.out  a.out  [.] test_
2017-06-20 Uros Bizjak
* config/abi/post/alpha-linux-gnu/baseline_symbols.txt: Update.
Tested on alphaev68-linux-gnu and committed to mainline SVN.
Uros.
Index: config/abi/post/alpha-linux-gnu/baseline_symbols.txt
2017-06-20 Uros Bizjak
* gcc.target/i386/pr80732.c: Include fma4-check.h.
(main): Renamed to ...
(fma4_test): ... this.
Tested on x86_64-linux-gnu and committed to mainline SVN.
Uros.
Index: gcc.target/i386/pr80732.c
This patch introduces applyRelocationsALPHA to solve:
FAIL: TestCgoConsistentResults
FAIL: TestCgoPkgConfig
FAIL: TestCgoHandlesWlORIGIN
gotools errors.
Bootstrapped and regression tested on alphaev68-linux-gnu.
Uros.
Index: go/debug/elf/file.go
==
Hello!
> glibc marks fegetround as a pure function. On x86, people tend to use
> _MM_GET_ROUNDING_MODE instead, which could benefit from the same. I think it
> is safe, but
> a second opinion would be welcome. I could have handled just this builtin,
> but it seemed better to
> provide def_builti
On Wed, Jun 21, 2017 at 8:27 PM, Jakub Jelinek wrote:
> Hi!
>
> This expander has a gap in between the operands and match_dup indexes,
> which results in genemit generating:
> operand2 = operands[2];
> (void) operand2;
> where operands[2] has not been initialized.
>
> Fixed thusly, bootstr
On Thu, Jun 22, 2017 at 12:39 AM, Ian Lance Taylor wrote:
> On Tue, Jun 20, 2017 at 12:46 PM, Uros Bizjak wrote:
>> This patch introduces applyRelocationsALPHA to solve:
>>
>> FAIL: TestCgoConsistentResults
>> FAIL: TestCgoPkgConfig
>> FAIL: TestCgoHan
On Fri, Jun 23, 2017 at 3:22 PM, Richard Biener wrote:
> On Fri, 23 Jun 2017, Marc Glisse wrote:
>
>> On Fri, 23 Jun 2017, Richard Biener wrote:
>>
>> > The vectorizer is confused about the spurious VDEFs that are caused
>> > by gather vectorization so the following avoids them by making the
>> >
Hello!
libgo is now able to automatically determine PtraceRegs. Attached
patch removes duplicate manual definition from system dependent
source.
Bootstrapped and regression tested on alphaev68-linux-gnu.
Uros.
Index: go/syscall/syscall_linux_alpha.go
=
On Tue, Jun 27, 2017 at 12:02 PM, Jakub Jelinek wrote:
> On Fri, Jun 23, 2017 at 02:54:35PM +0200, Richard Biener wrote:
>> 2017-06-23 Richard Biener
>>
>> PR target/81175
>> * config/i386/i386.c (struct builtin_isa): Add pure_p member.
>> (def_builtin2): Initialize pure_p.
>>
On Wed, Jun 28, 2017 at 9:37 AM, Jakub Jelinek wrote:
> On Tue, Jun 27, 2017 at 10:52:47AM -0700, Andrew Pinski wrote:
>> On Tue, Jun 27, 2017 at 7:56 AM, Richard Biener
>> wrote:
>> > On June 27, 2017 4:52:28 PM GMT+02:00, Tamar Christina
>> > wrote:
>> >>> >> +(for cmp (gt ge lt le)
>> >>> >>
On Wed, Jun 28, 2017 at 12:01 PM, Peryt, Sebastian
wrote:
> Hi,
>
> This patch adds missing intrinsics:
> - _mm256_permutexvar_epi32
> - _mm256_permutex_epi64
> - _mm256_permutexvar_epi64
>
> gcc/
> * config/i386/avx512vlintrin.h (_mm256_permutexvar_epi64,
> _mm256
Hello!
> This patch to the gotools Makefile adds tests to `make check`. We now
> test the runtime package using the newly built go tool, and test that
> cgo works by running the misc/cgo/test and misc/cgo/testcarchive
> tests. Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.
> Committed
On Tue, Jul 4, 2017 at 10:35 AM, Jakub Jelinek wrote:
> Hi!
>
> On Tue, Jun 27, 2017 at 12:27:25PM +0200, Jakub Jelinek wrote:
>> Fixed thusly, ok for trunk? Perhaps we should add another testcase to check
>> similarly gatherpf builtin without the lhs, but we'd need different options.
>
> I'd lik
Hello!
Apparently, Intel changed operand order with the new intrinsic
reference release version. Attached patch updates gcc intrinsic
headers accordingly.
2017-07-04 Uros Bizjak
PR target/81294
* config/i386/adxintrin.h (_subborrow_u32): Swap _X and _Y
arguments in the call to
Hello!
Attached patch tightens peephole2 condition to prevent unwanted
flags_reg clobbering by insn patterns, emitted by ix86_expand_clear.
2017-07-04 Uros Bizjak
PR target/81300
* config/i386/i386.md (setcc + movzbl/and to xor + setcc peepholes):
Require dead FLAGS_REG at the
Hello!
New register allocator alternative decorations allow us to not
penalize alternatives *unless* reload is required. The '$' is
described as:
'$'
This constraint is analogous to '!' but it disparages severely the
alternative only if the operand with the '$' needs a reload.
and fit
> The issues fixed by the previous patch together with this one result
> in the testcase from the PR with -mtune=intel (for some reason with
> generic tuning we decide to perform the 256-bit load as 2 128-bit loads and
> don't merge that into 256-bit comparison operand, shall we change that?)
> to
On Wed, Nov 8, 2017 at 9:02 AM, Koval, Julia wrote:
> Attachment got lost.
>
>> -Original Message-
>> From: Koval, Julia
>> Sent: Wednesday, November 08, 2017 9:01 AM
>> To: 'GCC Patches'
>> Cc: 'Uros Bizjak' ; 'Kirill Yu
> gcc/:
> 2017-11-08 Andi Kleen
>
> * config/i386/i386.opt: Add -mforce-indirect-call.
> * config/i386/predicates.md: Check for flag_force_indirect_call.
> * doc/invoke.texi: Document -mforce-indirect-call
>
> gcc/testsuite/:
> 2017-11-08 Andi Kleen
>
> * gcc.target/i386/force-indirect-call-1
2017-11-10 Uros Bizjak
* gcc.target/i386/force-indirect-call-1.c: Merge scan strings.
* gcc.target/i386/force-indirect-call-2.c: Ditto.
Require fpic effective target.
* gcc.target/i386/force-indirect-call-3.c: Ditto.
Require lp64 effective target.
Tested on x86_64-linux
On Sat, Nov 11, 2017 at 10:10 PM, Koval, Julia wrote:
> Hi Uros,
> I fixed comments.
> Btw, I haven't found skylake-avx512 in driver-i386.c at all. Is it intended
> or should I add it?
It looks like an oversight to me. If there is no "skylake-avx512"
model, then the driver goes through "This is
On Sun, Nov 12, 2017 at 1:04 AM, Koval, Julia wrote:
> Hi, this patch adds new option -march=icelake. Isasets defined in:
> https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf
> I didn't add arch code to driver-i386.c, bec
On Mon, Nov 13, 2017 at 11:29 AM, Koval, Julia wrote:
> Hi, here is followup patch to add skylake-avx512.
> gcc/
> * config/i386/driver-i386.c (host_detect_local_cpu): Detect
> skylake-avx512.
OK.
Thanks,
Uros.
On Mon, Nov 13, 2017 at 6:25 PM, Shalnov, Sergey
wrote:
> Hi,
> Modern architectures provide wider and wider vector registers. This patch
> implements
> common (in i386 arch) option to prefer vector register width for the
> vectorizer.
> Currently, GCC has "-mprefer-avx128" and "-mprefer-avx256
On Mon, Nov 13, 2017 at 9:13 PM, Uros Bizjak wrote:
> On Mon, Nov 13, 2017 at 6:25 PM, Shalnov, Sergey
> wrote:
>> Hi,
>> Modern architectures provide wider and wider vector registers. This patch
>> implements
>> common (in i386 arch) option to prefer
On Tue, Nov 14, 2017 at 12:14 AM, Joseph Myers wrote:
> On Mon, 13 Nov 2017, Uros Bizjak wrote:
>
>> [BTW: -mprefer-avx128 should be marked RejectNegative from the
>> beginning; let's just assume nobody uses it in its (somehow weird)
>> negative "-mno-prefer-av
; Cc: Jakub Jelinek ; gcc-patches@gcc.gnu.org; Uros Bizjak
>> ; Kirill Yukhin ; Lu, Hongjiu
>>
>> Subject: Re: [PATCH][i386] PR82941/PR82942 - Adding vzeroupper generation
>> for SKX
>>
>> On Tue, Nov 14, 2017 at 3:18 AM, Peryt, Sebastian
>> wrote:
>
On Wed, Nov 15, 2017 at 2:37 PM, H.J. Lu wrote:
> -mzeroupper is specified to generate vzeroupper instruction. If it
> isn't used, the default should depend on !TARGET_AVX512ER. Users can
> always use -mzeroupper or -mno-zeroupper to override it.
>
> Sebastian, can you run the full test with it?
On Wed, Nov 15, 2017 at 5:59 PM, H.J. Lu wrote:
> On Wed, Nov 15, 2017 at 8:09 AM, Uros Bizjak wrote:
>> On Wed, Nov 15, 2017 at 2:37 PM, H.J. Lu wrote:
>>> -mzeroupper is specified to generate vzeroupper instruction. If it
>>> isn't used, the default should de
"1:\tnopl 0x01(%%eax,%%eax,1)\n"); /* 5 byte nop. */
Even the above change is not correct, since it will be assembled in a
different way on 32 bit and 64 bit targets (size prefix will be added
on 64 bit targets). Attached patch fixes this issue by emitting a
stream of
On Fri, Nov 17, 2017 at 10:18 AM, Koval, Julia wrote:
> Hi, this patch introduces separate cost model for skylake-avx512. Ok for
> trunk?
>
> gcc/
> * config/i386/i386.c (processor_target_table): Add skylake_cost for
> skylake-avx512.
> * config/i386/x86-tune-costs.h (skyl
Hello!
Attached patch introduces bswaphi2 named insn pattern that results in
movbe instruction.
Without the patch, the following testcase:
movw %si, (%rdi)
and with patched compiler:
movbe %si, (%rdi)
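The testcase itself did not survive in the text above. A function of roughly the following shape would exercise a 16-bit byte-swapping store; this is a guess at the kind of code meant, not the original testcase, and the name store_swapped is hypothetical.

```c
#include <stdint.h>

/* Hypothetical testcase: store a byte-swapped 16-bit value to memory.
   When compiled with movbe enabled, a store like this is a candidate
   for a single movbe instruction.  */
void
store_swapped (uint16_t *p, uint16_t x)
{
  *p = __builtin_bswap16 (x);
}
```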
2017-11-20 Uros Bizjak
* config/i386/i386.md (bswaphi2): New expander.
(*bswaphi2_movbe): New insn pattern.
(bswaphi -> rorhi peephole2): New peephole pattern.
testsuite/ChangeLog:
2017-11-20 Ur
On Mon, Nov 20, 2017 at 4:51 PM, Marek Polacek wrote:
> On Thu, Nov 16, 2017 at 02:20:59PM -0500, Jason Merrill wrote:
>> On Thu, Nov 16, 2017 at 12:41 PM, Marek Polacek wrote:
>> > On Tue, Nov 14, 2017 at 07:34:54AM +0100, Richard Biener wrote:
>> >> On November 14, 2017 6:21:41 AM GMT+01:00, Ja
On Tue, Nov 21, 2017 at 4:50 PM, Shalnov, Sergey
wrote:
> Uros,
> I did new patch with all comments addressed as proposed.
> 1. old option -mprefer-avx128 is Alias(mprefer-vector-width=, 128, none)
> 2. Simplified default initialization (as Bernhard proposed)
> 3. Fixed documentation (proposed by
2017-11-21 Uros Bizjak
* config/i386/i386.md (*bswap2_movbe): Add
integer suffix to movbe mnemonic.
(*bswaphi2_movbe): Ditto.
(bswaphi_lowpart): Merge with *bswaphi_lowpart_1.
testsuite/ChangeLog:
2017-11-21 Uros Bizjak
* gcc.target/i386/movbe-1.c: Update scan string
I have committed the attached patch.
Uros.
On Tue, Nov 21, 2017 at 6:18 PM, Shalnov, Sergey
wrote:
> Uros,
> Yes, please. Thank you for your proposals and comments.
> Please commit as you proposed.
> Sergey
>
> -Original Message-
> From: Uros Bizjak [mailto:ubiz
On Wed, Nov 22, 2017 at 3:58 PM, Shalnov, Sergey
wrote:
> Hi,
> This patch makes the -mprefer-vector-width= option inclusive. This means that
> if we use -mprefer-vector-width=128 it should switch TARGET_PREFER_AVX128=ON
> and TARGET_PREFER_AVX256=ON also.
> It is minor change to generate "xmm" with
These are the same as bswap effective target. Also, all listed targets
are capable of int32plus.
2017-11-22 Uros Bizjak
* lib/target-supports.exp (check_effective_target_bswap16): Remove
(check_effective_target_bswap32): Ditto.
(check_effective_target_bswap64): Ditto.
* gcc.dg
On Sun, Nov 26, 2017 at 10:56 PM, Tsimbalist, Igor V
wrote:
> According to the description of incssp instruction from Intel CET it adjusts
> the shadow stack pointer (ssp) only by value in the range of [0..255]. As the
> adjustment amount could be greater than 255 there should be a loop
> generate
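The chunking described above can be sketched in plain C. This only models the loop structure, assuming a per-step limit of 255 as stated in the quoted text; it does not touch the real shadow stack, and the function name count_incssp_steps is made up for this example.

```c
/* Hypothetical model of the adjustment loop: split SLOTS into chunks of
   at most 255 and return how many increment steps would be needed.  */
static unsigned int
count_incssp_steps (unsigned long slots)
{
  unsigned int steps = 0;

  while (slots > 0)
    {
      unsigned long chunk = slots > 255 ? 255 : slots;
      /* A real implementation would issue the shadow stack increment
         with CHUNK as its operand here.  */
      slots -= chunk;
      steps++;
    }
  return steps;
}
```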
On Wed, Nov 29, 2017 at 9:24 AM, Jakub Jelinek wrote:
> Hi!
>
> Before r218303 we had just (=x,0,rm) alternative for SSE4 (no AVX),
> that change turned it into (=Yr,0,*rm) and (=*x,0,rm) alternatives,
> so that we avoid too many prefixes if possible.
> The latter alternative is fine, we want the
On Wed, Nov 29, 2017 at 10:39 AM, Peryt, Sebastian
wrote:
> Hi,
>
> I'd like to ask for backporting to GCC-7 branch vzeroupper generation patches
> from trunk,
> that are resolving 3 PRs:
> PR target/82941
> PR target/82942
> PR target/82990
>
> Two patches were combined into one and rebased. Boo
On Wed, Nov 29, 2017 at 10:46 AM, Peryt, Sebastian
wrote:
> Hi,
>
> I'd like to ask for backporting to GCC-6 branch vzeroupper generation patches
> from trunk,
> that are resolving 3 PRs:
> PR target/82941
> PR target/82942
> PR target/82990
>
> Two patches were combined into one and rebased. Boo
On Wed, Nov 29, 2017 at 12:05 PM, Shalnov, Sergey
wrote:
> Hi,
> I found wrong MODE_XI used in movdi_internal that cause zmm
> Generation with "-march=skylake-avx512 -mprefer-vector-width=128"
> options set. This patch fixes the mode and register type but keep using
> AVX512 instruction set.
IMO,
On Wed, Nov 29, 2017 at 1:10 PM, Uros Bizjak wrote:
> On Wed, Nov 29, 2017 at 12:05 PM, Shalnov, Sergey
> wrote:
>> Hi,
>> I found wrong MODE_XI used in movdi_internal that cause zmm
>> Generation with "-march=skylake-avx512 -mprefer-vector-width=128"
>>
On Thu, Dec 7, 2017 at 5:48 PM, Jakub Jelinek wrote:
> Hi!
>
> As mentioned in the PR, the code emitted by ix86_expand_rint
> doesn't work with rounding to +/- infinity.
> This patch adjusts it if flag_rounding_math to do something that works
> well even for that case (should be just one insn long
On Mon, Dec 11, 2017 at 9:34 AM, Koval, Julia wrote:
> Hi Uros, Kirill,
> According to isa-extensions doc CLWB appeared first in Skylake-avx512, but it
> isn't in the PTA. This patch fixes it. Ok for trunk?
Please also include ChangeLog entry in your patch submission.
Uros.
On Tue, Dec 12, 2017 at 10:08 AM, Koval, Julia wrote:
> Sorry,
>
> gcc/
> * config/i386/i386.c (PTA_SKYLAKE_AVX512): Add PTA_CLWB.
> (PTA_CANNONLAKE): Remove PTA_CLWB.
Approved and committed to mainline SVN.
Thanks,
Uros.
ALITY_TEST ": %s is %lli\n", name, value);
2017-12-17 Uros Bizjak
* gcc.dg/guality/guality.h (guality_check): Cast %lli arguments
in fprintf statements to long long int.
Tested on x86_64-linux-gnu {,-m32} and committed to mainline SVN.
Uros.
Index: gcc.
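As a rough illustration of the kind of change the ChangeLog entry describes (not the actual guality.h code): an argument consumed by %lli has to have type long long int, so a value of another integer type needs an explicit cast. The helper name format_value is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: VALUE is a long, which is not guaranteed to have
   the same width as long long, so it is cast to match the %lli
   conversion.  Returns the number of characters written.  */
static int
format_value (char *buf, size_t len, const char *name, long value)
{
  return snprintf (buf, len, "%s is %lli\n", name, (long long int) value);
}
```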
On Mon, Dec 18, 2017 at 2:42 PM, Koval, Julia wrote:
> Hi, I tried to replace 2 flags variable with c++ bitset(in patch attached).
> What do you think?
Hm, I'm not a c++ person, but I wonder about overhead and performance
impact of this change. Maybe [] operator could be used instead of a
dynami
On Wed, Dec 20, 2017 at 8:37 PM, Jakub Jelinek wrote:
> Hi!
>
> As you know, we ran out of ix86_isa_flags bitmask bits some time ago.
> The testcase from the PR's #c0 (which I'm not adding into testsuite, because
> it will be useless any time *.opt is modified with any of the
> OPTION_MASK_ISA* bi
Hello!
Attached patch fixes non-BMI2 shift define-and-split instructions that
remove unnecessary masking of count operand by adding a register
constraint that allows only CX hard register.
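For context, the source-level shape these patterns target looks roughly like the sketch below (the function name shl_masked is made up). On x86 the non-BMI2 shift instructions take a variable count in %cl and, for 32-bit operands, use only its low five bits, which is why the explicit mask is redundant there.

```c
#include <stdint.h>

/* Hypothetical example: the "& 31" keeps the shift well-defined in C.
   The hardware shift already ignores the upper bits of the count, so the
   compiler can drop the mask when the count lives in CX.  */
static uint32_t
shl_masked (uint32_t x, unsigned int n)
{
  return x << (n & 31);
}
```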
2017-12-21 Uros Bizjak
PR target/83467
* config/i386/i386.md (*ashl3_mask): Add operand
On Fri, Dec 22, 2017 at 11:35 AM, Tsimbalist, Igor V
wrote:
> This is a follow up patch for pr83488 to fix an error in setting
>
> OPTION_MASK_ISA_AVX512VNNI_SET and OPTION_MASK_ISA_AVX512F_SET bits.
>
> There were both set in ix86_isa_flags2 while being defined in
>
> different ISA sets. Addition
On Thu, Dec 28, 2017 at 10:06 AM, Jakub Jelinek wrote:
> Hi!
>
> I've noticed various formatting issues in the recently added ISA support
> patterns. No functional changes, bootstrapped/regtested on x86_64-linux and
> i686-linux, ok for trunk?
>
> OT, wonder why we have any of the maskz and maskz
On Thu, Dec 28, 2017 at 4:18 PM, Jakub Jelinek wrote:
> Hi!
>
> Binutils had vpopcnt[dq] support since ~ January, but only for the 512-bit
> instructions, only in ~ October further support for the AVX512VPOPCNTDQ |
> AVX512VL instructions has been added. So, if one is using gas in between
> those
n in the generic part of the compiler,
since gcc-5 was able to simplify this combination.
[1] https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01841.html
2018-01-04 Uros Bizjak
PR target/83628
* config/alpha/alpha.md (*sadd): Use ASHIFT
instead of MULT rtx. Update all corresponding split
recognized because the second alternative is
> disabled.
>
> Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
> trunk?
>
> 2018-01-04 Jakub Jelinek
> Uros Bizjak
>
> PR target/83554
> * config/i386/i386.md
runcate to op_mode again, so it all looks like a typo to me.
2018-01-08 Uros Bizjak
PR target/83628
* combine.c (force_int_to_mode) : Use mode instead of
op_mode in the force_to_mode call.
Together with a follow-up target patch, the patch fixes
gcc.target/alpha/pr83628-2.c scan-as
Hello!
Attached patch improves fix for PR83628 by providing ashlsi3 pattern.
This allows combiner to remove subregs of inner DImode ashift.
2018-05-25 Uros Bizjak
PR target/83628
* config/alpha/alpha.md (ashlsi3): New insn pattern.
(*ashlsi_se): Rename from *ashldi_se. Define as
On Fri, May 25, 2018 at 11:09 PM, Jakub Jelinek wrote:
> Hi!
>
> The optab is looking for floatuns2 and
> fixuns_trunc2, but some of the patterns are instead called
> ufloat2 or ufix_trunc2
> and thus are only used from intrinsics.
>
> We can't change all spots, in two spots we have intentionally
On Mon, May 28, 2018 at 11:58 AM, Jakub Jelinek wrote:
> Hi!
>
> AVX512DQ and AVX512DQ/AVX512VL has instructions for vector float <->
> {,unsigned} long long conversions. The following patch adds the missing
> tree codes, optabs and expanders to make this possible.
>
> Bootstrapped/regtested on x
Hello!
Attached patch enables l2 for
TARGET_SSE4_1, and while there, also corrects operand 1 predicate of
rounds{s,d} instruction.
2018-05-29 Uros Bizjak
PR target/85950
* config/i386/i386.md (l2):
Enable for TARGET_SSE4_1 and generate rounds{s,d} and cvtts{s,d}2si{,q
On Wed, May 30, 2018 at 2:44 PM, Peryt, Sebastian
wrote:
> Hi,
>
> I have done some cleanup to remove redundant includes of some of
> the headers in x86intrin.h.
> Removed headers were included in both x86intrin.h and immintrin.h which is
> included into x86intrin.h.
>
> Is it ok for t
No functional changes.
2018-05-31 Uros Bizjak
* config/i386/sse.md (avx_vec_concat):
Substitute concat_tg_mode mode attribute with xtg_mode.
(avx512dq_broadcast_1): Ditto.
(concat_tg_mode): Remove mode attribute.
Bootstrapped and regression tested on x86_64-linux-gnu
Hello!
As reported in the PR, AMDFAM15H model 0x2 should return
AMDFAM15H_BDVER2 subtype.
2018-05-31 Uros Bizjak
PR target/85591
* config/i386/cpuinfo.c (get_amd_cpu): Return
AMDFAM15H_BDVER2 for AMDFAM15H model 0x2.
Bootstrapped and regression tested on x86_64-linux-gnu {,-m32
On Mon, Jun 4, 2018 at 3:08 PM, Jakub Jelinek wrote:
> Hi!
>
> On Wed, May 23, 2018 at 08:45:19AM +0200, Jakub Jelinek wrote:
>> As mentioned in the PR, vptestm* instructions with the same input operand
>> used
>> twice perform the same comparison as vpcmpeq* against zero vector, with the
>> adva
On Mon, Jun 4, 2018 at 1:48 PM, Makhotina, Olga
wrote:
>
> Hi,
>
> This patch implements Tremont -march/-mtune.
>
> 2018-06-04 Olga Makhotina
>
> gcc/
>
> * config.gcc: Support "tremont".
> * config/i386/driver-i386.c (host_detect_local_cpu): Detect "tremont".
> * config
No functional changes.
2018-06-05 Uros Bizjak
* config/i386/i386.md (simple_return_indirect_internal): New expander.
(*simple_return_indirect_internal): Rename from
simple_return_indirect_internal. Use W mode iterator.
(rstorssp): New expander.
(*rstorssp): Rename from
On Tue, Jun 26, 2018 at 12:52 PM, Jakub Jelinek wrote:
> Hi!
>
> These peephole2s assume that when matching insns like:
> [(parallel [(set (reg FLAGS_REG) (match_operand 0))
> (match_operand 4)])
> that operands[4] must be a set of a register with operands[0]
> as SET_SRC, but that