Re: [PATCH/RFC] combine: Tweak the condition of last_set invalidation

2021-01-15 Thread Kewen.Lin via Gcc-patches
Hi Segher, Thanks for the comments! on 2021/1/15 8:22 AM, Segher Boessenkool wrote: > Hi! > > On Wed, Dec 16, 2020 at 04:49:49PM +0800, Kewen.Lin wrote: >> When I was investigating unsigned int vec_init issue on Power, >> I happened to find there seems something we can enhance in how >> combine p

[PATCH] vect: Use factored nloads for load cost modeling [PR82255]

2021-01-15 Thread Kewen.Lin via Gcc-patches
Hi, This patch follows Richard's suggestion in the thread discussion[1], it's to factor out the nloads computation in vectorizable_load for strided access, to ensure we can obtain the consistent information when estimating the costs. btw, the reason why I didn't try to save the information into s

Re: The performance data for two different implementation of new security feature -ftrivial-auto-var-init

2021-01-15 Thread Richard Biener
On Thu, 14 Jan 2021, Qing Zhao wrote: Hi,  More data on code size and compilation time with CPU2017: Compilation time data:   the numbers are the slowdown against the default “no”: benchmarks  A/no D/no                          500.perlbench_r 5.19% 1.95% 502.gcc_r 0.46% -0.23% 505.

Re: [PATCH] c-family, v2: Improve MEM_REF printing for diagnostics [PR98597]

2021-01-15 Thread Jakub Jelinek via Gcc-patches
On Thu, Jan 14, 2021 at 07:26:36PM +0100, Jakub Jelinek via Gcc-patches wrote: > Is this ok for trunk if it passes bootstrap/regtest? So, x86_64-linux bootstrap unfortunately broke due to the -march=i486 changes, but at least i686-linux bootstrap succeeded and shows 2 regressions. One is on g++.d

[PATCH][testsuite] (committed) Fix sed script errors in complex tests

2021-01-15 Thread Tamar Christina via Gcc-patches
Hi All, I ran sed script late over the tests which accidentally introduced a syntax error in the tests. This fixes it. Committed under the obvious rule. Regtested on aarch64-none-linux-gnu and no issues. Ok for master? Thanks, Tamar gcc/testsuite/ChangeLog: * gcc.dg/vect/complex/com

Re: [PATCH 2/3] arm: Auto-vectorization for MVE: vshl

2021-01-15 Thread Christophe Lyon via Gcc-patches
ping^3? On Thu, 7 Jan 2021 at 13:20, Christophe Lyon wrote: > > ping^2? > > On Wed, 30 Dec 2020 at 11:34, Christophe Lyon > wrote: > > > > ping? > > > > On Thu, 17 Dec 2020 at 18:48, Christophe Lyon > > wrote: > > > > > > This patch enables MVE vshlq instructions for auto-vectorization. > > > >

Re: [PATCH 3/3] arm: Auto-vectorization for MVE: vshr

2021-01-15 Thread Christophe Lyon via Gcc-patches
ping^3? On Thu, 7 Jan 2021 at 13:20, Christophe Lyon wrote: > > ping^2? > > On Wed, 30 Dec 2020 at 11:34, Christophe Lyon > wrote: > > > > ping? > > > > On Thu, 17 Dec 2020 at 18:48, Christophe Lyon > > wrote: > > > > > > This patch enables MVE vshr instructions for auto-vectorization. New > >

RE: [PATCH 2/3] arm: Auto-vectorization for MVE: vshl

2021-01-15 Thread Kyrylo Tkachov via Gcc-patches
> -Original Message- > From: Gcc-patches On Behalf Of > Christophe Lyon via Gcc-patches > Sent: 17 December 2020 17:48 > To: gcc-patches@gcc.gnu.org > Subject: [PATCH 2/3] arm: Auto-vectorization for MVE: vshl > > This patch enables MVE vshlq instructions for auto-vectorization. > > T

RE: [PATCH 3/3] arm: Auto-vectorization for MVE: vshr

2021-01-15 Thread Kyrylo Tkachov via Gcc-patches
> -Original Message- > From: Gcc-patches On Behalf Of > Christophe Lyon via Gcc-patches > Sent: 17 December 2020 17:48 > To: gcc-patches@gcc.gnu.org > Subject: [PATCH 3/3] arm: Auto-vectorization for MVE: vshr > > This patch enables MVE vshr instructions for auto-vectorization. New >

Re: Add dg-require-wchars to libstdc++ testsuite

2021-01-15 Thread Jonathan Wakely via Gcc-patches
On Thu, 14 Jan 2021, 22:22 Alexandre Oliva, wrote: > On Jan 14, 2021, Jonathan Wakely wrote: > > > The problem is that uses wchar_t in default > > template arguments: > > > I think we should fix the header, not disable tests that don't use > > that default template argument. The attached patch

[PATCH] tree-optimization/98685 - fix placement of extern converts

2021-01-15 Thread Richard Biener
Avoid advancing to the next stmt when inserting at region boundary and deal with a vector def being not the only child. Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed. 2021-01-15 Richard Biener PR tree-optimization/98685 * tree-vect-slp.c (vect_schedule_slp_node):

Re: [PATCH] c-family, v2: Improve MEM_REF printing for diagnostics [PR98597]

2021-01-15 Thread Richard Biener
On Thu, 14 Jan 2021, Jakub Jelinek wrote: > On Thu, Jan 14, 2021 at 10:49:42AM -0700, Martin Sebor wrote: > > > In the light of Martins patch this is probably reasonable but still > > > the general direction is wrong (which is why I didn't approve Martins > > > original patch). I'm also somewhat

Re: [PATCH 2/3] arm: Auto-vectorization for MVE: vshl

2021-01-15 Thread Christophe Lyon via Gcc-patches
On Fri, 15 Jan 2021 at 10:42, Kyrylo Tkachov wrote: > > > > > -Original Message- > > From: Gcc-patches On Behalf Of > > Christophe Lyon via Gcc-patches > > Sent: 17 December 2020 17:48 > > To: gcc-patches@gcc.gnu.org > > Subject: [PATCH 2/3] arm: Auto-vectorization for MVE: vshl > > > > T

Re: [PATCH] c-family, v2: Improve MEM_REF printing for diagnostics [PR98597]

2021-01-15 Thread Richard Biener
On Fri, 15 Jan 2021, Jakub Jelinek wrote: > On Thu, Jan 14, 2021 at 07:26:36PM +0100, Jakub Jelinek via Gcc-patches wrote: > > Is this ok for trunk if it passes bootstrap/regtest? > > So, x86_64-linux bootstrap unfortunately broke due to the -march=i486 > changes, but at least i686-linux bootstra

Re: [PATCH] arm: Implement vceqq_p64, vceqz_p64 and vceqzq_p64 intrinsics

2021-01-15 Thread Christophe Lyon via Gcc-patches
ping? On Fri, 6 Nov 2020 at 16:22, Christophe Lyon wrote: > > On Thu, 5 Nov 2020 at 12:55, Christophe Lyon > wrote: > > > > On Thu, 5 Nov 2020 at 10:36, Kyrylo Tkachov wrote: > > > > > > H, Christophe, > > > > > > > -Original Message- > > > > From: Gcc-patches On Behalf Of > > > > Chr

RE: [PATCH] arm: Implement vceqq_p64, vceqz_p64 and vceqzq_p64 intrinsics

2021-01-15 Thread Kyrylo Tkachov via Gcc-patches
> -Original Message- > From: Christophe Lyon > Sent: 06 November 2020 15:23 > To: Kyrylo Tkachov > Cc: gcc-patches@gcc.gnu.org > Subject: Re: [PATCH] arm: Implement vceqq_p64, vceqz_p64 and vceqzq_p64 > intrinsics > > On Thu, 5 Nov 2020 at 12:55, Christophe Lyon > wrote: > > > > On Th

[PATCH v5 01/33] Add and restructure function declaration macros

2021-01-15 Thread Daniel Engel
Most of these changes support subsequent patches in this series. Particularly, the FUNC_START macro becomes part of a new macro chain: * FUNC_ENTRY Common global symbol directives * FUNC_START_SECTION FUNC_ENTRY to start a new * FUNC_START FUNC_START_SECTION <

[PATCH v5 00/33] libgcc: Thumb-1 Floating-Point Library for Cortex M0

2021-01-15 Thread Daniel Engel
Changes since v4: * Revised all commit messages per GCC standard form. * Split preamble patch 1 into 4 distinct changes. * Flattened previously-created directory "bits" * Added patch to fix unified syntax compiler warnings. * Moved CFI macro changes to preamble patch 1. * Added interim copyrig

[PATCH v5 02/33] Rename THUMB_FUNC_START to THUMB_FUNC_ENTRY

2021-01-15 Thread Daniel Engel
Since THUMB_FUNC_START does not insert the ".text" directive, it aligns more closely with the new FUNC_ENTRY macro and is renamed accordingly. THUMB_FUNC_START usage has been universally synonymous with the ".force_thumb" directive, so this is now folded into the definition. Usage of ".force_thumb"

[PATCH v5 03/33] Fix syntax warnings on conditional instructions

2021-01-15 Thread Daniel Engel
gcc/libgcc/ChangeLog: 2021-01-14 Daniel Engel * config/arm/lib1funcs.S (RETLDM, ARM_DIV_BODY, ARM_MOD_BODY, _interwork_call_via_lr): Moved condition code after the flags update specifier "s". (ARM_FUNC_START, THUMB_LDIV0): Removed redundant ".syntax". --- libgcc/c

[PATCH v5 04/33] Reorganize LIB1ASMFUNCS object wrapper macros

2021-01-15 Thread Daniel Engel
This will make it easier to isolate changes in subsequent patches. gcc/libgcc/ChangeLog: 2021-01-14 Daniel Engel * config/arm/t-elf (LIB1ASMFUNCS): Split macros into logical groups. --- libgcc/config/arm/t-elf | 66 + 1 file changed, 53 insertions

[PATCH v5 05/33] Add the __HAVE_FEATURE_IT and IT() macros

2021-01-15 Thread Daniel Engel
These macros complement and extend the existing do_it() macro. Together, they streamline the process of optimizing short branchless conditional sequences to support ARM, Thumb-2, and Thumb-1. The inherent architecture limitations of Thumb-1 mean that writing assembly code is somewhat more tedious

[PATCH v5 06/33] Refactor 'clz' functions into a new file

2021-01-15 Thread Daniel Engel
This will make it easier to isolate changes in subsequent patches. gcc/libgcc/ChangeLog: 2021-01-13 Daniel Engel * config/arm/lib1funcs.S (__clzsi2i, __clzdi2): Moved to ... * config/arm/clz2.S: New file. --- libgcc/config/arm/clz2.S | 145 ++

[PATCH v5 07/33] Refactor 'ctz' functions into a new file

2021-01-15 Thread Daniel Engel
This will make it easier to isolate changes in subsequent patches. gcc/libgcc/ChangeLog: 2021-01-13 Daniel Engel * config/arm/lib1funcs.S (__ctzsi2): Moved to ... * config/arm/ctz2.S: New file. --- libgcc/config/arm/ctz2.S | 86 +++ libgcc/co

[PATCH v5 08/33] Refactor 64-bit shift functions into a new file

2021-01-15 Thread Daniel Engel
This will make it easier to isolate changes in subsequent patches. gcc/libgcc/ChangeLog: 2021-01-13 Daniel Engel * config/arm/lib1funcs.S (__ashldi3, __ashrdi3, __lshldi3): Moved to ... * config/arm/eabi/lshift.S: New file. --- libgcc/config/arm/eabi/lshift.S | 123 +

[PATCH v5 09/33] Import 'clz' functions from the CM0 library

2021-01-15 Thread Daniel Engel
On architectures without __ARM_FEATURE_CLZ, this version combines __clzdi2() with __clzsi2() into a single object with an efficient tail call. Also, this version merges the formerly separate Thumb and ARM code implementations into a unified instruction sequence. This change significantly improves
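The idea of sharing one routine between the 32-bit and 64-bit counts can be sketched in C. This is only an illustration under assumptions: the actual patch is hand-written assembly, and the names clz32_sketch and clz64_sketch are hypothetical, not libgcc symbols. The 64-bit version does no bit counting of its own; it picks the relevant word and ends in a call to the 32-bit version, which a compiler may emit as a tail call.

```c
#include <stdint.h>

/* Hypothetical 32-bit count-leading-zeros using a binary search.
   Returns 32 for a zero input (a choice made for this sketch).  */
static int clz32_sketch(uint32_t x)
{
    int n = 0;
    if (x == 0)
        return 32;
    if ((x & 0xFFFF0000u) == 0) { n += 16; x <<= 16; }
    if ((x & 0xFF000000u) == 0) { n += 8;  x <<= 8;  }
    if ((x & 0xF0000000u) == 0) { n += 4;  x <<= 4;  }
    if ((x & 0xC0000000u) == 0) { n += 2;  x <<= 2;  }
    if ((x & 0x80000000u) == 0) { n += 1; }
    return n;
}

/* Hypothetical 64-bit version: select the word that holds the first
   set bit, then finish in the 32-bit routine.  */
static int clz64_sketch(uint64_t x)
{
    uint32_t hi = (uint32_t)(x >> 32);
    if (hi != 0)
        return clz32_sketch(hi);
    return 32 + clz32_sketch((uint32_t)x);
}
```

In this sketch the 64-bit entry point adds only a word selection and an offset, which is the kind of sharing the message describes.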

[PATCH v5 10/33] Import 'ctz' functions from the CM0 library

2021-01-15 Thread Daniel Engel
This version combines __ctzdi2() with __ctzsi2() into a single object with an efficient tail call. The former implementation of __ctzdi2() was in C. On architectures without __ARM_FEATURE_CLZ, this version merges the formerly separate Thumb and ARM code sequences into a unified instruction sequen

[PATCH v5 11/33] Import 64-bit shift functions from the CM0 library

2021-01-15 Thread Daniel Engel
The Thumb versions of these functions are each 1-2 instructions smaller and faster, and branchless when the IT instruction is available. The ARM versions were converted to the "xxl/xxh" big-endian register naming convention, but are otherwise unchanged. gcc/libgcc/ChangeLog: 2021-01-13 Daniel Eng

[PATCH v5 12/33] Import 'clrsb' functions from the CM0 library

2021-01-15 Thread Daniel Engel
This implementation provides an efficient tail call to __clzsi2(), making the functions rather smaller and faster than the C versions. gcc/libgcc/ChangeLog: 2021-01-13 Daniel Engel * config/arm/bits/clz2.S (__clrsbsi2, __clrsbdi2): Added new functions. * config/arm/t-elf
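A redundant-sign-bit count can be expressed through a leading-zero count, which is presumably why a tail call to __clzsi2() is enough. The C sketch below shows that relationship only; it is an assumption about the general technique, not a transcription of the assembly, and clz32_loop_sketch and clrsb32_sketch are hypothetical names.

```c
#include <stdint.h>

/* Hypothetical stand-in for __clzsi2(): leading zero bits, 32 for zero.  */
static int clz32_loop_sketch(uint32_t x)
{
    int n = 0;
    while (n < 32 && (x & 0x80000000u) == 0) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Count the bits below the sign bit that are copies of it.
   Inverting a negative value turns its leading ones into leading
   zeros, so one clz call covers both signs; subtracting one drops
   the sign bit itself from the count.  */
static int clrsb32_sketch(int32_t v)
{
    uint32_t x = (uint32_t)v;
    if (v < 0)
        x = ~x;
    return clz32_loop_sketch(x) - 1;
}
```

For 0 and -1 the sketch returns 31, matching the documented behaviour of GCC's __builtin_clrsb for a 32-bit int.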

[PATCH v5 13/33] Import 'ffs' functions from the CM0 library

2021-01-15 Thread Daniel Engel
This implementation provides an efficient tail call to __clzdi2(), making the functions rather smaller and faster than the C versions. gcc/libgcc/ChangeLog: 2021-01-13 Daniel Engel * config/arm/bits/ctz2.S (__ffssi2, __ffsdi2): New functions. * config/arm/t-elf (LIB1ASMFUNCS): Ad

[PATCH v5 14/33] Import 'parity' functions from the CM0 library

2021-01-15 Thread Daniel Engel
The functional overlap between the single- and double-word functions makes this implementation about half the size of the C functions if both functions are linked in the same application. gcc/libgcc/ChangeLog: 2021-01-13 Daniel Engel * config/arm/parity.S: New file for __
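The overlap mentioned here can be illustrated in C: the double-word parity reduces to the single-word parity after one XOR of the two halves. This is a sketch of the general technique, assuming nothing about the actual assembly; parity32_sketch and parity64_sketch are hypothetical names.

```c
#include <stdint.h>

/* Hypothetical single-word parity: fold the word onto itself until
   bit 0 holds the XOR of all 32 bits.  */
static int parity32_sketch(uint32_t x)
{
    x ^= x >> 16;
    x ^= x >> 8;
    x ^= x >> 4;
    x ^= x >> 2;
    x ^= x >> 1;
    return (int)(x & 1u);
}

/* Hypothetical double-word parity: one extra XOR combines the halves,
   then the single-word routine does the rest.  */
static int parity64_sketch(uint64_t x)
{
    return parity32_sketch((uint32_t)(x >> 32) ^ (uint32_t)x);
}
```

When both functions are linked, the 64-bit entry point costs only the extra fold, so the pair is barely larger than the 32-bit routine alone.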

[PATCH v5 15/33] Import 'popcnt' functions from the CM0 library

2021-01-15 Thread Daniel Engel
The functional overlap between the single- and double-word functions makes this implementation about 30% smaller than the C functions if both functions are linked together in the same application. gcc/libgcc/ChangeLog: 2021-01-13 Daniel Engel * config/arm/popcnt.S (__popcountsi, __popcoun

[PATCH v5 16/33] Refactor Thumb-1 64-bit comparison into a new file

2021-01-15 Thread Daniel Engel
This will make it easier to isolate changes in subsequent patches. gcc/libgcc/ChangeLog: 2021-01-13 Daniel Engel * config/arm/bpabi-v6m.S (__aeabi_lcmp, __aeabi_ulcmp): Moved to ... * config/arm/eabi/lcmp.S: New file. * config/arm/lib1funcs.S: #include eabi/lcmp.S. --- l

[PATCH v5 17/33] Import 64-bit comparison from CM0 library

2021-01-15 Thread Daniel Engel
These are 2-5 instructions smaller and just as fast. Branches are minimized, which will allow easier adaptation to Thumb-2/ARM mode. gcc/libgcc/ChangeLog: 2021-01-13 Daniel Engel * config/arm/eabi/lcmp.S (__aeabi_lcmp, __aeabi_ulcmp): Replaced; add macro configuration to build _

[PATCH v5 18/33] Merge Thumb-2 optimizations for 64-bit comparison

2021-01-15 Thread Daniel Engel
This effectively merges support for all architecture variants into a common function path with appropriate build conditions. ARM performance is 1-2 instructions faster; Thumb-2 is about 50% faster. gcc/libgcc/ChangeLog: 2021-01-13 Daniel Engel * config/arm/bpabi.S (__aeabi_lcmp, __aeabi_

[PATCH v5 19/33] Import 32-bit division from the CM0 library

2021-01-15 Thread Daniel Engel
gcc/libgcc/ChangeLog: 2021-01-07 Daniel Engel * config/arm/eabi/idiv.S: New file for __udivsi3() and __divsi3(). * config/arm/lib1funcs.S: #include eabi/idiv.S (v6m only). --- libgcc/config/arm/eabi/idiv.S | 299 ++ libgcc/config/arm/lib1funcs.S |

[PATCH v5 20/33] Refactor Thumb-1 64-bit division into a new file

2021-01-15 Thread Daniel Engel
gcc/libgcc/ChangeLog: 2021-01-13 Daniel Engel * config/arm/bpabi-v6m.S (__aeabi_ldivmod/ldivmod): Moved to ... * config/arm/eabi/ldiv.S: New file. * config/arm/lib1funcs.S: #include eabi/ldiv.S (v6m only). --- libgcc/config/arm/bpabi-v6m.S | 81 -

[PATCH v5 21/33] Import 64-bit division from the CM0 library

2021-01-15 Thread Daniel Engel
gcc/libgcc/ChangeLog: 2021-01-13 Daniel Engel * config/arm/bpabi.c: Deleted unused file. * config/arm/eabi/ldiv.S (__aeabi_ldivmod, __aeabi_uldivmod): Replaced wrapper functions with a complete implementation. * config/arm/t-bpabi (LIB2ADD_ST): Removed bpabi.c.

[PATCH v5 22/33] Import integer multiplication from the CM0 library

2021-01-15 Thread Daniel Engel
gcc/libgcc/ChangeLog: 2021-01-07 Daniel Engel * config/arm/eabi/lmul.S: New file for __muldi3(), __mulsidi3(), and __umulsidi3(). * config/arm/lib1funcs.S: #include eabi/lmul.S (v6m only). * config/arm/t-elf: Add the new objects to LIB1ASMFUNCS. --- libgcc/config/arm/eab

[PATCH v5 23/33] Refactor Thumb-1 float comparison into a new file

2021-01-15 Thread Daniel Engel
gcc/libgcc/ChangeLog: 2021-01-13 Daniel Engel * config/arm/bpabi-v6m.S (__aeabi_cfcmpeq, __aeabi_cfcmple, __aeabi_cfrcmple, __aeabi_fcmpeq, __aeabi_fcmple, aeabi_fcmple, __aeabi_fcmpgt, aeabi_fcmpge): Moved to ... * config/arm/eabi/fcmp.S: New file. * confi

[PATCH v5 24/33] Import float comparison from the CM0 library

2021-01-15 Thread Daniel Engel
These functions are significantly smaller and faster than the wrapper functions and soft-float implementation they replace. Using the first comparison operator (e.g. '<=') in any program costs about 70 bytes initially, but every additional operator incrementally adds just 4 bytes. NOTE: It seems

[PATCH v5 25/33] Refactor Thumb-1 float subtraction into a new file

2021-01-15 Thread Daniel Engel
This will make it easier to isolate changes in subsequent patches. gcc/libgcc/ChangeLog: 2021-01-13 Daniel Engel * config/arm/bpabi-v6m.S (__aeabi_frsub): Moved to ... * config/arm/eabi/fadd.S: New file. * config/arm/lib1funcs.S: #include eabi/fadd.S (v6m only). --- libg

[PATCH v5 26/33] Import float addition and subtraction from the CM0 library

2021-01-15 Thread Daniel Engel
Since this is the first import of single-precision functions, some common parsing and formatting routines are also included. These common routines will be referenced by other functions in subsequent commits. However, even if the size penalty is accounted entirely to __addsf3(), the total compiled s

[PATCH v5 27/33] Import float multiplication from the CM0 library

2021-01-15 Thread Daniel Engel
gcc/libgcc/ChangeLog: 2021-01-13 Daniel Engel * config/arm/eabi/fmul.S (__mulsf3): New file. * config/arm/lib1funcs.S: #include eabi/fmul.S (v6m only). * config/arm/t-elf (LIB1ASMFUNCS): Moved _mulsf3 to global scope (this object was previously blocked on v6m build

[PATCH v5 28/33] Import float division from the CM0 library

2021-01-15 Thread Daniel Engel
gcc/libgcc/ChangeLog: 2021-01-08 Daniel Engel * config/arm/eabi/fdiv.S (__divsf3, __fp_divloopf): New file. * config/arm/lib1funcs.S: #include eabi/fdiv.S (v6m only). * config/arm/t-elf (LIB1ASMFUNCS): Added _divsf3 and _fp_divloopf. --- libgcc/config/arm/eabi/fdiv.S | 26

[PATCH v5 29/33] Import integer-to-float conversion from the CM0 library

2021-01-15 Thread Daniel Engel
gcc/libgcc/ChangeLog: 2021-01-13 Daniel Engel * config/arm/bpabi-lib.h (__floatdisf, __floatundisf): Remove obsolete RENAME_LIBRARY directives. * config/arm/eabi/ffloat.S (__aeabi_i2f, __aeabi_l2f, __aeabi_ui2f, __aeabi_ul2f): New file. * config/arm/lib1fun

[PATCH v5 31/33] Import float<->double conversion from the CM0 library

2021-01-15 Thread Daniel Engel
gcc/libgcc/ChangeLog: 2021-01-13 Daniel Engel * config/arm/eabi/fcast.S (__aeabi_d2f, __aeabi_f2d): New file. * config/arm/lib1funcs.S: #include eabi/fcast.S (v6m only). * config/arm/t-elf (LIB1ASMFUNCS): Added _arm_d2f and _arm_f2d. --- libgcc/config/arm/eabi/fcast.S | 2

[PATCH v5 30/33] Import float-to-integer conversion from the CM0 library

2021-01-15 Thread Daniel Engel
gcc/libgcc/ChangeLog: 2021-01-13 Daniel Engel * config/arm/bpabi-lib.h (muldi3): Removed duplicate. (fixunssfsi) Removed obsolete RENAME_LIBRARY directive. * config/arm/eabi/ffixed.S (__aeabi_f2iz, __aeabi_f2uiz, __aeabi_f2lz, __aeabi_f2ulz): New file. * co

[PATCH v5 33/33] Drop single-precision Thumb-1 soft-float functions

2021-01-15 Thread Daniel Engel
With the complete CM0 library integrated, regression testing showed new failures with the message "compilation failed to produce executable": gcc.dg/fixed-point/convert-float-1.c gcc.dg/fixed-point/convert-float-3.c gcc.dg/fixed-point/convert-sat.c Investigating, this appears to be ca

[PATCH v5 32/33] Import float<->__fp16 conversion from the CM0 library

2021-01-15 Thread Daniel Engel
gcc/libgcc/ChangeLog: 2021-01-13 Daniel Engel * config/arm/eabi/fcast.S (__aeabi_h2f, __aeabi_f2h): Added functions. * config/arm/fp16 (__gnu_f2h_ieee, __gnu_h2f_ieee, __gnu_f2h_alternative, __gnu_h2f_alternative): Disable build for v6m multilibs. * config/arm/t-b

Re: [PATCH] x86: Error on -fcf-protection with incompatible target

2021-01-15 Thread Matthias Klose
On 1/14/21 4:18 PM, H.J. Lu via Gcc-patches wrote: > On Thu, Jan 14, 2021 at 6:51 AM Uros Bizjak wrote: >> >> On Thu, Jan 14, 2021 at 3:05 PM H.J. Lu wrote: >>> >>> -fcf-protection with CF_BRANCH inserts ENDBR32 at function entries. >>> ENDBR32 is NOP only on 64-bit processors and 32-bit TARGET_C

Re: [PATCH v3] libgcc: Thumb-1 Floating-Point Library for Cortex M0

2021-01-15 Thread Daniel Engel
Hi Christophe, On Mon, Jan 11, 2021, at 8:39 AM, Christophe Lyon wrote: > On Mon, 11 Jan 2021 at 17:18, Daniel Engel wrote: > > > > On Mon, Jan 11, 2021, at 8:07 AM, Christophe Lyon wrote: > > > On Sat, 9 Jan 2021 at 14:09, Christophe Lyon > > > wrote: > > > > > > > > On Sat, 9 Jan 2021 at 13:2

[committed][OG10] Fix offload dwarf info

2021-01-15 Thread Andrew Stubbs
This patch corrects a problem in which GDB ignores the debug info for offload kernel entry functions because they're represented as nested functions inside a function that does not exist on the accelerator device (only on the host). The fix is to add a notional code range to the non-existent p

[committed][OG10] amdgcn: Fix DWARF variables with alloca

2021-01-15 Thread Andrew Stubbs
This patch fixes DWARF frame calculations for functions that use alloca on AMD GCN. Like many other platforms, it achieves this by switching to frame-pointer mode for this function. The frame pointer is necessary for debugability only, so if the user specifies -fomit-frame-pointer then this

[PATCH] libatomic, libgomp, libitm: Fix bootstrap [PR70454]

2021-01-15 Thread Jakub Jelinek via Gcc-patches
On Thu, Jan 14, 2021 at 04:08:20PM -0800, H.J. Lu wrote: > Here is the updated patch. OK for master? Here is my version of the entire patch. Bootstrapped/regtested on x86_64-linux and i686-linux and additionally tested with i686-linux --with-arch=i386 and x86_64-linux --with-arch_32=i386 (non-bo

Re: [EXTERNAL] Re: [PATCH][tree-optimization]Optimize combination of comparisons to dec+compare

2021-01-15 Thread Richard Biener via Gcc-patches
On Thu, Jan 14, 2021 at 10:04 PM Eugene Rozenfeld wrote: > > I got more feedback for the patch from Gabriel Ravier and Jakub Jelinek in > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96674 and re-worked it > accordingly. > > The changes from the previous patch are: > 1. Switched the tests to use

Re: [PATCH] vect: Use factored nloads for load cost modeling [PR82255]

2021-01-15 Thread Richard Biener via Gcc-patches
On Fri, Jan 15, 2021 at 9:11 AM Kewen.Lin wrote: > > Hi, > > This patch follows Richard's suggestion in the thread discussion[1], > it's to factor out the nloads computation in vectorizable_load for > strided access, to ensure we can obtain the consistent information > when estimating the costs. >

Re: [PATCH] libatomic, libgomp, libitm: Fix bootstrap [PR70454]

2021-01-15 Thread Richard Biener
On Fri, 15 Jan 2021, Jakub Jelinek wrote: > On Thu, Jan 14, 2021 at 04:08:20PM -0800, H.J. Lu wrote: > > Here is the updated patch. OK for master? > > Here is my version of the entire patch. > > Bootstrapped/regtested on x86_64-linux and i686-linux and additionally > tested with i686-linux --wi

[committed][OG10] amdgcn: DWARF address spaces

2021-01-15 Thread Andrew Stubbs
This patch implements DWARF address spaces for pointers to LDS, etc., on AMD GCN. The address space mappings are defined by AMD in their DWARF proposals, and the LLVM implementation. ROCGDB does not actually support this feature yet, I don't believe, but will do so soonish. Committed to de

[committed][OG10] DWARF address space for variables

2021-01-15 Thread Andrew Stubbs
This patch adds DWARF support for "local" variables that are actually located in a different address space. This situation occurs for variables shared between all the worker threads of an OpenACC gang. On AMD GCN the variables are allocated to the low-latency LDS memory associated with each ph

Re: [PATCH] Add pytest for a GCOV test-case

2021-01-15 Thread Rainer Orth
Hi Martin, * If we now have an (even optional) dependency on python/pytest, this (with the exact versions and use) needs to be documented in install.texi. >>> >>> Done that. >> +be installed. Some optional tests also require Python3 and pytest >> module. >> It would be bette

Re: [PATCH v3] libgcc: Thumb-1 Floating-Point Library for Cortex M0

2021-01-15 Thread Christophe Lyon via Gcc-patches
On Fri, 15 Jan 2021 at 12:39, Daniel Engel wrote: > > Hi Christophe, > > On Mon, Jan 11, 2021, at 8:39 AM, Christophe Lyon wrote: > > On Mon, 11 Jan 2021 at 17:18, Daniel Engel wrote: > > > > > > On Mon, Jan 11, 2021, at 8:07 AM, Christophe Lyon wrote: > > > > On Sat, 9 Jan 2021 at 14:09, Christo

[PATCH] testsuite/96098 - remove redundant testcase

2021-01-15 Thread Richard Biener
The testcase morphed in a way no longer testing what it was originally supposed to do and slightly altering it shows the original issue isn't fixed (anymore). The limit set as a result of PR91403 (and dups) prevents the issue for larger arrays but the testcase has double a[128][128]; which resu

Re: [PATCH] libatomic, libgomp, libitm: Fix bootstrap [PR70454]

2021-01-15 Thread H.J. Lu via Gcc-patches
On Fri, Jan 15, 2021 at 4:07 AM Richard Biener wrote: > > On Fri, 15 Jan 2021, Jakub Jelinek wrote: > > > On Thu, Jan 14, 2021 at 04:08:20PM -0800, H.J. Lu wrote: > > > Here is the updated patch. OK for master? > > > > Here is my version of the entire patch. > > > > Bootstrapped/regtested on x86_

[PATCH] testsuite/96147 - remove scanning for ! vect_hw_misalign

2021-01-15 Thread Richard Biener
This removes scanning that's too difficult to get correct for all targets, leaving the correctness test for them and keeping the vectorization capability check to vect_hw_misalign targets. Pushed. 2021-01-15 Richard Biener PR testsuite/96147 * gcc.dg/vect/slp-43.c: Remove ! vec

[PATCH] testsuite/96147 - key scanning on vect_hw_misalign

2021-01-15 Thread Richard Biener
gcc.dg/vect/slp-45.c failed to key the vectorization capability scanning on vect_hw_misalign. Since the stores are strided they cannot be (all) analyzed to be aligned. Pushed. 2021-01-15 Richard Biener PR testsuite/96147 * gcc.dg/vect/slp-45.c: Key scanning on vect_hw

[PATCH] testsuite/96147 - align vector access

2021-01-15 Thread Richard Biener
This aligns p so that the testcase is meaningful for targets without a hw misaligned access. Pushed. 2021-01-15 Richard Biener PR testsuite/96147 * gcc.dg/vect/bb-slp-32.c: Align p. --- gcc/testsuite/gcc.dg/vect/bb-slp-32.c | 1 + 1 file changed, 1 insertion(+) diff --git a/

[PATCH] testsuite/96147 - scan for vectorized load

2021-01-15 Thread Richard Biener
This changes gcc.dg/vect/bb-slp-9.c to scan for a vectorized load instead of a vectorized BB which then correctly captures the unaligned load we try to test and not some intermediate built from scalar vector. Pushed. 2021-01-15 Richard Biener PR testsuite/96147 * gcc.dg/vect/b

Re: [PATCH] Add pytest for a GCOV test-case

2021-01-15 Thread Martin Liška
On 1/15/21 1:28 PM, Rainer Orth wrote: Hi Martin, * If we now have an (even optional) dependency on python/pytest, this (with the exact versions and use) needs to be documented in install.texi. Done that. +be installed. Some optional tests also require Python3 and pytest module. It

[PATCH] tree-optimization/96376 - do not check alignment for invariant loads

2021-01-15 Thread Richard Biener
The testcases show that we fail to disregard alignment for invariant loads. The patch handles them like we handle gather and scatter. Bootstrapped on x86_64-unknown-linux-gnu, testing in progress. 2021-01-15 Richard Biener PR tree-optimization/96376 * tree-vect-stmts.c (get_l

[COMMITTED] IBM Z: Fix linking to libatomic in target test cases

2021-01-15 Thread Marius Hillenbrand via Gcc-patches
Regtested on s390x-linux-gnu. Approved offline by Andreas Krebbel. Pushed. >8--->8--->8->8--->8--->8- One of the test cases failed to link because of missing paths to libatomic. Reuse procedures in lib/atomic-dg.exp to gather these paths. gcc/testsuite/ChangeLog:

Re: [PATCH] [WIP] openmp: Add OpenMP 5.0 task detach clause support

2021-01-15 Thread Kwok Cheung Yeung
On 10/12/2020 2:38 pm, Jakub Jelinek wrote: On Wed, Dec 09, 2020 at 05:37:24PM +, Kwok Cheung Yeung wrote: --- a/gcc/c/c-typeck.c +++ b/gcc/c/c-typeck.c @@ -14942,6 +14942,11 @@ c_finish_omp_clauses (tree clauses, enum c_omp_region_type ort) pc = &OMP_CLAUSE_CHAIN (c); c

Re: The performance data for two different implementation of new security feature -ftrivial-auto-var-init

2021-01-15 Thread Qing Zhao via Gcc-patches
> On Jan 15, 2021, at 2:11 AM, Richard Biener wrote: > > > > On Thu, 14 Jan 2021, Qing Zhao wrote: > >> Hi, >> More data on code size and compilation time with CPU2017: >> Compilation time data: the numbers are the slowdown against the >> default “no”: >> benchmarks A/no D/no >>

Re: BoF DWARF5 patches (25% .debug section size reduction)

2021-01-15 Thread Jakub Jelinek via Gcc-patches
On Sun, Nov 15, 2020 at 11:41:24PM +0100, Mark Wielaard wrote: > On Tue, 2020-09-29 at 15:56 +0200, Mark Wielaard wrote: > > On Thu, 2020-09-10 at 13:16 +0200, Jakub Jelinek wrote: > > > On Wed, Sep 09, 2020 at 09:57:54PM +0200, Mark Wielaard wrote: > > > > --- a/gcc/doc/invoke.texi > > > > +++ b/g

Re: Add dg-require-wchars to libstdc++ testsuite

2021-01-15 Thread Alexandre Oliva
On Jan 15, 2021, Jonathan Wakely wrote: > On Thu, 14 Jan 2021, 22:22 Alexandre Oliva, wrote: >> ... it is definitely the case that the target currently defines wchar_t, >> and it even offers wchar.h and a lot of (maybe all?) wcs* functions. >> This was likely not the case when the patch was firs

[PATCH] c++: Fix up potential_constant_expression_1 FOR/WHILE_STMT handling [PR98672]

2021-01-15 Thread Jakub Jelinek via Gcc-patches
Hi! The following testcase is rejected even when it is valid. The problem is that potential_constant_expression_1 doesn't have the accurate *jump_target tracking cxx_eval_* has, and when the loop has a condition that isn't guaranteed to be always true, the body isn't walked at all. That is mostly

Re: [PATCH] c++: ICE with constrained placeholder return type [PR98346]

2021-01-15 Thread Patrick Palka via Gcc-patches
On Mon, 11 Jan 2021, Jason Merrill wrote: > On 1/7/21 4:06 PM, Patrick Palka wrote: > > This is essentially a followup to r11-3714 -- we ICEing from another > > "unguarded" call to build_concept_check, this time in do_auto_deduction, > > due to the presence of templated trees when !processing_temp

[pushed] rtl-ssa: Fix a silly typo

2021-01-15 Thread Richard Sandiford via Gcc-patches
s/ref/reg/ on a previously unused function name. Sorry for the blunder. Tested on aarch64-linux-gnu, aarch64_be-elf and x86_64-linux-gnu, pushed as obvious. Richard gcc/ * rtl-ssa/functions.h (function_info::ref_defs): Rename to... (function_info::reg_defs): ...this. *

[pushed] recog: Fix insn_change_watermark destructor

2021-01-15 Thread Richard Sandiford via Gcc-patches
Noticed while working on something else that the insn_change_watermark destructor could call cancel_changes for changes that no longer exist. The loop in cancel_changes is a nop in that case, but: num_changes = num; can mess things up. I think this would only affect nested uses of insn_change_

[pushed] aarch64: Add a minipass for fusing CC insns [PR88836]

2021-01-15 Thread Richard Sandiford via Gcc-patches
This patch adds a small target-specific pass to remove redundant SVE PTEST instructions. There are two important uses of this: - Removing PTESTs after WHILELOs (PR88836). The original testcase no longer exhibits the problem due to more recent optimisations, but it can still be seen in simple

c++: Fix langspecs with -fsyntax-only [PR98591]

2021-01-15 Thread Nathan Sidwell
-fsyntax-only is handled specially in the driver and causes it to add '-o /dev/null' (or a suitable OS-specific variant thereof). PCH is handled in the language driver. I'd not sufficiently protected the -fmodule-only action of adding a dummy assembler from the actions of -fsyntax-only, so

preprocessor: Make quoting : [PR 95253]

2021-01-15 Thread Nathan Sidwell
I changed the quoting of ':', this restores it. Make doesn't need ':' quoting (in a filename). PR preprocessor/95253 libcpp/ * mkdeps.c (munge): Do not escape ':'. -- Nathan Sidwell diff --git i/libcpp/mkdeps.c w/libcpp/mkdeps.c index 471e449a19d..1867e0089d
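As a rough illustration of what quoting a filename for a Make dependency line involves, here is a small C sketch. It is not the libcpp munge() function: make_quote_sketch is a hypothetical helper, and the set of escaped characters (space, tab, '#', '$') is an assumption chosen for the example. The one point taken from the message is that ':' is copied through unescaped.

```c
#include <stddef.h>

/* Hypothetical helper, not the libcpp implementation.  Writes a
   Make-quoted copy of NAME into OUT (capacity CAP bytes) and returns
   its length, or (size_t)-1 if the result does not fit.  */
static size_t make_quote_sketch(const char *name, char *out, size_t cap)
{
    size_t n = 0;

    if (cap == 0)
        return (size_t)-1;

    for (const char *p = name; *p != '\0'; p++) {
        char c = *p;
        int needs_escape = (c == ' ' || c == '\t' || c == '#' || c == '$');

        /* Room is needed for the escape, the character, and the NUL.  */
        if (n + (size_t)needs_escape + 2 > cap)
            return (size_t)-1;

        if (c == '$')
            out[n++] = '$';        /* '$' is doubled */
        else if (needs_escape)
            out[n++] = '\\';       /* others get a backslash */

        out[n++] = c;              /* ':' falls through unchanged */
    }
    out[n] = '\0';
    return n;
}
```

With this sketch, the name `a:b c` becomes `a:b\ c`: the space is escaped and the colon is left alone.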

Re: [PATCH] [WIP] openmp: Add OpenMP 5.0 task detach clause support

2021-01-15 Thread Kwok Cheung Yeung
On 15/01/2021 3:07 pm, Kwok Cheung Yeung wrote: I have tested bootstrapping on x86_64 (no offloading) with no issues, and running the libgomp testsuite with Nvidia offloading shows no regressions. I have also tested all the gomp.exp tests in the main gcc testsuite, also with no issues. I am cur

[PATCH] i386: Use cpp_define_formatted for __SIZEOF_FLOAT80__ definition

2021-01-15 Thread Uros Bizjak via Gcc-patches
2021-01-15 Uroš Bizjak gcc/ * config/i386/i386-c.c (ix86_target_macros): Use cpp_define_formatted for __SIZEOF_FLOAT80__ definition. Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}. Pushed to mainline. Uros. diff --git a/gcc/config/i386/i386-c.c b/gcc/config/i386/i386-

Re: The performance data for two different implementation of new security feature -ftrivial-auto-var-init

2021-01-15 Thread Richard Biener
On January 15, 2021 5:16:40 PM GMT+01:00, Qing Zhao wrote: > > >> On Jan 15, 2021, at 2:11 AM, Richard Biener >wrote: >> >> >> >> On Thu, 14 Jan 2021, Qing Zhao wrote: >> >>> Hi, >>> More data on code size and compilation time with CPU2017: >>> Compilation time data: the numbers a

Re: [PATCH]AArch64: Add NEON, SVE and SVE2 RTL patterns for Multiply, FMS and FMA.

2021-01-15 Thread Richard Sandiford via Gcc-patches
Tamar Christina writes: > Hi All, > > This adds implementation for the optabs for complex operations. With this the > following C code: > > void g (float complex a[restrict N], float complex b[restrict N], > float complex c[restrict N]) > { > for (int i=0; i < N; i++) > c[i]

Re: [patch] gcc.dg/analyzer tests: relax dependency on alloca.h

2021-01-15 Thread Alexandre Oliva
On Jan 15, 2021, Olivier Hainque wrote: > On 14 Jan 2021, at 22:13, Alexandre Oliva wrote: >> Would you mind if I submitted an alternate patch to do so? > Not at all, thanks for your feedback and for proposing > an alternative! Here's the modified patch. Regstrapped on x86_64-linux-gnu, also

Re: The performance data for two different implementation of new security feature -ftrivial-auto-var-init

2021-01-15 Thread Qing Zhao via Gcc-patches
> On Jan 15, 2021, at 11:22 AM, Richard Biener wrote: > > On January 15, 2021 5:16:40 PM GMT+01:00, Qing Zhao > wrote: >> >> >>> On Jan 15, 2021, at 2:11 AM, Richard Biener >> wrote: >>> >>> >>> >>> On Thu, 14 Jan 2021, Qing Zhao wrote: >>> Hi, M

Re: [PATCH] [WIP] openmp: Add OpenMP 5.0 task detach clause support

2021-01-15 Thread Jakub Jelinek via Gcc-patches
On Fri, Jan 15, 2021 at 04:58:25PM +0000, Kwok Cheung Yeung wrote: > On 15/01/2021 3:07 pm, Kwok Cheung Yeung wrote: > > I have tested bootstrapping on x86_64 (no offloading) with no issues, > > and running the libgomp testsuite with Nvidia offloading shows no > > regressions. I have also tested al

Re: [patch] gcc.dg/analyzer tests: relax dependency on alloca.h

2021-01-15 Thread David Malcolm via Gcc-patches
On Fri, 2021-01-15 at 14:45 -0300, Alexandre Oliva wrote: > On Jan 15, 2021, Olivier Hainque wrote: > > > On 14 Jan 2021, at 22:13, Alexandre Oliva > > wrote: > > > Would you mind if I submitted an alternate patch to do so? > > Not at all, thanks for your feedback and for proposing > > an altern

Re: [PATCH 1/3] PowerPC: Add long double target-supports.

2021-01-15 Thread Joseph Myers
On Thu, 14 Jan 2021, Michael Meissner via Gcc-patches wrote: > +return [check_runtime_nocache ppc_long_double_ovveride_ibm128 { > +return [check_runtime_nocache ppc_long_double_ovveride_ieee128 { > +return [check_runtime_nocache ppc_long_double_ovveride_64bit { All these places have

[PATCH] aarch64: Implement vmlsl[_high]* intrinsics using builtins

2021-01-15 Thread Kyrylo Tkachov via Gcc-patches
Hi all, This patch reimplements some more intrinsics using RTL builtins in the straightforward way. Thankfully most of the RTL infrastructure is already in place for it. Bootstrapped and tested on aarch64-none-linux-gnu. Pushing to trunk. Thanks, Kyrill gcc/ * config/aarch64/aarch64-si

[PATCH] match.pd: Optimize (x < 0) ^ (y < 0) to (x ^ y) < 0 etc. [PR96681]

2021-01-15 Thread Jakub Jelinek via Gcc-patches
Hi! This patch simplifies comparisons that test the sign bit xored together. If the comparisons are both < 0 or both >= 0, then we should xor the operands together and compare the result to < 0, if the comparisons are different, we should compare to >= 0. Bootstrapped/regtested on x86_64-linux an
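For readers skimming the index, the identity behind this simplification can be sketched in plain C. This is only an illustration of the source-level equivalence described in the snippet above, not the match.pd pattern itself; the function names are made up for the example, and it assumes the usual two's-complement representation of int.

```c
#include <stdbool.h>

/* Same-kind comparisons, original form: two sign tests, then XOR. */
bool signs_differ_naive(int x, int y)
{
    return (x < 0) ^ (y < 0);
}

/* Same-kind comparisons, folded form: XOR the operands, then one sign
   test.  The sign bit of x ^ y is set exactly when the sign bits of
   x and y differ. */
bool signs_differ_folded(int x, int y)
{
    return (x ^ y) < 0;
}

/* Different comparisons (one < 0, one >= 0), original form. */
bool mixed_naive(int x, int y)
{
    return (x < 0) ^ (y >= 0);
}

/* Different comparisons, folded form: the result is the negation of
   the same-kind case, so the single test becomes >= 0. */
bool mixed_folded(int x, int y)
{
    return (x ^ y) >= 0;
}
```

For example, signs_differ_folded(-1, 2) is true while signs_differ_folded(-3, -4) is false, matching the naive versions in both cases.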

[committed] testsuite: Add testcase coverage for already fixed [PR96671]

2021-01-15 Thread Jakub Jelinek via Gcc-patches
Hi! The fix for this PR didn't come with any test coverage, I've added tests that make sure we optimize it no matter what order of the x ^ y ^ z operands is used. Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk. 2021-01-15 Jakub Jelinek PR tree-optimization/

[committed] bootstrap: fix failing diagnostic selftest on Windows [PR98696]

2021-01-15 Thread David Malcolm via Gcc-patches
In one of the selftests in g:f10960558540636800cf5d3d6355969621fbc17e I didn't consider that paths can contain backslashes, which happens for the tempfiles on Windows hosts. Successfully bootstrapped & regtested on x86_64-pc-linux-gnu. Confirmed by the reporter as fixing the issue on Windows. Pu

[PATCH] strlen: Return TODO_update_address_taken when memcmp has been optimized [PR96271]

2021-01-15 Thread Jakub Jelinek via Gcc-patches
Hi! On the following testcase, handle_builtin_memcmp in the strlen pass folds the memcmp into comparison of two MEM_REFs. But nothing triggers updating of addressable vars afterwards, so even when the parameters are no longer address taken, we force the parameters to stack and back anyway. The f

[PATCH] match.pd: Generalize the PR64309 simplifications [PR96669]

2021-01-15 Thread Jakub Jelinek via Gcc-patches
Hi! The following patch generalizes the PR64309 simplifications, so that instead of working only with constants 1 and 1 it works with any two power of two constants, and works also for right shift (in that case it rules out the first one being negative, as it is arithmetic shift then). Bootstrapp

[pushed] c++: Fix list-init of array of no-copy type [PR63707]

2021-01-15 Thread Jason Merrill via Gcc-patches
build_vec_init_elt models initialization from some arbitrary object of the type, i.e. copy, but in the case of list-initialization we don't do a copy from the elements, we initialize them directly. Tested x86_64-pc-linux-gnu, applying to trunk. And 9/10, soon. gcc/cp/ChangeLog: PR c++/6

Re: [PATCH v5] rs6000, vector integer multiply/divide/modulo instructions

2021-01-15 Thread Segher Boessenkool
Hi! On Wed, Jan 13, 2021 at 02:15:04PM -0800, Carl Love wrote: > The patch was compiled and tested on: > >powerpc64le-unknown-linux-gnu (Power 8 BE) (I assume you mean powerpc64-linux instead?) > > Presumably it is safe (no side effects) when adding V4SI and V2DI here, > > with respect to o

[pushed] c++: Improve copy elision for base subobjects [PR98642]

2021-01-15 Thread Jason Merrill via Gcc-patches
Three patches: 1) Rewrite a complete constructor call to call a base constructor if we're eliding a copy into a base subobject. 2) Elide the copy from a prvalue built for list-initialization into a base subobject. 3) Elide other copies from prvalues representing a constructor call into base su
