Ping: [PATCH 0/2] asm() operand 'c' modifier handling

2025-05-02 Thread Jan Beulich
On 21.11.2024 13:03, Jan Beulich wrote: > Documentation is pretty clear here: "Require a constant operand and > print the constant expression with no punctuation." See the patches for > further details. > > 1: fix asm() operand 'c' modifier handling >

[PATCH 2/2] x86: fix asm() operand 'c' modifier handling

2024-11-21 Thread Jan Beulich
Documentation is pretty clear here: "Require a constant operand and print the constant expression with no punctuation"; the internal use for condition codes is entirely undocumented. IOW any constant value of whatever kind, magnitude, or sign ought to be acceptable as long as it's expressable. Wire

[PATCH 1/2] fix asm() operand 'c' modifier handling

2024-11-21 Thread Jan Beulich
Documentation is pretty clear here: "Require a constant operand and print the constant expression with no punctuation." IOW any integer value of whatever magnitude or sign ought to be acceptable. --- RFC: If this (doing the change in generic code) is the way to go, in a few cases arch-specific

[PATCH 0/2] asm() operand 'c' modifier handling

2024-11-21 Thread Jan Beulich
Documentation is pretty clear here: "Require a constant operand and print the constant expression with no punctuation." See the patches for further details. 1: fix asm() operand 'c' modifier handling 2: x86: fix asm() operand 'c' modifier handling Technically for x86 the 2nd patch alone ought to

Re: Ping: [PATCH] gcc/doc: adjust __builtin_choose_expr() description

2024-10-09 Thread Jan Beulich
On 08.10.2024 17:38, Sandra Loosemore wrote: > On 10/8/24 09:35, Jan Beulich wrote: >> On 08.10.2024 17:30, Sandra Loosemore wrote: >>> [snip] >>> >>> Hmmm, looking at the complete documentation for this built-in, and the >>> code, I think I&

Re: Ping: [PATCH] gcc/doc: adjust __builtin_choose_expr() description

2024-10-08 Thread Jan Beulich
On 08.10.2024 17:30, Sandra Loosemore wrote: > On 10/8/24 08:12, Jan Beulich wrote: >> On 19.06.2024 16:01, Jan Beulich wrote: >>> Present wording has misled people to believe the ?: operator would be >>> evaluating all three of the involved expressions. >>> &g

Ping: [PATCH] gcc/doc: adjust __builtin_choose_expr() description

2024-10-08 Thread Jan Beulich
On 19.06.2024 16:01, Jan Beulich wrote: > Present wording has misled people to believe the ?: operator would be > evaluating all three of the involved expressions. > > gcc/ > > * doc/extend.texi: Clarify __builtin_choose_expr() similarity to > the ?: operator.

Re: [PATCH v2] x86/{,V}AES: adjust when to force EVEX encoding

2024-10-08 Thread Jan Beulich
On 08.10.2024 08:54, Hongtao Liu wrote: > On Mon, Sep 30, 2024 at 3:33 PM Jan Beulich wrote: >> >> Commit a79d13a01f8c ("i386: Fix aes/vaes patterns [PR114576]") correctly >> said "..., but we need to emit {evex} prefix in the assembly if AES ISA >> i

[PATCH v2] x86/{,V}AES: adjust when to force EVEX encoding

2024-09-30 Thread Jan Beulich
Commit a79d13a01f8c ("i386: Fix aes/vaes patterns [PR114576]") correctly said "..., but we need to emit {evex} prefix in the assembly if AES ISA is not enabled". Yet it did so only for the TARGET_AES insns. Going from the alternative chosen in the TARGET_VAES insns isn't quite right: If AES is (als

Re: [PATCH] x86/{,V}AES: adjust when to force EVEX encoding

2024-09-25 Thread Jan Beulich
On 25.09.2024 10:52, Hongtao Liu wrote: > On Wed, Sep 25, 2024 at 3:55 PM Jan Beulich wrote: >> >> On 25.09.2024 09:38, Hongtao Liu wrote: >>> On Wed, Sep 25, 2024 at 2:56 PM Jan Beulich wrote: >>>> >>>> Commit a79d13a01f8c ("i386: Fix aes/vae

Re: [PATCH] x86/{,V}AES: adjust when to force EVEX encoding

2024-09-25 Thread Jan Beulich
On 25.09.2024 09:38, Hongtao Liu wrote: > On Wed, Sep 25, 2024 at 2:56 PM Jan Beulich wrote: >> >> Commit a79d13a01f8c ("i386: Fix aes/vaes patterns [PR114576]") correctly >> said "..., but we need to emit {evex} prefix in the assembly if AES ISA >> i

[PATCH] x86/{,V}AES: adjust when to force EVEX encoding

2024-09-24 Thread Jan Beulich
Commit a79d13a01f8c ("i386: Fix aes/vaes patterns [PR114576]") correctly said "..., but we need to emit {evex} prefix in the assembly if AES ISA is not enabled". Yet it did so only for the TARGET_AES insns. Going from the alternative chosen in the TARGET_VAES insns is wrong for two reasons: - if, w

[PATCH] gcc/doc: adjust __builtin_choose_expr() description

2024-06-19 Thread Jan Beulich
Present wording has misled people to believe the ?: operator would be evaluating all three of the involved expressions. gcc/ * doc/extend.texi: Clarify __builtin_choose_expr() similarity to the ?: operator. --- a/gcc/doc/extend.texi +++ b/gcc/doc/extend.texi @@ -14962,9 +14962,9

[PATCH] configure: adjustments for building with in-tree binutils

2024-06-12 Thread Jan Beulich
For one setting ld_ver in a conditional (no in-tree ld) when it's used, for x86 at least, in unconditional ways can't be quite right. And then prefixing relative paths to binaries with ${objdir}/, when ${objdir} nowadays resolves to just .libs, can at best be a leftover that wasn't properly cleaned

[PATCH] libgcc/aarch64: also provide AT_HWCAP2 fallback

2024-05-29 Thread Jan Beulich
Much like AT_HWCAP is already provided in case the platform headers don't have the value (yet). libgcc/ * config/aarch64/cpuinfo.c: Provide AT_HWCAP2. --- Observed as build failure with 14.1.0, so may want backporting there. --- a/libgcc/config/aarch64/cpuinfo.c +++ b/libgcc/config/aarch

Re: [PATCH] binutils: v2: experimental use of libdiagnostics in gas

2023-11-21 Thread Jan Beulich
On 21.11.2023 23:20, David Malcolm wrote: > @@ -101,6 +109,29 @@ had_warnings (void) >return warning_count; > } > > +#if USE_LIBDIAGNOSTICS > +static diagnostic_manager *diag_mgr; > +#endif > + > +void messages_init (void) > +{ > +#if USE_LIBDIAGNOSTICS > + diag_mgr = diagnostic_manager_new

Re: [PATCH] binutils: experimental use of libdiagnostics in gas

2023-11-07 Thread Jan Beulich
On 07.11.2023 15:32, David Malcolm wrote: > On Tue, 2023-11-07 at 11:03 +0100, Jan Beulich wrote: >> On 06.11.2023 23:29, David Malcolm wrote: >>> All of the locations are just lines; does gas do column numbers at >>> all? >>> (or ranges?) >> >> It

Re: [PATCH] binutils: experimental use of libdiagnostics in gas

2023-11-07 Thread Jan Beulich
On 06.11.2023 23:29, David Malcolm wrote: > Here's a patch for gas in binutils that makes it use libdiagnostics > (with some nasty hardcoded paths to specific places on my hard drive > to make it easier to develop the API). > > For now this hardcodes adding two sinks: a text sink on stderr, and >

Re: [PATCH 5/5] x86: yet more PR target/100711-like splitting

2023-11-06 Thread Jan Beulich
On 25.06.2023 08:41, Hongtao Liu wrote: > On Sun, Jun 25, 2023 at 2:35 PM Hongtao Liu wrote: >> >> On Sun, Jun 25, 2023 at 2:25 PM Jan Beulich wrote: >>> >>> On 25.06.2023 07:12, Hongtao Liu wrote: >>>> On Wed, Jun 21, 2023 at 2:

Re: Intel AVX10.1 Compiler Design and Support

2023-08-10 Thread Jan Beulich via Gcc-patches
On 10.08.2023 15:12, Phoebe Wang wrote: >> The psABI should have some simple rule covering all of the above I think. > > psABI has a rule for the case doesn't mean the rule is a well defined ABI > in practice. A well defined ABI should guarantee 1) interlinkable across > different compile options

Re: Intel AVX10.1 Compiler Design and Support

2023-08-09 Thread Jan Beulich via Gcc-patches
On 09.08.2023 09:38, Hongtao Liu wrote: > On Wed, Aug 9, 2023 at 3:17 PM Jan Beulich wrote: >> >> On 09.08.2023 04:14, Hongtao Liu wrote: >>> On Wed, Aug 9, 2023 at 9:21 AM Hongtao Liu wrote: >>>> >>>> On Wed, Aug 9, 2023 at 3:55 AM Joseph Myers

Re: Intel AVX10.1 Compiler Design and Support

2023-08-09 Thread Jan Beulich via Gcc-patches
On 09.08.2023 04:14, Hongtao Liu wrote: > On Wed, Aug 9, 2023 at 9:21 AM Hongtao Liu wrote: >> >> On Wed, Aug 9, 2023 at 3:55 AM Joseph Myers wrote: >>> >>> Do you have any comments on the interaction of AVX10 with the >>> micro-architecture levels defined in the ABI (and supported with >>> glibc

[PATCH 10/10] x86: drop redundant "prefix_data16" attributes

2023-08-03 Thread Jan Beulich via Gcc-patches
The attribute defaults to 1 for TI-mode insns of type sselog, sselog1, sseiadd, sseimul, and sseishft. In *v8hi3 [smaxmin] and *v16qi3 [umaxmin] also drop the similarly stray "prefix_extra" at this occasion. These two max/min flavors are encoded in 0f space. gcc/ * config/i386/mmx.md (*m

[PATCH 08/10] x86: add missing "prefix" attribute to VF{,C}MULC

2023-08-03 Thread Jan Beulich via Gcc-patches
gcc/ * config/i386/sse.md (__): Add "prefix" attribute. (avx512fp16_sh_v8hf): Likewise. --- Talking of "prefix": Shouldn't at least V32HF and V32BF have it also default to "evex"? (It won't matter right here, but it may matter elsewhere.) --- a/gcc/config/

[PATCH 06/10] x86: drop stray "prefix_extra"

2023-08-03 Thread Jan Beulich via Gcc-patches
While the attribute is relevant for legacy- and VEX-encoded insns, it is of no relevance for EVEX-encoded ones. While there in avx512dq_broadcast_1 add the missing "length_immediate". gcc/ * config/i386/sse.md (*_eq3_1): Drop "prefix_extra". (avx512dq_vextract64x2

[PATCH 05/10] x86: replace/correct bogus "prefix_extra"

2023-08-03 Thread Jan Beulich via Gcc-patches
In the rdrand and rdseed cases "prefix_0f" is meant instead. For mmx_floatv2siv2sf2 1 is correct only for the first alternative. For the integer min/max cases 1 uniformly applies to legacy and VEX encodings (the UB and SW variants are dealt with separately anyway). Same for {,V}MOVNTDQA. Unlike {,

[PATCH 09/10] x86: correct "length_immediate" in a few cases

2023-08-03 Thread Jan Beulich via Gcc-patches
When first added explicitly in 3ddffba914b2 ("i386.md (sse4_1_round2): Add avx512f alternative"), "*" should not have been used for the pre-existing alternative. The attribute was plain missing. Subsequent changes adding more alternatives then generously extended the bogus pattern. Apparently some

[PATCH 07/10] x86: add (adjust) XOP insn attributes

2023-08-03 Thread Jan Beulich via Gcc-patches
Many were lacking "prefix" and "prefix_extra", some had a bogus value of 2 for "prefix_extra" (presumably inherited from their SSE5 counterparts, which are long gone) and a meaningless "prefix_data16" one. Where missing, "mode" attributes are also added. (Note that "sse4arg" and "ssemuladd" ones do

[PATCH 03/10] x86: "ssemuladd" adjustments

2023-08-03 Thread Jan Beulich via Gcc-patches
They're all VEX3- (also covering XOP) or EVEX-encoded. Express that in the default calculation of "prefix". FMA4 insns also all have a 1-byte immediate operand. Where the default calculation is not sufficient / applicable, add explicit "prefix" attributes. While there also add a "mode" attribute t

[PATCH 04/10] x86: "prefix_extra" can't really be "2"

2023-08-03 Thread Jan Beulich via Gcc-patches
In the three remaining instances separate "prefix_0f" and "prefix_rep" are what is wanted instead. gcc/ * config/i386/i386.md (rdbase): Add "prefix_0f" and "prefix_rep". Drop "prefix_extra". (wrbase): Likewise. (ptwrite): Likewise. --- a/gcc/config/i386/i386.md ++

[PATCH 02/10] x86: "sse4arg" adjustments

2023-08-03 Thread Jan Beulich via Gcc-patches
Record common properties in other attributes' default calculations: There's always a 1-byte immediate, and they're always encoded in a VEX3- like manner (note that "prefix_extra" already evaluates to 1 in this case). The drop now (or already previously) redundant explicit attributes, adding "mode"

[PATCH 01/10] x86: "prefix_extra" tidying

2023-08-03 Thread Jan Beulich via Gcc-patches
Drop SSE5 leftovers from both its comment and its default calculation. A value of 2 simply cannot occur anymore. Instead extend the comment to mention the use of the attribute in "length_vex", clarifying why "prefix_extra" can actually be meaningful on VEX-encoded insns despite those not having any

[PATCH 00/10] x86: (mainly) "prefix_extra" adjustments

2023-08-03 Thread Jan Beulich via Gcc-patches
Having noticed various bogus uses, I thought I'd go through and audit them all. This is the result, with some other attributes also adjusted as noticed in the process. (I think this tidying also is a good thing to have ahead of APX further complicating insn length calculations.) 01: "prefix_extra"

[PATCH] MAINTAINERS: correct my email address

2023-08-01 Thread Jan Beulich via Gcc-patches
-Jan Beulich +Jan Beulich David Billinghurst Tomas Bily Laurynas Biveinis

[PATCH RESEND] libatomic: drop redundant all-multi command

2023-07-31 Thread Jan Beulich via Gcc-patches
./multilib.am already specifies this same command, and make warns about the earlier one being ignored when seeing the later one. All that needs retaining to still satisfy the preceding comment is the extra dependency. libatomic/ * Makefile.am (all-multi): Drop commands. * Makefile

[PATCH] x86: fold two of vec_dupv2df's alternatives

2023-07-31 Thread Jan Beulich via Gcc-patches
By using Yvm in the source, both can be expressed in one. gcc/ * sse.md (vec_dupv2df): Fold the middle two of the alternatives. --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -13784,21 +13784,20 @@ (set_attr "mode" "DF,DF,V1DF,V1DF,V1DF,V2DF,V1DF,V1DF,V1DF")])

Re: [PATCH] x86: slightly enhance "vec_dupv2df"

2023-07-16 Thread Jan Beulich via Gcc-patches
On 17.07.2023 08:09, Hongtao Liu wrote: > On Fri, Jul 14, 2023 at 5:40 PM Jan Beulich via Gcc-patches > wrote: >> >> Introduce a new alternative permitting all 32 registers to be used as >> source without AVX512VL, by broadcasting to the full 512 bits in that >> cas

Re: [PATCH] x86: replace "extendhfdf2" expander

2023-07-14 Thread Jan Beulich via Gcc-patches
On 14.07.2023 12:10, Uros Bizjak wrote: > On Fri, Jul 14, 2023 at 11:44 AM Jan Beulich wrote: >> >> The corresponding insn serves this purpose quite fine, and leads to >> slightly less (generated) code. All we need is the insn to not have a >> leading * in its name,

[PATCH] x86: replace "extendhfdf2" expander

2023-07-14 Thread Jan Beulich via Gcc-patches
The corresponding insn serves this purpose quite fine, and leads to slightly less (generated) code. All we need is the insn to not have a leading * in its name, while retaining that * for "extendhfsf2". Introduce a mode attribute in exchange to achieve that. gcc/ * config/i386/i386.md (ex

[PATCH] x86: avoid maybe_gen_...()

2023-07-14 Thread Jan Beulich via Gcc-patches
In the (however unlikely) event that no insn can be found for the requested mode, using maybe_gen_...() without (really) checking its result for being a null rtx would lead to silent bad code generation. gcc/ * config/i386/i386-expand.cc (ix86_expand_vector_init_duplicate): Use ge

[PATCH] x86: slightly enhance "vec_dupv2df"

2023-07-14 Thread Jan Beulich via Gcc-patches
Introduce a new alternative permitting all 32 registers to be used as source without AVX512VL, by broadcasting to the full 512 bits in that case. (The insn would also permit all registers to be used as destination, but V2DFmode doesn't.) gcc/ * config/i386/sse.md (vec_dupv2df): Add new AV

Re: [PATCH] x86: improve fast bfloat->float conversion

2023-07-11 Thread Jan Beulich via Gcc-patches
On 11.07.2023 08:45, Liu, Hongtao wrote: >> -Original Message- >> From: Jan Beulich >> Sent: Tuesday, July 11, 2023 2:08 PM >> >> There's nothing AVX512BW-ish in here, so no reason to use Yw as the >> constraints for the AVX alternative. Furthermo

[PATCH] x86: improve fast bfloat->float conversion

2023-07-10 Thread Jan Beulich via Gcc-patches
There's nothing AVX512BW-ish in here, so no reason to use Yw as the constraints for the AVX alternative. Furthermore by using the 512-bit form of VPSSLD (in a new alternative) all 32 registers can be used directly by the insn without AVX512VL needing to be enabled. Also adjust the originally last

[PATCH v3] x86: make better use of VBROADCASTSS / VPBROADCASTD

2023-07-10 Thread Jan Beulich via Gcc-patches
... in vec_dupv4sf / *vec_dupv4si. The respective broadcast insns are never longer (yet sometimes shorter) than the corresponding VSHUFPS / VPSHUFD, due to the immediate operand of the shuffle insns balancing the (uniform) need for VEX3 in the broadcast ones. When EVEX encoding is respective the br

Re: [r14-2314 Regression] FAIL: gcc.target/i386/pr100711-2.c scan-assembler-times vpandn 8 on Linux/x86_64

2023-07-07 Thread Jan Beulich via Gcc-patches
On 07.07.2023 09:46, Hongtao Liu wrote: > On Fri, Jul 7, 2023 at 3:18 PM Jan Beulich via Gcc-regression > wrote: >> >> On 06.07.2023 13:57, haochen.jiang wrote: >>> On Linux/x86_64, >>> >>> e007369c8b67bcabd57c4fed8cff2a

Re: [r14-2310 Regression] FAIL: gcc.target/i386/pr53652-1.c scan-assembler-times pandn[ \\t] 2 on Linux/x86_64

2023-07-07 Thread Jan Beulich via Gcc-patches
On 07.07.2023 09:30, Hongtao Liu wrote: > On Fri, Jul 7, 2023 at 3:13 PM Jan Beulich via Gcc-regression > wrote: >> >> On 06.07.2023 13:57, haochen.jiang wrote: >>> On Linux/x86_64, >>> >>> 2d11c99dfca3cc603dbbfafb3afc41

Re: [r14-2314 Regression] FAIL: gcc.target/i386/pr100711-2.c scan-assembler-times vpandn 8 on Linux/x86_64

2023-07-07 Thread Jan Beulich via Gcc-patches
On 06.07.2023 13:57, haochen.jiang wrote: > On Linux/x86_64, > > e007369c8b67bcabd57c4fed8cff2a6db82e78e6 is the first bad commit > commit e007369c8b67bcabd57c4fed8cff2a6db82e78e6 > Author: Jan Beulich > Date: Wed Jul 5 09:49:16 2023 +0200 > > x86: yet more PR targ

Re: [r14-2310 Regression] FAIL: gcc.target/i386/pr53652-1.c scan-assembler-times pandn[ \\t] 2 on Linux/x86_64

2023-07-07 Thread Jan Beulich via Gcc-patches
On 06.07.2023 13:57, haochen.jiang wrote: > On Linux/x86_64, > > 2d11c99dfca3cc603dbbfafb3afc41689a68e40f is the first bad commit > commit 2d11c99dfca3cc603dbbfafb3afc41689a68e40f > Author: Jan Beulich > Date: Wed Jul 5 09:41:09 2023 +0200 > > x86: use VPTERNLOG

Re: [PATCH 2/2] x86: slightly correct / simplify *vec_extractv2ti

2023-07-05 Thread Jan Beulich via Gcc-patches
On 05.07.2023 10:47, Hongtao Liu wrote: > On Wed, Jul 5, 2023 at 4:01 PM Jan Beulich via Gcc-patches > wrote: >> >> V2TImode values cannot appear in the upper 16 YMM registers without >> AVX512VL being enabled. Therefore forcing 512-bit mode (also not >> reflect

Re: [PATCH 1/2] x86: correct / simplify @vec_extract_hi_ and vec_extract_hi_v32qi

2023-07-05 Thread Jan Beulich via Gcc-patches
On 05.07.2023 10:40, Hongtao Liu wrote: > On Wed, Jul 5, 2023 at 4:00 PM Jan Beulich via Gcc-patches > wrote: >> >> The middle alternative each was unusable without enabling AVX512DQ (in >> addition to AVX512VL), which is entirely unrelated here. The last >> alter

[PATCH 2/2] x86: slightly correct / simplify *vec_extractv2ti

2023-07-05 Thread Jan Beulich via Gcc-patches
V2TImode values cannot appear in the upper 16 YMM registers without AVX512VL being enabled. Therefore forcing 512-bit mode (also not reflected in the "mode" attribute) is pointless. gcc/ * config/i386/sse.md (*vec_extractv2ti): Drop g modifiers. --- a/gcc/config/i386/sse.md +++ b/gcc/con

[PATCH 1/2] x86: correct / simplify @vec_extract_hi_ and vec_extract_hi_v32qi

2023-07-05 Thread Jan Beulich via Gcc-patches
The middle alternative each was unusable without enabling AVX512DQ (in addition to AVX512VL), which is entirely unrelated here. The last alternative is usable with AVX512VL only (due to type restrictions on what may be put in the upper 16 YMM registers), and hence is pointlessly forcing 512-bit mod

[PATCH 0/2] x86: vec_extract_* adjustments

2023-07-05 Thread Jan Beulich via Gcc-patches
1: correct / simplify @vec_extract_hi_ and vec_extract_hi_v32qi 2: slightly correct / simplify *vec_extractv2ti Jan

[PATCH] x86: suppress avx512f-copysign.c testcase for 32-bit

2023-07-05 Thread Jan Beulich via Gcc-patches
The test installed by "x86: make VPTERNLOG* usable on less than 512-bit operands with just AVX512F" won't succeed on 32-bit, for floating point operations being done there (by default) without using SIMD insns. gcc/testsuite/ * gcc.target/i386/avx512f-copysign.c: Suppress for 32-bit. --- C

Re: [PATCH v3] x86: make VPTERNLOG* usable on less than 512-bit operands with just AVX512F

2023-07-04 Thread Jan Beulich via Gcc-patches
On 27.06.2023 07:11, Hongtao Liu wrote: > On Tue, Jun 20, 2023 at 5:34 PM Hongtao Liu wrote: >> >> On Tue, Jun 20, 2023 at 5:03 PM Jan Beulich wrote: >>> >>> On 20.06.2023 10:33, Hongtao Liu wrote: >>>> On Tue, Jun 20, 2023 at 3:07 PM Jan Beulich via

Re: [PATCH 1/5] x86: use VPTERNLOG for further bitwise two-vector operations

2023-06-25 Thread Jan Beulich via Gcc-patches
On 25.06.2023 09:30, Hongtao Liu wrote: > On Sun, Jun 25, 2023 at 3:23 PM Hongtao Liu wrote: >> >> On Sun, Jun 25, 2023 at 3:13 PM Hongtao Liu wrote: >>> >>> On Sun, Jun 25, 2023 at 1:52 PM Jan Beulich wrote: >>>> >>>> On 25.06.2023

Re: [PATCH 5/5] x86: yet more PR target/100711-like splitting

2023-06-24 Thread Jan Beulich via Gcc-patches
On 25.06.2023 07:12, Hongtao Liu wrote: > On Wed, Jun 21, 2023 at 2:29 PM Jan Beulich via Gcc-patches > wrote: >> >> --- >> For the purpose here (and elsewhere) bcst_vector_operand() (really: >> bcst_mem_operand()) isn't permissive enough: We'd want it

Re: [PATCH 4/5] x86: further PR target/100711-like splitting

2023-06-24 Thread Jan Beulich via Gcc-patches
On 25.06.2023 07:06, Hongtao Liu wrote: > On Wed, Jun 21, 2023 at 2:28 PM Jan Beulich via Gcc-patches > wrote: >> >> With respective two-operand bitwise operations now expressable by a >> single VPTERNLOG, add splitters to also deal with ior and xor >> counterparts

Re: [PATCH 1/5] x86: use VPTERNLOG for further bitwise two-vector operations

2023-06-24 Thread Jan Beulich via Gcc-patches
On 25.06.2023 06:42, Hongtao Liu wrote: > On Wed, Jun 21, 2023 at 2:26 PM Jan Beulich via Gcc-patches > wrote: >> >> +(define_code_iterator andor [and ior]) >> +(define_code_attr nlogic [(and "nor") (ior "nand")]) >> +(define

Re: [PATCH v2] x86: make better use of VBROADCASTSS / VPBROADCASTD

2023-06-21 Thread Jan Beulich via Gcc-patches
On 21.06.2023 09:44, Jan Beulich wrote: > On 21.06.2023 09:37, Hongtao Liu wrote: >> On Wed, Jun 21, 2023 at 2:06 PM Jan Beulich via Gcc-patches >> wrote: >>> >>> Isn't prefix_extra use bogus here? What extra prefix does vbroadcastss >> According

Re: [PATCH v2] x86: make better use of VBROADCASTSS / VPBROADCASTD

2023-06-21 Thread Jan Beulich via Gcc-patches
On 21.06.2023 09:37, Hongtao Liu wrote: > On Wed, Jun 21, 2023 at 2:06 PM Jan Beulich via Gcc-patches > wrote: >> >> Is there a reason why vec_dupv4sf uses sseshuf1 for its shuffle >> alternatives, but *vec_dupv4si uses sselog1? I'd be happy to correct >> t

[PATCH 5/5] x86: yet more PR target/100711-like splitting

2023-06-20 Thread Jan Beulich via Gcc-patches
Following two-operand bitwise operations, add another splitter to also deal with not followed by broadcast all on its own, which can be expressed as simple embedded broadcast instead once a broadcast operand is actually permitted in the respective insn. While there also permit a broadcast operand i

[PATCH 4/5] x86: further PR target/100711-like splitting

2023-06-20 Thread Jan Beulich via Gcc-patches
With respective two-operand bitwise operations now expressable by a single VPTERNLOG, add splitters to also deal with ior and xor counterparts of the original and-only case. Note that the splitters need to be separate, as the placement of "not" differs in the final insns (*iornot3, *xnor3) which ar

[PATCH 3/5] x86: allow memory operand for AVX2 splitter for PR target/100711

2023-06-20 Thread Jan Beulich via Gcc-patches
The intended broadcast (with AVX512) can very well be done right from memory. gcc/ * config/i386/sse.md: Permit non-immediate operand 1 in AVX2 form of splitter for PR target/100711. --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -17356,7 +17356,7 @@ (and:VI

[PATCH 2/5] x86: use VPTERNLOG also for certain andnot forms

2023-06-20 Thread Jan Beulich via Gcc-patches
When it's the memory operand which is to be inverted, using VPANDN* requires a further load instruction. The same can be achieved by a single VPTERNLOG*. Add two new alternatives (for plain memory and embedded broadcast), adjusting the predicate for the first operand accordingly. Two pre-existing

[PATCH 1/5] x86: use VPTERNLOG for further bitwise two-vector operations

2023-06-20 Thread Jan Beulich via Gcc-patches
All combinations of and, ior, xor, and not involving two operands can be expressed that way in a single insn. gcc/ PR target/93768 * config/i386/i386.cc (ix86_rtx_costs): Further special-case bitwise vector operations. * config/i386/sse.md (*iornot3): New insn.

[PATCH 0/5] x86: make better use of VPTERNLOG{D,Q}

2023-06-20 Thread Jan Beulich via Gcc-patches
While there are some quite sophisticated 4-operand expanders, 2-operand binary logic which can't be expressed by just VPAND, VPANDN, VPOR, or VPXOR doesn't utilize this insn to carry out such operations in a single insn. Therefore the first two patches address one of the sub-aspects of PR target/93

[PATCH v2] x86: make better use of VBROADCASTSS / VPBROADCASTD

2023-06-20 Thread Jan Beulich via Gcc-patches
... in vec_dupv4sf / *vec_dupv4si. The respective broadcast insns are never longer (yet sometimes shorter) than the corresponding VSHUFPS / VPSHUFD, due to the immediate operand of the shuffle insns balancing the possible need for VEX3 in the broadcast ones. When EVEX encoding is required the broad

[PATCH] x86: add -mprefer-vector-width=512 to new avx512f-dupv2di.c testcase

2023-06-20 Thread Jan Beulich via Gcc-patches
This is to cover testing also being done with -march=cascadelake. --- Committing as obvious. --- a/gcc/testsuite/gcc.target/i386/avx512f-dupv2di.c +++ b/gcc/testsuite/gcc.target/i386/avx512f-dupv2di.c @@ -1,5 +1,5 @@ /* { dg-do compile { target { ! ia32 } } } */ -/* { dg-options "-mavx512f -mno-a

Re: [PATCH v3] x86: make VPTERNLOG* usable on less than 512-bit operands with just AVX512F

2023-06-20 Thread Jan Beulich via Gcc-patches
On 20.06.2023 10:33, Hongtao Liu wrote: > On Tue, Jun 20, 2023 at 3:07 PM Jan Beulich via Gcc-patches > wrote: >> >> I guess the underlying pattern, going along the lines of what >> one_cmpl2 uses, can be applied elsewhere >> as well. > That should be guarded

[PATCH v3] x86: make VPTERNLOG* usable on less than 512-bit operands with just AVX512F

2023-06-20 Thread Jan Beulich via Gcc-patches
There's no reason to constrain this to AVX512VL, unless instructed so by -mprefer-vector-width=, as the wider operation is unusable for more narrow operands only when the possible memory source is a non-broadcast one. This way even the scalar copysign3 can benefit from the operation being a single-

Re: [PATCH v2] x86: make VPTERNLOG* usable on less than 512-bit operands with just AVX512F

2023-06-19 Thread Jan Beulich via Gcc-patches
On 19.06.2023 04:07, Liu, Hongtao wrote: >> -Original Message- >> From: Jan Beulich >> Sent: Friday, June 16, 2023 2:22 PM >> >> --- a/gcc/config/i386/sse.md >> +++ b/gcc/config/i386/sse.md >> @@ -12597,11 +12597,11 @@ >> (set_attr

[PATCH v2] x86: make VPTERNLOG* usable on less than 512-bit operands with just AVX512F

2023-06-15 Thread Jan Beulich via Gcc-patches
There's no reason to constrain this to AVX512VL, unless instructed so by -mprefer-vector-width=, as the wider operation is unusable for more narrow operands only when the possible memory source is a non-broadcast one. This way even the scalar copysign3 can benefit from the operation being a single-

[PATCH v2] x86: correct and improve "*vec_dupv2di"

2023-06-15 Thread Jan Beulich via Gcc-patches
The input constraint for the %vmovddup alternative was wrong, as the upper 16 XMM registers require AVX512VL to be used with this insn. To compensate, introduce a new alternative permitting all 32 registers, by broadcasting to the full 512 bits in that case if AVX512VL is not available. gcc/

Re: [PATCH] x86: correct and improve "*vec_dupv2di"

2023-06-15 Thread Jan Beulich via Gcc-patches
On 15.06.2023 09:45, Hongtao Liu wrote: > On Thu, Jun 15, 2023 at 3:07 PM Uros Bizjak via Gcc-patches > wrote: >> On Thu, Jun 15, 2023 at 8:03 AM Jan Beulich via Gcc-patches >> wrote: >>> +case 3: >>> + return "%vmovddup\t{%1, %0|%0, %1}"; &

Re: [PATCH] x86: make better use of VBROADCASTSS / VPBROADCASTD

2023-06-14 Thread Jan Beulich via Gcc-patches
On 15.06.2023 07:23, Hongtao Liu wrote: > On Wed, Jun 14, 2023 at 5:03 PM Jan Beulich wrote: >> >> On 14.06.2023 09:41, Hongtao Liu wrote: >>> On Wed, Jun 14, 2023 at 1:58 PM Jan Beulich via Gcc-patches >>> wrote: >>>> >>>> ... in vec_d

[PATCH] x86: correct and improve "*vec_dupv2di"

2023-06-14 Thread Jan Beulich via Gcc-patches
The input constraint for the %vmovddup alternative was wrong, as the upper 16 XMM registers require AVX512VL to be used with this insn. To compensate, introduce a new alternative permitting all 32 registers, by broadcasting to the full 512 bits in that case if AVX512VL is not available. gcc/

Re: [PATCH] x86: make VPTERNLOG* usable on less than 512-bit operands with just AVX512F

2023-06-14 Thread Jan Beulich via Gcc-patches
On 14.06.2023 10:10, Hongtao Liu wrote: > On Wed, Jun 14, 2023 at 1:59 PM Jan Beulich via Gcc-patches > wrote: >> >> There's no reason to constrain this to AVX512VL, as the wider operation >> is not usable for more narrow operands only when the possible memor

Re: [PATCH] x86: make better use of VBROADCASTSS / VPBROADCASTD

2023-06-14 Thread Jan Beulich via Gcc-patches
On 14.06.2023 09:41, Hongtao Liu wrote: > On Wed, Jun 14, 2023 at 1:58 PM Jan Beulich via Gcc-patches > wrote: >> >> ... in vec_dupv4sf / *vec_dupv4si. The respective broadcast insns are >> never longer (yet sometimes shorter) than the corresponding VSHUFPS / >>

[PATCH] x86: make VPTERNLOG* usable on less than 512-bit operands with just AVX512F

2023-06-13 Thread Jan Beulich via Gcc-patches
There's no reason to constrain this to AVX512VL, as the wider operation is not usable for more narrow operands only when the possible memory source is a non-broadcast one. This way even the scalar copysign3 can benefit from the operation being a single-insn one (leaving aside moves which the compil

[PATCH] x86: make better use of VBROADCASTSS / VPBROADCASTD

2023-06-13 Thread Jan Beulich via Gcc-patches
... in vec_dupv4sf / *vec_dupv4si. The respective broadcast insns are never longer (yet sometimes shorter) than the corresponding VSHUFPS / VPSHUFD, due to the immediate operand of the shuffle insns balancing the need for VEX3 in the broadcast ones. When EVEX encoding is required the broadcast insn

[PATCH] x86: add Bk and Br to comment list B's sub-chars

2023-06-13 Thread Jan Beulich via Gcc-patches
gcc/ * config/i386/constraints.md: Mention k and r for B. --- a/gcc/config/i386/constraints.md +++ b/gcc/config/i386/constraints.md @@ -162,7 +162,9 @@ ;; g GOT memory operand. ;; m Vector memory operand ;; c Constant memory operand +;; k TLS address that allows insn using non-

[PATCH] x86/AVX512: use VMOVDDUP for broadcast to V2DF

2023-06-13 Thread Jan Beulich via Gcc-patches
Like is already the case for the AVX/AVX2 form, VMOVDDUP - acting on double precision floating values - is more appropriate to use here, and it can also result in shorter insn encodings when source is memory or %xmm0...%xmm7, and no masking is applied (in allowing a 2-byte VEX prefix then instead o

Re: [PATCH v3] i386: Allow -mlarge-data-threshold with -mcmodel=large

2023-06-12 Thread Jan Beulich via Gcc-patches
On 13.06.2023 05:28, Fangrui Song wrote: > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/large-data.c > @@ -0,0 +1,13 @@ > +/* { dg-do compile } */ > +/* { dg-require-effective-target lp64 } */ > +/* { dg-options "-O2 -mcmodel=large -mlarge-data-threshold=4" } */ > +/* { dg-final { scan-assem

Re: [PATCH v2] i386: Allow -mlarge-data-threshold with -mcmodel=large

2023-05-26 Thread Jan Beulich via Gcc-patches
On 25.05.2023 18:11, Fangrui Song wrote: > On 2023-05-25, Jan Beulich wrote: >> On 25.05.2023 17:16, Fangrui Song wrote: >>> --- a/gcc/doc/invoke.texi >>> +++ b/gcc/doc/invoke.texi >>> @@ -32942,9 +32942,10 @@ the cache line size. @samp{compat} is the de

Re: [PATCH v2] i386: Allow -mlarge-data-threshold with -mcmodel=large

2023-05-25 Thread Jan Beulich via Gcc-patches
On 25.05.2023 17:16, Fangrui Song wrote: > --- a/gcc/doc/invoke.texi > +++ b/gcc/doc/invoke.texi > @@ -32942,9 +32942,10 @@ the cache line size. @samp{compat} is the default. > > @opindex mlarge-data-threshold > @item -mlarge-data-threshold=@var{threshold} > -When @option{-mcmodel=medium} is s

Re: Ping: [PATCH] testsuite/C++: suppress filename canonicalization in module tests

2023-04-27 Thread Jan Beulich via Gcc-patches
On 28.04.2023 00:24, Nathan Sidwell wrote: > On 4/25/23 11:04, Jan Beulich wrote: >> On 28.06.2022 16:06, Jan Beulich wrote: >>> The pathname underneath gcm.cache/ is determined from the effective name >>> used for the main input file of a particular module. Whe

Re: [PATCH] testsuite: adjust NOP expectations for RISC-V

2023-04-27 Thread Jan Beulich via Gcc-patches
On 26.04.2023 17:45, Palmer Dabbelt wrote: > On Wed, 26 Apr 2023 08:26:26 PDT (-0700), gcc-patches@gcc.gnu.org wrote: >> >> >> On 4/25/23 08:50, Jan Beulich via Gcc-patches wrote: >>> RISC-V will emit ".option nopic" when -fno-pie is in effect, which >>

Ping: [PATCH] testsuite/C++: suppress filename canonicalization in module tests

2023-04-25 Thread Jan Beulich via Gcc-patches
On 28.06.2022 16:06, Jan Beulich wrote: > The pathname underneath gcm.cache/ is determined from the effective name > used for the main input file of a particular module. When modules are > built, no canonicalization occurs for the main input file. Hence the > module file wouldn

[PATCH v2] testsuite/C++: cope with IPv6 being unavailable

2023-04-25 Thread Jan Beulich via Gcc-patches
When IPv6 is disabled in the kernel, the error message coming back from Cody::OpenInet6() is different from the sole so far expected one. --- v2: Re-base. --- a/gcc/testsuite/g++.dg/modules/bad-mapper-3.C +++ b/gcc/testsuite/g++.dg/modules/bad-mapper-3.C @@ -1,6 +1,6 @@ // { dg-additional-option

[PATCH] testsuite: adjust NOP expectations for RISC-V

2023-04-25 Thread Jan Beulich via Gcc-patches
RISC-V will emit ".option nopic" when -fno-pie is in effect, which matches the generic pattern. Just like done for Alpha, special-case RISC-V. --- A couple more targets look to be affected as well, simply because their "no-operation" insn doesn't match the expectation. With the apparently necessary

[PATCH] testsuite/C++: cope with IPv6 being unavailable

2022-06-28 Thread Jan Beulich via Gcc-patches
When IPv6 is disabled in the kernel, the error message coming back from Cody::OpenInet6() is different from the sole so far expected one. gcc/testsuite/ * g++.dg/modules/bad-mapper-3.C: Relax failure pattern. --- a/gcc/testsuite/g++.dg/modules/bad-mapper-3.C +++ b/gcc/testsuite/g++.dg/mo

[PATCH] testsuite/C++: suppress filename canonicalization in module tests

2022-06-28 Thread Jan Beulich via Gcc-patches
The pathname underneath gcm.cache/ is determined from the effective name used for the main input file of a particular module. When modules are built, no canonicalization occurs for the main input file. Hence the module file wouldn't be found if a different (the canonicalized) file name was used whe

Ping: [PATCH] libatomic: drop redundant all-multi command

2022-06-27 Thread Jan Beulich via Gcc-patches
On 27.05.2022 10:01, Jan Beulich wrote: > ./multilib.am already specifies this same command, and make warns about > the earlier one being ignored when seeing the later one. All that needs > retaining to still satisfy the preceding comment is the extra > dependency. &g

[PATCH] testsuite/ix86: SSE2 is a prereq to _Float16 use

2022-06-27 Thread Jan Beulich via Gcc-patches
When enabling AVX512FP via attribute or pragma, the _Float16 type would remain unavailable when at initialization time SSE2 wouldn't be seen as available for use. While this may hint at a wider underlying issue (like the feature, the type may want providing dynamically, albeit this may be challengi

[PATCH] testsuite/ix86: prune MMX ABI warning

2022-06-27 Thread Jan Beulich via Gcc-patches
So far on 32-bit hosts this test failed (for both C and C++) because of the ABI change warning occurring without (explictly) enabling MMX. gcc/testsuite/ * c-c++-common/torture/builtin-shufflevector-2.c: Prune ix86 MMX ABI warning. --- a/gcc/testsuite/c-c++-common/torture/builtin

Re: [PATCH] configure: arrange to use appropriate objcopy

2022-06-07 Thread Jan Beulich via Gcc-patches
On 07.06.2022 09:41, Jakub Jelinek wrote: > On Tue, Jun 07, 2022 at 08:12:26AM +0200, Jan Beulich via Gcc-patches wrote: >>> This regressed >>> Executing on host: /home/jakub/src/gcc/obj44/gcc/xgcc >>> -B/home/jakub/src/gcc/obj44/gcc/ -fdiagnostics-plain-output

Re: [PATCH] configure: arrange to use appropriate objcopy

2022-06-06 Thread Jan Beulich via Gcc-patches
On 04.06.2022 10:32, Jakub Jelinek wrote: > On Thu, Jun 02, 2022 at 05:32:10PM +0200, Jan Beulich via Gcc-patches wrote: >> Using the system objcopy is wrong when other configure checks have >> probed a different set of binutils (I've noticed the problem on a system >> wh

[PATCH] configure: arrange to use appropriate objcopy

2022-06-02 Thread Jan Beulich via Gcc-patches
Using the system objcopy is wrong when other configure checks have probed a different set of binutils (I've noticed the problem on a system where the base objcopy can't deal with compressed debug sections). Arrange for the matching one to be picked up, first and foremost if an "in tree" one is avai

[PATCH] x86-64: make "length_vex" also account for VEX.B use by register operand

2022-06-02 Thread Jan Beulich via Gcc-patches
The length attribute ought to be "the (bounding maximum) length of an instruction" according to the comment next to its definition. A register operand encoded using the ModR/M.rm field will additionally use VEX.B for encoding the highest bit of the register number. Hence for the high 8 GPR register

  1   2   >