Re: [PATCH, V2] Do not enable -mblock-ops-vector-pair.

2022-08-03 Thread Segher Boessenkool
Hi Mike, On Mon, Jul 25, 2022 at 04:15:05PM -0400, Michael Meissner wrote: > Testing has shown that using the load vector pair and store vector pair > instructions for block moves has some performance issues on power10. > This patch eliminates the code setting -mblock-ops-vector-pair. If you > w

Re: [PATCH, rs6000] Correct return value of check_p9modulo_hw_available

2022-08-04 Thread Segher Boessenkool
Hi! On Thu, Aug 04, 2022 at 05:55:20PM +0800, HAO CHEN GUI wrote: > This patch corrects return value of check_p9modulo_hw_available. It should > return 0 when p9modulo is supported. It would be harder to make such mistakes if it used exit() explicitly, so that the reader is reminded the shell s

Re: [PATCH, rs6000] TARGET_MADDLD should include TARGET_POWERPC64

2022-08-04 Thread Segher Boessenkool
Hi! On Thu, Aug 04, 2022 at 11:17:48AM +0800, HAO CHEN GUI wrote: > On 4/8/2022 12:54 AM, Segher Boessenkool wrote: > > Hrm. But the maddld insn is useful for SImode as well, in 32-bit mode, > > it is just its name that is a bit confusing then. Sorry for confusing > > thin

Re: [PATCH 0/5] IEEE 128-bit built-in overload support.

2022-08-05 Thread Segher Boessenkool
On Thu, Jul 28, 2022 at 12:43:49AM -0400, Michael Meissner wrote: > These patches lay the foundation for a set of follow-on patches that will > change the internal handling of 128-bit floating point types in GCC. In the > future patches, I hope to change the compiler to always use KFmode for the >

Re: [PATCH] rs6000: Rework ELFv2 support for -fpatchable-function-entry* [PR99888]

2022-08-09 Thread Segher Boessenkool
Hi! > + /* As ELFv2 ABI shows, the allowable bytes past the global entry > + point are 0, 4, 8, 16, 32 and 64. Considering there are two > + non-prefixed instructions for global entry (8 bytes), the count > + for patchable NOPs before local entry would be 2, 6 and
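The arithmetic behind the quoted comment can be sketched as follows. This is only an illustration, not code from the patch: the helper name, the constants, and the assumption that each NOP is a 4-byte instruction are hypothetical, and only the byte offsets and the 8-byte global entry come from the quoted text.

```python
# Illustrative sketch only; not taken from the rs6000 patch.
# Byte offsets past the global entry point, as listed in the quoted comment.
ALLOWED_OFFSETS = (0, 4, 8, 16, 32, 64)
INSN_BYTES = 4            # assumption: each NOP is one 4-byte instruction
GLOBAL_ENTRY_INSNS = 2    # two non-prefixed instructions (8 bytes)


def nops_before_local_entry(offsets=ALLOWED_OFFSETS):
    """Return how many NOPs fit between the global entry code and the local entry."""
    prologue = GLOBAL_ENTRY_INSNS * INSN_BYTES
    # Only offsets larger than the global entry code leave room for NOPs.
    return [(off - prologue) // INSN_BYTES for off in offsets if off > prologue]


counts = nops_before_local_entry()
```

Each count is simply the space left after the two global-entry instructions, divided by the instruction size.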

Re: [PATCH] rs6000: Rework ELFv2 support for -fpatchable-function-entry* [PR99888]

2022-08-09 Thread Segher Boessenkool
Hi! On Tue, Aug 09, 2022 at 08:51:59PM +0800, Kewen.Lin wrote: > on 2022/8/9 18:35, Segher Boessenkool wrote: > >> +/* As ELFv2 ABI shows, the allowable bytes past the global entry > >> + point are 0, 4, 8, 16, 32 and 64. Considering there are two > >> +

Re: [PATCH v2, rs6000] Add multiply-add expand pattern [PR103109]

2022-08-09 Thread Segher Boessenkool
On Tue, Aug 09, 2022 at 11:14:16AM +0800, Kewen.Lin wrote: > on 2022/8/8 14:04, HAO CHEN GUI wrote: > > +/* { dg-do run { target { has_arch_ppc64 } } } */ > > +/* { dg-options "-O2 -mdejagnu-cpu=power9 -save-temps" } */ > > +/* { dg-require-effective-target int128 } */ > > +/* { dg-require-effectiv

Re: [PATCH v2, rs6000] Add multiply-add expand pattern [PR103109]

2022-08-09 Thread Segher Boessenkool
Hi! On Mon, Aug 08, 2022 at 02:04:07PM +0800, HAO CHEN GUI wrote: > This patch adds an expand and several insns for multiply-add with three > 64bit operands. Also for maddld for 32-bit operands. >"maddld %0,%1,%2,%3" >[(set_attr "type" "mul")]) I suppose attr "size" isn't relevant for

Re: [PATCH] rs6000: Fix incorrect RTL for Power LE when removing the UNSPECS [PR106069]

2022-08-09 Thread Segher Boessenkool
Hi! On Tue, Aug 09, 2022 at 11:01:05AM +0800, Kewen.Lin wrote: > on 2022/8/8 11:42, Xionghu Luo wrote: > > Regression tested pass for Power8{LE,BE}{32,64} and Power{9,10}LE{32,64} > > Sorry, no -m32 for LE testing. You can use -m32 on powerpc64le-*, but the default configuration disallows it. T

Re: [PATCH] rs6000: Enable generate const through pli+pli+rldimi

2022-08-10 Thread Segher Boessenkool
Hi! On Wed, Aug 10, 2022 at 03:11:23PM +0800, Jiufu Guo wrote: > As mentioned in PR106550, since pli could support 34bits immediate, we could > use less instructions(3insn would be ok) to build 64bits constant with pli. > > For example, for constant 0x020805006106003, we could generate it with: >
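The instruction sequence in the mail is cut off above, so the following is only a rough sketch of the idea named in the subject (pli + pli + rldimi), written as a Python model rather than as GCC code. The function names are hypothetical, and the model assumes that each 32-bit half is loaded with one pli and that rldimi inserts the high half into the upper 32 bits.

```python
MASK64 = (1 << 64) - 1


def fits_pli(value):
    """True if value fits a 34-bit signed immediate (the range the mail gives for pli)."""
    return -(1 << 33) <= value < (1 << 33)


def rotl64(x, n):
    """Rotate a 64-bit value left by n bits."""
    n %= 64
    return ((x << n) | (x >> (64 - n))) & MASK64


def rldimi(ra, rs, sh, mb):
    """Model of rldimi: rotate rs left by sh and insert it into ra
    under the mask of IBM-numbered bits mb..(63 - sh)."""
    me = 63 - sh
    if mb > me:
        raise ValueError("wrap-around masks are not modelled in this sketch")
    mask = ((1 << (me - mb + 1)) - 1) << (63 - me)
    return (rotl64(rs, sh) & mask) | (ra & ~mask & MASK64)


def build_const(c):
    """Build a 64-bit constant from two pli-sized halves (hypothetical helper)."""
    hi, lo = (c >> 32) & 0xFFFFFFFF, c & 0xFFFFFFFF
    assert fits_pli(hi) and fits_pli(lo)
    r_hi = hi   # models: pli r_hi, hi
    r_lo = lo   # models: pli r_lo, lo
    return rldimi(r_lo, r_hi, 32, 0)   # models: rldimi r_lo, r_hi, 32, 0


value = build_const(0x020805006106003)
```

Any unsigned 32-bit half fits in a 34-bit signed immediate, which is why three instructions are enough in this model.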

Re: [PATCH 0/5] IEEE 128-bit built-in overload support.

2022-08-10 Thread Segher Boessenkool
On Wed, Aug 10, 2022 at 02:23:27AM -0400, Michael Meissner wrote: > On Fri, Aug 05, 2022 at 01:19:05PM -0500, Segher Boessenkool wrote: > > On Thu, Jul 28, 2022 at 12:43:49AM -0400, Michael Meissner wrote: > > > These patches lay the foundation for a set of follow-on patches that

Re: [PATCH v2] rs6000: Fix incorrect RTL for Power LE when removing the UNSPECS [PR106069]

2022-08-10 Thread Segher Boessenkool
On Wed, Aug 10, 2022 at 02:39:02PM +0800, Xionghu Luo wrote: > On 2022/8/9 11:01, Kewen.Lin wrote: > >I have some concern on those changed "altivec_*_direct", IMHO the suffix > >"_direct" is normally to indicate the define_insn is mapped to the > >corresponding hw insn directly. With this change,

Re: [PATCH v3] Modify combine pattern by a pseudo AND with its nonzero bits [PR93453]

2022-08-10 Thread Segher Boessenkool
Hi! Sorry for the tardiness. On Fri, Jul 22, 2022 at 03:07:55PM +0800, HAO CHEN GUI wrote: > This patch creates a new function - change_pseudo_and_mask. If recog fails, > the function converts a single pseudo to the pseudo AND with a mask if the > outer operator is IOR/XOR/PLUS and inner operat

Re: [PATCH] rs6000: Enable generate const through pli+pli+rldimi

2022-08-11 Thread Segher Boessenkool
Hi! On Thu, Aug 11, 2022 at 08:52:49PM +0800, Jiufu Guo wrote: > Segher Boessenkool writes: > > On Wed, Aug 10, 2022 at 03:11:23PM +0800, Jiufu Guo wrote: > >> @@ -9659,7 +9659,7 @@ (define_split > >> ;; When non-easy constants can go in the TOC, this should us

Re: [PATCH V3 1/4] rs6000: build constant via li;rotldi

2023-06-16 Thread Segher Boessenkool
Hi! On Fri, Jun 16, 2023 at 04:34:12PM +0800, Jiufu Guo wrote: > +/* Check if value C can be built by 2 instructions: one is 'li', another is > + rotldi. > + > + If so, *SHIFT is set to the shift operand of rotldi(rldicl), and *MASK > + is set to -1, and return true. Return false otherwise.
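The check described in the quoted comment can be illustrated with a brute-force model. This is not the function from the patch: the name `li_rotldi` and the return convention are assumptions, and the sketch simply tries every rotation instead of computing the shift directly. It assumes that li loads a sign-extended 16-bit immediate.

```python
MASK64 = (1 << 64) - 1


def rotl64(x, n):
    """Rotate a 64-bit value left by n bits."""
    n %= 64
    return ((x << n) | (x >> (64 - n))) & MASK64


def li_rotldi(c):
    """Return (imm, shift) such that rotating the sign-extended 16-bit
    immediate imm left by shift yields c, or None if no such pair exists.
    A shift of 0 means a plain li would already do."""
    c &= MASK64
    for shift in range(64):
        v = rotl64(c, 64 - shift)            # undo a left rotation by `shift`
        signed = v - (1 << 64) if v >> 63 else v
        if -0x8000 <= signed <= 0x7FFF:      # fits the li immediate
            return signed, shift
    return None


result = li_rotldi(0x7FFF0000)
```

A constant qualifies exactly when some rotation of it has its upper 49 bits all equal, which is what the range test on the signed value checks.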

Re: [PATCH, V6] Fix power10 fusion and -fstack-protector, PR target/105325

2023-06-20 Thread Segher Boessenkool
Hi! The patch looks great now, thank you! But the commit message needs some work: First off, the subject, which is a short (50 character max!) summary of what the patch is about. Fix power10 fusion and -fstack-protector, PR target/105325 There is absolutely nothing to do with stack protector,

Re: [PATCH] rs6000: Don't ICE when generating vector pair load/store insns [PR110411]

2023-07-06 Thread Segher Boessenkool
Hi! On Wed, Jul 05, 2023 at 05:21:18PM +0530, P Jeevitha wrote: > The following patch has been bootstrapped and regtested on powerpc64le-linux. > > while generating vector pairs of load & store instruction, the src address > was treated as an altivec type and that type of address is invalid for

Re: [PATCH] rs6000: Don't ICE when generating vector pair load/store insns [PR110411]

2023-07-06 Thread Segher Boessenkool
On Thu, Jul 06, 2023 at 02:48:19PM -0500, Peter Bergner wrote: > On 7/6/23 12:33 PM, Segher Boessenkool wrote: > > On Wed, Jul 05, 2023 at 05:21:18PM +0530, P Jeevitha wrote: > >> --- a/gcc/config/rs6000/rs6000.cc > >> +++ b/gcc/config/rs6000/rs6000

Re: [PATCH] Fix typo in insn name.

2023-07-10 Thread Segher Boessenkool
Hi! On Mon, Jul 10, 2023 at 03:59:44PM -0400, Michael Meissner wrote: > In doing other work, I noticed that there was an insn: > > vsx_extract_v4sf__load > > Which did not have an iterator. I removed the useless . This patch does that, you mean. > --- a/gcc/config/rs6000/vsx.md > +++ b/

Re: [PATCH v2] swap: Fix incorrect lane extraction by vec_extract() [PR106770]

2023-10-28 Thread Segher Boessenkool
Hi! Please say "rs6000/p8swap:" in the subject, not "swap:" :-) On Sun, Sep 10, 2023 at 10:58:32PM +0530, Surya Kumari Jangala wrote: > Another issue with always handling swappable instructions is that it is > incorrect to do so in webs where loads/stores on quad word aligned > addresses are chan

Re: [PATCH] rs6000, Add missing overloaded bcd builtin tests

2023-10-31 Thread Segher Boessenkool
On Tue, Oct 31, 2023 at 08:31:25AM -0700, Carl Love wrote: > > I just found that actually they have the test coverage, because we > > have > > > > #define __builtin_bcdcmpeq(a,b) __builtin_vec_bcdsub_eq(a,b,0) > > #define __builtin_bcdcmpgt(a,b) __builtin_vec_bcdsub_gt(a,b,0) > > #define __bui

Re: [committed] powerpc: Fix up expansion for WIDEN_MULT_PLUS_EXPR [PR108787]

2023-02-15 Thread Segher Boessenkool
Hi! On Wed, Feb 15, 2023 at 10:18:29AM +0100, Jakub Jelinek wrote: > If we wanted to get back the signed op1 * op2 + op3 all in the DImode > into TImode op0, we'd need to introduce a new tree code next to > WIDEN_MULT_PLUS_EXPR and maddMN4 expander, because I'm afraid it can't > be done at expansi

Re: [PATCH] rs6000: Fix vector parity support [PR108699]

2023-02-16 Thread Segher Boessenkool
Hi! On Thu, Feb 16, 2023 at 05:23:40PM +0800, Kewen.Lin wrote: > This patch is to fix the handling with one more pre-insn > vpopcntb. It also fixes an oversight having V8HI in VEC_IP, > replaces VParity with VEC_IP, and adjusts the existing > UNSPEC_PARITY to a more meaningful name UNSPEC_PARITYB

Re: [PATCH] rs6000: Fix vector parity support [PR108699]

2023-02-16 Thread Segher Boessenkool
Hi! On Thu, Feb 16, 2023 at 08:06:02PM +0800, Kewen.Lin wrote: > on 2023/2/16 19:14, Segher Boessenkool wrote: > > On Thu, Feb 16, 2023 at 05:23:40PM +0800, Kewen.Lin wrote: > >> This patch is to fix the handling with one more pre-insn > >> vpopcntb. It also fixes

Re: [PATCH] rs6000: fmr gets used instead of faster xxlor [PR93571]

2023-02-17 Thread Segher Boessenkool
Hi! On Fri, Feb 17, 2023 at 10:28:41PM +0530, Ajit Agarwal wrote: > This patch replaces fmr instruction (6 cycles) with xxlor instruction ( 2 > cycles) > Bootstrapped and regtested on powerpc64-linux-gnu. You tested this on a CPU that does have VSX. It is incorrect on other (older) CPUs. > ---

Re: [PATCH] rs6000: Fix vector parity support [PR108699]

2023-02-20 Thread Segher Boessenkool
Hi! On Fri, Feb 17, 2023 at 11:33:16AM +0800, Kewen.Lin wrote: > on 2023/2/16 23:10, Segher Boessenkool wrote: > > No, you are right that the semantics are pretty much the same. Please > > just keep UNSPEC_PARITY everywhere. > > OK, since it has UNSPEC, I would hope the re

Re: [PATCH] rs6000: fmr gets used instead of faster xxlor [PR93571]

2023-02-21 Thread Segher Boessenkool
Hi! On Tue, Feb 21, 2023 at 02:18:25PM +0530, Ajit Agarwal wrote: > This patch replaces fmr instruction 6 cycles with 2 cycles xxlor instruction > for p7 and p8 architecture. > > I have implemented with switch and cases otherwise it is difficult to > accommodate > xxlor with p7 and p8 and fmr fo

Re: [PATCH] rs6000: fmr gets used instead of faster xxlor [PR93571]

2023-02-21 Thread Segher Boessenkool
On Tue, Feb 21, 2023 at 06:00:52PM +0530, Ajit Agarwal wrote: > On 21/02/23 4:34 pm, Segher Boessenkool wrote: > > Please don't use a switch, it isn't needed. Instead use the "isa" > > attribute (with p7v here), and put the preferred alternative first. >

Re: [PATCH] rs6000: fmr gets used instead of faster xxlor [PR93571]

2023-02-24 Thread Segher Boessenkool
Hi! For future patches: please don't send patches as replies to existing threads. Just start a new thread for a new patch (series). You can mark it as [PATCH v2] in the subject, if you want. On Fri, Feb 24, 2023 at 01:41:49PM +0530, Ajit Agarwal wrote: > Here is the patch that uses xxlor instea

Re: [PATCH] swap: Fix incorrect lane extraction by vec_extract() [PR106770]

2023-02-27 Thread Segher Boessenkool
Hi! On Wed, Jan 04, 2023 at 01:58:19PM +0530, Surya Kumari Jangala wrote: > In the routine rs6000_analyze_swaps(), special handling of swappable > instructions is done even if the webs that contain the swappable > instructions are not optimized, i.e., the webs do not contain any > permuting load/s

Re: [PATCH] Fix RTL simplifications of FFS, POPCOUNT and PARITY.

2023-02-27 Thread Segher Boessenkool
Hi! On Sun, Feb 26, 2023 at 01:10:41PM -, Roger Sayle wrote: > This patch teaches simplify-rtx.cc to err on the side of caution, by never > creating (new) FFS, POPCOUNT or PARITY rtx with mismatched modes, > matching the documentation. > * simplify-rtx.cc (simplify_unary_operation_1)

Re: [PATCH, rs6000] Tweak modulo define_insns to eliminate register copy

2023-02-27 Thread Segher Boessenkool
Hi! On Mon, Feb 27, 2023 at 09:11:37AM -0600, Pat Haugen wrote: > The define_insns for the modulo operation currently force the target > register > to a distinct reg in preparation for a possible future peephole combining > div/mod. But this can lead to cases of a needless copy being inserted. Fi

Re: [PATCH, rs6000] Tweak modulo define_insns to eliminate register copy

2023-02-27 Thread Segher Boessenkool
Hi! On Mon, Feb 27, 2023 at 02:12:23PM -0600, Pat Haugen wrote: > On 2/27/23 11:08 AM, Segher Boessenkool wrote: > >On Mon, Feb 27, 2023 at 09:11:37AM -0600, Pat Haugen wrote: > >>The define_insns for the modulo operation currently force the target > >>regist

Re: [PATCH] optabs: Fix up expand_doubleword_shift_condmove for shift_mask == 0 [PR108803]

2023-02-27 Thread Segher Boessenkool
On Mon, Feb 27, 2023 at 09:54:06PM +0100, Jakub Jelinek wrote: > Even if the target-independent code doesn't know what the target dependent > code will do, I don't see how it could emit it safely. I always understood RTL to not have anything like C "undefined behavior", but be closer in general (e

Re: [PATCH] optabs: Fix up expand_doubleword_shift_condmove for shift_mask == 0 [PR108803]

2023-02-27 Thread Segher Boessenkool
Hi! On Mon, Feb 27, 2023 at 08:11:09PM +0100, Jakub Jelinek wrote: > (insn 52 48 53 2 (set (reg:CC 66 cc) > (compare:CC (reg:SI 130) > (const_int 0 [0]))) "pr108803.c":12:25 437 {cmpsi} > (expr_list:REG_DEAD (reg:SI 130) > (expr_list:REG_EQUAL (compare:CC (const_in

Re: [PATCH, rs6000] Tweak modulo define_insns to eliminate register copy

2023-02-27 Thread Segher Boessenkool
On Mon, Feb 27, 2023 at 04:03:56PM -0600, Pat Haugen wrote: > On 2/27/23 2:53 PM, Segher Boessenkool wrote: > >"Slightly". It takes 12 cycles for the two in parallel (64-bit, p9), > >but 17 cycles for the "cheaper" sequence (divd+mulld+subf, 12+5+2). It >

Re: [PATCH, V3] PR 107299, GCC does not build on PowerPC when long double is IEEE 128-bit

2023-03-02 Thread Segher Boessenkool
Hi! On Wed, Dec 14, 2022 at 03:29:02PM -0500, Michael Meissner wrote: > These 3 patches fix the problems with building GCC on PowerPC systems when > long > double is configured to use the IEEE 128-bit format. If you are strictly trying to fix a bootstrap problem, you should say so: it should be

Re: [PATCH] swap: Fix incorrect lane extraction by vec_extract() [PR106770]

2023-03-03 Thread Segher Boessenkool
Hi! On Fri, Mar 03, 2023 at 04:29:57PM +0530, Surya Kumari Jangala wrote: > On 27/02/23 9:58 pm, Segher Boessenkool wrote: > > On Wed, Jan 04, 2023 at 01:58:19PM +0530, Surya Kumari Jangala wrote: > >> + register swaps of permuting loads/stores have been removed. */ >

Re: [PATCH 1/2] PR target/107299: Fix build issue when long double is IEEE 128-bit

2023-03-03 Thread Segher Boessenkool
Hi! On Fri, Feb 03, 2023 at 12:49:12AM -0500, Michael Meissner wrote: > This patch updates the IEEE 128-bit types used in libgcc. > > At the moment, we cannot build GCC when the target uses IEEE 128-bit long > doubles, such as building the compiler for a native Fedora 36 system. The > build dies

Re: [PATCH 2/2] Rework 128-bit complex multiply and divide.

2023-03-03 Thread Segher Boessenkool
Hi! On Fri, Feb 03, 2023 at 12:53:05AM -0500, Michael Meissner wrote: > This patch reworks how the complex multiply and divide built-in functions are > done. > I tested all 3 patches for PR target/107299 on: Is this part of the proposed commit message? As Ke Wen pointed out, it is wrong. Most o

Re: [PATCH] PR rtl-optimization/106594: Preserve zero_extend in combine when cheap.

2023-03-04 Thread Segher Boessenkool
On Sat, Mar 04, 2023 at 06:32:15PM -, Roger Sayle wrote: > This patch addresses PR rtl-optimization/106594, a P1 performance > regression affecting aarch64. > > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap > and make -k check, both with and without --target_board=unix{

Re: [PATCH] PR rtl-optimization/106594: Preserve zero_extend in combine when cheap.

2023-03-05 Thread Segher Boessenkool
Hi! On Sun, Mar 05, 2023 at 08:43:20PM +, Tamar Christina wrote: > > On 3/5/23 12:28, Tamar Christina via Gcc-patches wrote: > > > The regression was reported during stage-1. A patch was provided during > > stage 1 and the discussions around combine stalled. > > > > > > The regression for AArc

Re: [PATCH] PR rtl-optimization/106594: Preserve zero_extend in combine when cheap.

2023-03-06 Thread Segher Boessenkool
Hi! On Sun, Mar 05, 2023 at 03:33:40PM -0600, Segher Boessenkool wrote: > On Sun, Mar 05, 2023 at 08:43:20PM +, Tamar Christina wrote: > Yes, *look* better: I have seen no proof or indication that this would ("looks", I cannot type, sorry) > actually generate better cod

Re: [PATCH] combine: Try harder to form zero_extends [PR106594]

2023-03-06 Thread Segher Boessenkool
On Mon, Mar 06, 2023 at 12:47:06PM +, Richard Sandiford wrote: > How about the patch below? What about it? What would make it any better than the previous? Oh, and please do not send new patches in old threads :-( Segher

Re: [PATCH] combine: Try harder to form zero_extends [PR106594]

2023-03-06 Thread Segher Boessenkool
On Mon, Mar 06, 2023 at 05:18:50PM +0100, Jakub Jelinek wrote: > On Mon, Mar 06, 2023 at 03:08:00PM +, Richard Sandiford via Gcc-patches > wrote: > That still feels like it could be risky in stage4, affecting various other > FEs which would be expecting ANDs in their patterns instead of *_EXTE

Re: [PATCH] combine: Try harder to form zero_extends [PR106594]

2023-03-06 Thread Segher Boessenkool
On Mon, Mar 06, 2023 at 04:34:59PM +, Richard Sandiford wrote: > Jakub Jelinek writes: > > Could we have a target hook to canonicalize memory addresses for combiner, > > like we have that targetm.canonicalize_comparison ? > > I don't think a hook makes sense as a long-term design decision. >

Re: [PATCH] combine: Try harder to form zero_extends [PR106594]

2023-03-06 Thread Segher Boessenkool
On Mon, Mar 06, 2023 at 04:34:59PM +, Richard Sandiford wrote: > Jakub Jelinek writes: > > On Mon, Mar 06, 2023 at 03:08:00PM +, Richard Sandiford via Gcc-patches > > wrote: > >> Segher Boessenkool writes: > >> > On Mon, Mar 06, 2023 at 12:47

Re: [PATCH] combine: Try harder to form zero_extends [PR106594]

2023-03-06 Thread Segher Boessenkool
Hi! On Mon, Mar 06, 2023 at 07:13:08PM +, Richard Sandiford wrote: > Segher Boessenkool writes: > > Most importantly, what makes you think this is a problem for aarch64 > > only? If it actually is, you can fix it in the aarch64 config! Either > > with or without new

Re: [PATCH] combine: Try harder to form zero_extends [PR106594]

2023-03-08 Thread Segher Boessenkool
On Wed, Mar 08, 2023 at 11:58:51AM +, Richard Sandiford wrote: > Segher Boessenkool writes: > > An #ifdef is a way of making a change that is not finished yet not hurt > > the other targets. It still hurts generic development, which indirectly > > hurts all targets.

Re: [PATCH] rs6000: Accept const pointer operands for MMA builtins [PR109073]

2023-03-09 Thread Segher Boessenkool
Hi! On Thu, Mar 09, 2023 at 05:30:53PM +0800, Kewen.Lin wrote: > on 2023/3/9 07:01, Peter Bergner via Gcc-patches wrote: > > PR109073 shows a problem where GCC 11 and GCC 10 do not accept a const > > __vector_pair pointer operand to some MMA builtins, which GCC 12 and later > > correctly accept.

Re: [PATCH 2/2] Rework 128-bit complex multiply and divide.

2023-03-09 Thread Segher Boessenkool
On Thu, Mar 09, 2023 at 11:11:34AM -0500, Michael Meissner wrote: > On Fri, Mar 03, 2023 at 03:35:44PM -0600, Segher Boessenkool wrote: > > > +/* { dg-final { scan-assembler "bl __divtc3" } } */ > > > > This name depends on what object format and ABI is in u

Re: [PATCH] [rs6000] adjust return_pc debug attrs

2023-03-13 Thread Segher Boessenkool
Hi! This is stage 1 stuff (or does it fix some regression or such?) On Fri, Mar 03, 2023 at 03:00:02PM -0300, Alexandre Oliva wrote: > Some of the rs6000 call patterns, on some ABIs, issue multiple opcodes > out of a single call insn, but the call (bl) or jump (b) is not always > the last opcode

Re: [PATCH] rs6000: Accept const pointer operands for MMA builtins [PR109073]

2023-03-13 Thread Segher Boessenkool
Hi! On Thu, Mar 09, 2023 at 07:24:58PM -0600, Peter Bergner wrote: > On 3/9/23 8:55 AM, Segher Boessenkool wrote: > >> Nit: Maybe we can build them out of the loop once and then just use the > >> built one in the loop. > > > > Or as globals even. Currently w

Re: [PATCH] [powerpc] Add a peephole2 to eliminate redundant move from VSX_REGS to GENERAL_REGS when it's from memory.

2023-05-15 Thread Segher Boessenkool
On Thu, May 04, 2023 at 01:54:46PM +0800, liuhongt wrote: > r14-172-g0368d169492017 use NO_REGS instead of GENERAL_REGS in memory cost > calculation when preferred register class is unknown. > + /* Costs for NO_REGS are used in cost calculation on the > +1st pass when the preferred regi

Re: [PATCH v5 1/4] rs6000: Enable REE pass by default

2023-05-16 Thread Segher Boessenkool
Hi! On Tue, May 16, 2023 at 11:45:28AM +0530, Ajit Agarwal wrote: > --- a/gcc/doc/invoke.texi > +++ b/gcc/doc/invoke.texi > @@ -12455,8 +12455,8 @@ Attempt to remove redundant extension instructions. > This is especially > helpful for the x86-64 architecture, which implicitly zero-extends in 64

Re: [PATCH v1] tree-ssa-sink: Improve code sinking pass.

2023-05-18 Thread Segher Boessenkool
Hi! On Thu, May 18, 2023 at 12:44:28PM +0530, Ajit Agarwal wrote: > This patch improves code sinking pass to sink statements before call to reduce > register pressure. An example would be useful :-) > * tree-ssa-sink.cc (statement_sink_location): Modified to > move statements before c

Re: [PATCH] [testsuite] [powerpc] adjust -m32 counts for fold-vec-extract*

2023-05-25 Thread Segher Boessenkool
Hi! On Thu, May 25, 2023 at 07:05:55AM -0300, Alexandre Oliva wrote: > On May 25, 2023, "Kewen.Lin" wrote: > > So both lp64 and ilp32 have the same count, could we merge it and > > remove the selectors? > > We could, but... I thought I wouldn't, since they were different > before, and they're l

Re: [PATCH] [testsuite] [powerpc] adjust -m32 counts for fold-vec-extract*

2023-05-25 Thread Segher Boessenkool
Hi Alex, On Thu, May 25, 2023 at 10:55:37AM -0300, Alexandre Oliva wrote: > On May 25, 2023, Segher Boessenkool wrote: > > Fwiw, updating the insn counts blindly like this > > ... is a claim that carries a wildly incorrect and insulting underlying > assumption: Sorry you f

Re: [PATCH] Only use NO_REGS in cost calculation when !hard_regno_mode_ok for GENERAL_REGS and mode.

2023-05-25 Thread Segher Boessenkool
On Thu, May 25, 2023 at 10:29:47AM -0400, Vladimir Makarov wrote: > > On 5/17/23 02:57, liuhongt wrote: > >r14-172-g0368d169492017 replaces GENERAL_REGS with NO_REGS in cost > >calculation when the preferred register class are not known yet. > >It regressed powerpc PR109610 and PR109858, it looks

Re: [BACKPORT] Apply fix for PR libgcc/97643 to gcc 10 branch

2021-01-21 Thread Segher Boessenkool
On Wed, Jan 20, 2021 at 08:28:57PM -0500, Michael Meissner wrote: > On Wed, Jan 20, 2021 at 06:46:14PM -0600, Segher Boessenkool wrote: > > Is there a reason we do not have that testcase in the testsuite, btw? > > In order to test it you need to build a compiler + toolchain wh

Re: [PATCH 1/4] unroll: Add middle-end unroll factor estimation

2021-01-21 Thread Segher Boessenkool
Hi! What is holding up this patch still? Ke Wen has pinged it every month since May, and there has still not been a review. Segher On Thu, May 28, 2020 at 08:19:59PM +0800, Kewen.Lin wrote: > > gcc/ChangeLog > > 2020-MM-DD Kewen Lin > > * cfgloop.h (struct loop): New field estimat

Re: [PATCH 3/4] rs6000: Enable vec_insert for P8 with rs6000_expand_vector_set_var_p8

2021-01-21 Thread Segher Boessenkool
Hi! You never committed 2/4? That makes it harder to review this one :-) On Sat, Oct 10, 2020 at 03:08:24AM -0500, Xionghu Luo wrote: > gcc/ChangeLog: > > 2020-10-10 Xionghu Luo > > * config/rs6000/rs6000-c.c (altivec_resolve_overloaded_builtin): > Generate ARRAY_REF(VIEW_CONVER

Re: [PATCH 4/4] rs6000: Update testcases' instruction count

2021-01-21 Thread Segher Boessenkool
Hi! On Sat, Oct 10, 2020 at 03:08:25AM -0500, Xionghu Luo wrote: > 2020-10-10 Xionghu Luo > > * gcc.target/powerpc/fold-vec-insert-char-p8.c: Adjust > instruction counts. > * gcc.target/powerpc/fold-vec-insert-char-p9.c: Likewise. > * gcc.target/powerpc/fold-vec-insert-

Re: [PATCH/RFC] combine: Tweak the condition of last_set invalidation

2021-01-21 Thread Segher Boessenkool
Hi Ke Wen, On Fri, Jan 15, 2021 at 04:06:17PM +0800, Kewen.Lin wrote: > on 2021/1/15 8:22 AM, Segher Boessenkool wrote: > > On Wed, Dec 16, 2020 at 04:49:49PM +0800, Kewen.Lin wrote: > >>... op regX // this regX could find wrong last_set below > >>regX = ...

Re: [PATCH 1/4] unroll: Add middle-end unroll factor estimation

2021-01-22 Thread Segher Boessenkool
On Fri, Jan 22, 2021 at 02:47:06PM +0100, Richard Biener wrote: > On Thu, 21 Jan 2021, Segher Boessenkool wrote: > > What is holding up this patch still? Ke Wen has pinged it every month > > since May, and there has still not been a review. Richard Sandiford wrote: > FAOD (si

Re: [PATCH] rs6000: Fix up __m64 typedef in mmintrin.h [PR97301]

2021-01-22 Thread Segher Boessenkool
Hi Jakub, On Fri, Jan 22, 2021 at 07:02:04PM +0100, Jakub Jelinek wrote: > The x86 __m64 type is defined as: > /* The Intel API is flexible enough that we must allow aliasing with other >vector types, and their scalar components. */ > typedef int __m64 __attribute__ ((__vector_size__ (8), __m

Re: [PATCH] testsuite: Fix sse2-andnpd-1.c and sse-andnps-1.c testscases on powerpc

2021-01-22 Thread Segher Boessenkool
Hi! On Fri, Jan 22, 2021 at 08:02:28PM +0100, Jakub Jelinek wrote: > On Mon, Sep 21, 2020 at 10:12:20AM +0200, Richard Biener wrote: > > On Mon, 21 Sep 2020, Jan Hubicka wrote: > > > these testcases now fail because they contain an invalid type punning > > > that happens via const VALUE_TYPE *v p

Re: [PATCH 4/4] rs6000: Update testcases' instruction count

2021-01-22 Thread Segher Boessenkool
On Fri, Jan 22, 2021 at 03:02:47PM -0500, David Edelsohn wrote: > All of these testcases now fail on AIX. This was not tested properly. > Please fix. They fail on -m32 Linux as well: all failures are an unexpected count of addi insns. This may be related to the LRA regression we have (just based

Re: [PATCH] rs6000: Fix up __m64 typedef in mmintrin.h [PR97301]

2021-01-22 Thread Segher Boessenkool
Hi! On Sat, Jan 23, 2021 at 01:03:31AM +0100, Jakub Jelinek wrote: > On Fri, Jan 22, 2021 at 05:45:54PM -0600, Segher Boessenkool wrote: > > On Fri, Jan 22, 2021 at 07:02:04PM +0100, Jakub Jelinek wrote: > > > The x86 __m64 type is defined as: > > > /* The Intel API

Re: [PATCH] testsuite: Fix sse2-andnpd-1.c and sse-andnps-1.c testscases on powerpc

2021-01-23 Thread Segher Boessenkool
Hi! On Sat, Jan 23, 2021 at 09:41:23AM +0100, Jakub Jelinek wrote: > On Fri, Jan 22, 2021 at 06:56:37PM -0600, Segher Boessenkool wrote: > > So what is the actual error here? This whole union stuff is because we > > *do* want proper aliasing, afaics. > > The reading throu

Re: [PATCH, rs6000] Deprecate unnecessary __builtin_dfp_dtstsfi_*_dd and td overloads

2021-01-25 Thread Segher Boessenkool
Hi! On Thu, Jan 21, 2021 at 05:49:14PM -0600, will schmidt wrote: > Noted as part of the work-in-progress builtins rewrite, the > __builtin_dfp_dtstsfi_*_{dd,td} builtins are redundant, and are thusly > being marked as deprecated. They will be removed as part of the builtins > rewrite sometime

Re: [PATCH 1/4] unroll: Add middle-end unroll factor estimation

2021-01-25 Thread Segher Boessenkool
Hi! On Mon, Jan 25, 2021 at 05:59:23PM +, Richard Sandiford wrote: > Richard Biener writes: > > On Fri, 22 Jan 2021, Segher Boessenkool wrote: > >> But what could have been done differently that would have helped? Of > >> course Ke Wen could have written a better

Re: [PATCH 3/8] [RS6000] rs6000_rtx_costs tidy AND

2021-01-25 Thread Segher Boessenkool
Hi! On Thu, Oct 08, 2020 at 09:27:55AM +1030, Alan Modra wrote: > * config/rs6000/rs6000.c (rs6000_rtx_costs): Tidy AND code. > Don't avoid recursion on const_int shift count. > > diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c > index e870ba0039a..bc5e51aa5ce 100

Re: [PATCH 4/8] [RS6000] rs6000_rtx_costs tidy break/return

2021-01-25 Thread Segher Boessenkool
On Thu, Oct 08, 2020 at 09:27:56AM +1030, Alan Modra wrote: > Most cases use "return false" rather than breaking out of the switch. > Do so in all cases. > default: > - break; > + return false; > } > - > - return false; > } Please don't do this part. The rest is okay. Than

Re: [PATCH 5/8] [RS6000] rs6000_rtx_costs cost IOR

2021-01-25 Thread Segher Boessenkool
Hi! On Thu, Oct 08, 2020 at 09:27:57AM +1030, Alan Modra wrote: > * config/rs6000/rs6000.c (rotate_insert_cost): New function. > (rs6000_rtx_costs): Cost IOR. > > diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c > index 383d2901c9f..15a806fe307 100644 > --- a/gcc/c

Re: [PATCH,rs6000] Combine patterns for p10 load-cmpi fusion

2021-01-25 Thread Segher Boessenkool
Hi! On Fri, Dec 04, 2020 at 01:19:11PM -0600, acsaw...@linux.ibm.com wrote: > This patch adds the first batch of patterns to support p10 fusion. These > will allow combine to create a single insn for a pair of instructions > that that power10 can fuse and execute. These particular ones have the >

Re: [PATCH 1/4] unroll: Add middle-end unroll factor estimation

2021-01-26 Thread Segher Boessenkool
On Tue, Jan 26, 2021 at 04:53:25PM +0800, Kewen.Lin wrote: > on 2021/1/26 4:37 AM, Segher Boessenkool wrote: > > On Mon, Jan 25, 2021 at 05:59:23PM +, Richard Sandiford wrote: > >> Richard Biener writes: > >>> On Fri, 22 Jan 2021, Segher Boessenkool wrote: > &

Re: [PATCH 1/4] unroll: Add middle-end unroll factor estimation

2021-01-26 Thread Segher Boessenkool
Hi! On Tue, Jan 26, 2021 at 11:47:53AM +0100, Richard Biener wrote: > Anyway, I think the GIMPLE -> RTL transition currently is a too > big step Much agreed. I also think that the expand pass itself needs a lot of work to bring it into this century: it does much too much work, in a circuitous wa

Re: [PATCH] testsuite: Fix sse2-andnpd-1.c and sse-andnps-1.c testscases on x86 and powerpc

2021-01-26 Thread Segher Boessenkool
On Tue, Jan 26, 2021 at 06:29:47PM +0100, Jakub Jelinek wrote: > On Sat, Jan 23, 2021 at 03:10:10PM -0600, Segher Boessenkool wrote: > > > The reason I chose the "no-strict-aliasing" attribute (and already > > > committed based on Richi's ack) was consistency

Re: [PATCH, rs6000] improve vec_ctf invalid parameter handling. (pr91903)

2021-01-27 Thread Segher Boessenkool
Hi! On Mon, Oct 26, 2020 at 04:22:32PM -0500, will schmidt wrote: > Per PR91903, GCC ICEs when we attempt to pass a variable > (or out of range value) into the vec_ctf() builtin. Per > investigation, the parameter checking exists for this > builtin with the int types, but was missing for > the

Re: [PATCH] PowerPC: Map IEEE 128-bit long double built-ins.

2021-01-27 Thread Segher Boessenkool
On Tue, Jan 19, 2021 at 12:24:51PM -0500, Michael Meissner wrote: > On Fri, Jan 15, 2021 at 03:43:13PM -0600, Segher Boessenkool wrote: > > Hi! > > > > On Thu, Jan 14, 2021 at 11:59:19AM -0500, Michael Meissner wrote: > > > >From 78435dee177447080434cdc08fc76b10

Re: [Ping] PowerPC: Map IEEE 128-bit long double built-ins.

2021-01-27 Thread Segher Boessenkool
On Tue, Jan 26, 2021 at 06:39:22PM -0500, Michael Meissner wrote: > Ping https://gcc.gnu.org/pipermail/gcc-patches/2021-January/563496.html > > | Date: Thu, 14 Jan 2021 11:59:19 -0500 > | Subject: [PATCH] PowerPC: Map IEEE 128-bit long double built-ins. > | Message-ID: <20210114165919.ga1...@ibm-t

Re: [PATCH] PowerPC: Map IEEE 128-bit long double built-ins.

2021-01-27 Thread Segher Boessenkool
On Wed, Jan 27, 2021 at 01:06:46PM -0600, will schmidt wrote: > On Thu, 2021-01-14 at 11:59 -0500, Michael Meissner via Gcc-patches wrote: > > November 19th, 2020: > > Message-ID: <20201119235814.ga...@ibm-toto.the-meissners.org> > > Subject and date should be sufficient Only if people pick good

Re: [Ping] PowerPC: Add float128/Decimal conversions.

2021-01-27 Thread Segher Boessenkool
On Tue, Jan 26, 2021 at 06:43:06PM -0500, Michael Meissner wrote: > I posted this patch on January 14th, 2021: > https://gcc.gnu.org/pipermail/gcc-patches/2021-January/563498.html > > | Date: Thu, 14 Jan 2021 12:09:36 -0500 > | Subject: [PATCH] PowerPC: Add float128/Decimal conversions. > | Messag

Re: [PATCH,rs6000] Fusion patterns for logical-logical

2021-01-27 Thread Segher Boessenkool
Hi! On Thu, Dec 10, 2020 at 08:41:11PM -0600, acsaw...@linux.ibm.com wrote: > This patch adds a new function to genfusion.pl to generate patterns for > logical-logical fusion. They are enabled by default for power10 and can > be disabled by -mno-power10-fusion-2logical or -mno-power10-fusion. > +

Re: [PATCH] testsuite: Run vec_insert case on P8 and P9 with option specified

2021-01-28 Thread Segher Boessenkool
Hi! On Thu, Jan 28, 2021 at 02:40:25AM -0600, Xionghu Luo wrote: > Move common functions to header file for cleanup. > > gcc/testsuite/ChangeLog: > > 2021-01-27 Xionghu Luo > > * gcc.target/powerpc/pr79251.p8.c: Move definition to ... > * gcc.target/powerpc/pr79251.h: ...this. >

Re: [Ping] PowerPC: Add float128/Decimal conversions.

2021-01-28 Thread Segher Boessenkool
On Thu, Jan 28, 2021 at 01:10:39PM -0500, Michael Meissner wrote: > > The whole thread is at > > https://patchwork.ozlabs.org/project/gcc/patch/2020112524.ga...@ibm-toto.the-meissners.org/ > > . > > > > I approved *that* version of the patch. > > Yes you approved the built-in renaming patch

Re: [Ping] PowerPC: Add float128/Decimal conversions.

2021-01-28 Thread Segher Boessenkool
On Thu, Jan 28, 2021 at 02:30:56PM -0500, Michael Meissner wrote: > On Thu, Jan 28, 2021 at 12:59:18PM -0600, Segher Boessenkool wrote: > > On Thu, Jan 28, 2021 at 01:10:39PM -0500, Michael Meissner wrote: > > > > The whole thread is at > > > > https://patch

Re: [Ping] PowerPC: Add float128/Decimal conversions.

2021-01-28 Thread Segher Boessenkool
On Thu, Jan 28, 2021 at 01:58:26PM -0600, Peter Bergner wrote: > On 1/28/21 1:47 PM, Segher Boessenkool wrote: > > On Thu, Jan 28, 2021 at 02:30:56PM -0500, Michael Meissner wrote: > >> The second patch I want you to review is: > > > > "This patch r

Re: [PATCH] PR target/98870: Fix IEEE 128-bit fortran test

2021-01-29 Thread Segher Boessenkool
Hi! On Fri, Jan 29, 2021 at 01:44:03PM -0500, Michael Meissner wrote: > This test started failing when I changed the mapping of IEEE 128-bit long > double built-in functions on 2021-01-28. This patch fixes the test so it > uses the correct name. > gcc/testsuite/ > 2021-01-29 Michael Meissner

Re: [PATCH] Make asm not contain prefixed addresses.

2021-02-02 Thread Segher Boessenkool
Hi, On Mon, Feb 01, 2021 at 11:24:42PM -0500, Michael Meissner wrote: > In PR target/98519, the assembler does not like asm memory references that are > prefixed. We can't automatically change the instruction to prefixed form with > a 'p' like we do for normal RTL insns, since this is assembly co

Re: [PATCH] rs6000: Fix MMA API - Add support for compatibility built-ins

2021-02-04 Thread Segher Boessenkool
Hi! On Thu, Feb 04, 2021 at 02:40:20PM -0600, Peter Bergner wrote: > The LLVM and GCC teams agreed to rename the __builtin_mma_assemble_pair and > __builtin_mma_disassemble_pair built-ins to __builtin_vsx_assemble_pair and > __builtin_vsx_disassemble_pair respectively. It's too late to remove the

Re: [PATCH] testsuite: Fix up pr25376.c on powerpc64-linux and array-quals-1.c on powerpc-linux [PR98325]

2021-02-04 Thread Segher Boessenkool
Hi! On Thu, Feb 04, 2021 at 09:26:47PM +0100, Jakub Jelinek wrote: > On Mon, Nov 16, 2020 at 06:14:52PM -0500, David Edelsohn via Gcc-patches > wrote: > > Jenkins does function on AIX. I will take an action item to create > > another LPAR on the AIX systems at OSUOSL for Jenkins and coordinate >

Re: [PATCH] rs6000: Fix MMA API - Add support for compatibility built-ins

2021-02-05 Thread Segher Boessenkool
On Thu, Feb 04, 2021 at 10:05:19PM -0600, Peter Bergner wrote: > On 2/4/21 3:16 PM, Segher Boessenkool wrote: > > On Thu, Feb 04, 2021 at 02:40:20PM -0600, Peter Bergner wrote: > >> The LLVM and GCC teams agreed to rename the __builtin_mma_assemble_pair and > >> __b

Re: [PATCH] rs6000: Fix MMA API - Add support for compatibility built-ins

2021-02-05 Thread Segher Boessenkool
On Fri, Feb 05, 2021 at 04:11:30PM +0100, Florian Weimer wrote: > * Peter Bergner: > > On 2/5/21 4:28 AM, Florian Weimer wrote: > >> Maybe add a check that the compatibility builtins are flagged as > >> available using __has_builtin? > > > > Do you mean add a test in the testsuite for this? I can c

Re: [PATCH, rs6000, expand, hooks]: Fix PR98872, handle uninitialized opaque mode variables

2021-02-08 Thread Segher Boessenkool
Hi! On Mon, Feb 08, 2021 at 12:38:01PM +, Richard Sandiford wrote: > Peter Bergner writes: > > Adding Richard since he's reviewed the generic opaque mode code in > > the past and this patch contains some more generic support. > > > > GCC handles pseudos that are used uninitialized, by emitting

Re: [PING] Add conversions between _Float128 and Decimal.

2021-02-08 Thread Segher Boessenkool
On Mon, Feb 08, 2021 at 11:32:19AM -0500, Michael Meissner wrote: > Ping patch. This really needs to go in to allow switching the long double > type > to IEEE 128-bit. Please send a version that incorporates fixes to Will's nits? Especially fix the copyright dates. Segher

Re: [PATCH,rs6000] Optimize pcrel access of globals [ping]

2021-02-11 Thread Segher Boessenkool
Hi! On Wed, Dec 09, 2020 at 11:04:44AM -0600, acsaw...@linux.ibm.com wrote: > This patch implements a RTL pass that looks for pc-relative loads of the > address of an external variable using the PCREL_GOT relocation and a > single load or store that uses that external address. > --- a/gcc/config.

Re: rs6000: Fix invalid splits when using Altivec style addresses [PR98959]

2021-02-12 Thread Segher Boessenkool
On Fri, Feb 12, 2021 at 02:50:12PM -0600, Peter Bergner wrote: > The rs6000_emit_le_vsx_* functions assume they are not passed an Altivec > style "& ~16" address. However, some of our expanders and splitters do > not verify we do not have an Altivec style address before calling those > functions,

Re: [PATCH] PR 99133, Mark xxspltiw, xxspltidp, and xxsplti32x as being prefixed

2021-02-17 Thread Segher Boessenkool
Hi! On Wed, Feb 17, 2021 at 12:17:30PM -0500, Michael Meissner wrote: > I noticed that the power10 xxspltiw, xxspltidp, and xxsplti32dx > instructions are not flagged as prefixed instructions, which means the > instruction length is not set to 12 bytes. This patch sets these > instructions to be
