On sparc we end up choosing vector(8) for the
condition but vector(2) int for the value of a COND_EXPR but we
fail to verify their shapes match and thus things go downhill.
This is a missed-optimization on the pattern recognition side
as well as unhandled vector decomposition in vectorizable_cond
ifcombine depends on BRANCH_COST and the testcase relies on ifcombine
to fully optimize the function. But the important parts are optimized
everywhere, so the following selectively XFAILs the less important part.
Tested on aarch64 and x86_64-unknown-linux-gnu, pushed.
PR testsuite/117958
-O2" } */
> +
> +_BitInt(32) b;
> +
> +int
> +foo (unsigned short p)
> +{
> + return p == (double) b;
> +}
>
> Jakub
>
>
--
Richard Biener
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
On Wed, 15 Jan 2025, Richard Biener wrote:
> The following makes niter analysis recognize a loop with an exit
> condition scanning over a STRING_CST. This is done via enhancing
> the force evaluation code rather than recognizing for example
> strlen (s) as number of iterations becau
On Fri, Jan 17, 2025 at 1:11 AM Andrew Pinski wrote:
>
> This improves this pattern in 2 ways:
> * Allow for an optional convert, similar to how the few other
> `a OP ~a` patterns also allow for an optional convert.
> * Use bitwise_inverted_equal_p/maybe_bit_not instead of directly
> matching
l + EXTRA, "%s/%d", fname, order);
> + snprintf (s, l + EXTRA, "%s/%d", fname, m_uid);
>
>return s;
> }
> diff --git a/gcc/testsuite/gcc.dg/live-patching-1.c
> b/gcc/testsuite/gcc.dg/live-patching-1.c
> index 6a1ea38c491..e24c1a7a301 100644
> --- a/gcc/testsuite/gcc.dg/live-patching-1.c
> +++ b/gcc/testsuite/gcc.dg/live-patching-1.c
> @@ -19,4 +19,4 @@ int main()
>return 0;
> }
>
> -/* { dg-final { scan-ipa-dump "foo/0 function has external linkage when the
> user requests only inlining static for live patching" "inline" } } */
> +/* { dg-final { scan-ipa-dump "foo/1 function has external linkage when the
> user requests only inlining static for live patching" "inline" } } */
> diff --git a/gcc/testsuite/gcc.dg/live-patching-4.c
> b/gcc/testsuite/gcc.dg/live-patching-4.c
> index ffea8f4cc1c..bd009937cb6 100644
> --- a/gcc/testsuite/gcc.dg/live-patching-4.c
> +++ b/gcc/testsuite/gcc.dg/live-patching-4.c
> @@ -20,4 +20,4 @@ int main()
>return 0;
> }
>
> -/* { dg-final { scan-tree-dump "Inlining foo/0 into main/2" "einline" } } */
> +/* { dg-final { scan-tree-dump "Inlining foo/1 into main/3" "einline" } } */
>
libcpp/lex.c.
> This also requires working value range propagation for s/end. */
> /* { dg-do compile } */
> +/* { dg-add-options vect_early_break } */
> +/* { dg-require-effective-target vect_early_break } */
> /* { dg-require-effective-target vect_int } */
>
> const
When we PHI translate dependent expressions we keep SSA defs in
place of the translated expression in case the expression itself
did not change even though its context did and thus the validity
of ranges associated with it. That eventually leads to simplification
errors given we violate the preco
On Thu, Jan 16, 2025 at 3:17 AM Eugene Rozenfeld
wrote:
>
> I committed the patch to trunk. Is it ok to backport to gcc-12, gcc-13, and
> gcc-14?
Yes.
> -----Original Message-----
> From: Richard Biener
> Sent: Monday, January 13, 2025 11:22 PM
> To: Eugene Rozenfeld
On Wed, 15 Jan 2025, Jakub Jelinek wrote:
> On Wed, Jan 15, 2025 at 03:16:04PM +0100, Richard Biener wrote:
> > > + /* If IPA-VRP proves called function always returns a singleton
> > > range,
> > > + the return value is replaced by the only value in that
be a compile time constant
> expression
> +inside parens. The constant expression can return a container with data and
> size
> +member functions, following similar rules as C++26 @code{static_assert}
> +message. Any string is converted to the character set of the source code.
> +When this fea
15,7 +3616,9 @@ recognise_vec_perm_simplify_seq (gassign *stmt,
> vec_perm_simplify_seq *seq)
> return false;
>
>unsigned HOST_WIDE_INT v_1_nelts, v_2_nelts;
> - if (!VECTOR_CST_NELTS (v_1_sel).is_constant (&v_1_nelts)
> + if (TREE_CODE (v_1_sel) != V
> {
> - m_vec->m_vecpfx.m_num = 0;
> + m_vec->truncate (0);
>return;
> }
>
>
> Jakub
>
>
&v_1_nelts)
> + || !VECTOR_CST_NELTS (v_2_sel).is_constant (&v_2_nelts))
> return false;
>
> - if (nelts != VECTOR_CST_NELTS (v_1_sel).to_constant ()
> - || nelts != VECTOR_CST_NELTS (v_2_sel).to_constant ())
> + if (nelts != v_1_nelts || nelts != v_2_nelts)
> r
On Wed, 15 Jan 2025, Jakub Jelinek wrote:
> On Wed, Jan 15, 2025 at 10:42:12AM +0100, Richard Biener wrote:
> > Yes. I'll note there's a PR (or a bunch of) which are about
> >
> > x = FOO (y, ..);
> >
> >
> > vs.
> >
> > FOO (
On Wed, 15 Jan 2025, Jakub Jelinek wrote:
> On Wed, Jan 15, 2025 at 11:46:28AM +0100, Jakub Jelinek wrote:
> > BTW, I think we don't optimize returns-arg stuff like that at least right
> > now, and if we did, it wouldn't be through IPA-VRP, most of the returns-arg
> > functions actually return a p
, sizeof(h));
> + j = h;
> + {
> +c l;
> +l.b[1] = 0;
> +m = l;
> +__builtin_memcpy(&h, &m, sizeof(h));
> + d m = j;
> +__builtin_memcpy(&g, &m, sizeof(g));
> +e = g;
> +m = h;
> +__builtin_memcpy(&g, &m, sizeof
On Wed, 15 Jan 2025, Sam James wrote:
> Richard Biener writes:
>
> > [...]. It also cuts the lines down to 10 entries.
>
> (This version doesn't ;))
Yeah - the one pushed did, I failed to commit & squash that additional
change ...
Richard.
> >
>
> \\\[tail call\\\] \\\[must tail call\\\]" 1 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times " \[^\n\r]* = freddy
> \\\(\[^\n\r]\*\\\); \\\[tail call\\\] \\\[must tail call\\\]" 1 "optimized" }
> } */
> +/* { dg-final { scan-tree-dump-not " (?:bar|freddy) \\\(\[^\n\r]\*\\\);
> \\\[tail call\\\]" "optimized" } } */
>
> __attribute__ ((noipa)) void
> foo (int x)
>
> Jakub
>
>
The following adds /* */ to dbg_line_numbers so there's the chance
to more easily look up the ID of the match.pd line number used for
dumping when you want to debug a specific replacement. It also cuts
the lines down to 10 entries.
static int dbg_line_numbers[1267] = {
/* 0 */ 161, 164
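As a rough illustration of the layout described above (this is not the
generator code from the patch; the function name format_indexed_rows and
its parameters are made up for the example), a table can be written with
an index comment at the start of each row roughly like this:

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper, not part of GCC: format N values as initializer
   rows of at most PER_ROW entries.  Each row starts with a comment that
   holds the index of its first entry.  Returns the number of characters
   the full text needs (snprintf style), or -1 on error.  */
static int
format_indexed_rows (char *buf, size_t len, const int *vals, size_t n,
                     size_t per_row)
{
  size_t used = 0;

  if (per_row == 0)
    return -1;
  if (len)
    buf[0] = '\0';

  for (size_t i = 0; i < n; i++)
    {
      /* Once the buffer is full we only keep counting.  */
      size_t room = used < len ? len - used : 0;
      char *dst = room ? buf + used : NULL;
      int w;

      if (i % per_row == 0)
        w = snprintf (dst, room, "%s/* %zu */ %d", i ? ",\n" : "", i, vals[i]);
      else
        w = snprintf (dst, room, ", %d", vals[i]);
      if (w < 0)
        return -1;
      used += (size_t) w;
    }
  return (int) used;
}
```

Calling it with per_row set to 10 gives the ten-entries-per-line shape
mentioned in the description; the index comment then tells you which ID
a given row starts at.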
On Wed, 15 Jan 2025, Jakub Jelinek wrote:
> On Wed, Jan 15, 2025 at 10:05:35AM +0100, Richard Biener wrote:
> > > When we have return somefn (whatever); where somefn is normally tail
> > > callable and IPA-VRP determines somefn returns a singleton range, VRP
> >
attribute__ ((noinline)) int
> +bar (int x)
> +{
> + foo (x);
> + return 1;
> +}
> +
> +__attribute__ ((noinline)) int
> +baz (int *x)
> +{
> + foo (*x);
> + return 2;
> +}
> +
> +__attribute__((noipa)) int
> +qux (int x)
> +{
> + {
> +int v;
> +foo (x);
> +baz (&v);
> + }
> + [[gnu::musttail]]
> + return bar (x);
> +}
> +
> +__attribute__((noipa)) int
> +corge (int x)
> +{
> + {
> +int v;
> +foo (x);
> +baz (&v);
> + }
> + return bar (x) + 1;
> +}
> +
> +__attribute__ ((noinline)) float
> +freddy (int x)
> +{
> + foo (x);
> + return 1.75f;
> +}
> +
> +__attribute__((noipa)) float
> +garply (int x)
> +{
> + {
> +int v;
> +foo (x);
> +baz (&v);
> + }
> + [[gnu::musttail]]
> + return freddy (x);
> +}
> +
> +__attribute__((noipa)) float
> +quux (int x)
> +{
> + {
> +int v;
> +foo (x);
> +baz (&v);
> + }
> + return freddy (x) + 0.25f;
> +}
> +
> +int v;
> +
> +int
> +main ()
> +{
> + qux (v);
> + corge (v);
> + garply (v);
> + quux (v);
> +}
>
> Jakub
>
>
def __attribute__((__vector_size__ (16))) int C;
> +B a;
> +A b;
> +int c;
> +unsigned int *p;
> +
> +void
> +foo0 (unsigned _BitInt (512) x)
> +{
> + C d = {};
> + _BitInt (1024) e = x | *(A *) __builtin_memset (&b, c, 8);
> + unsigned h = __builtin_std
The following addresses the fact that with loop masking (or regular
mask loads) we do not implement load shortening but we override
the case where we need that for correctness. Likewise when we
attempt to use loop masking to handle large trailing gaps we cannot
do so when there's this overrun case
The following makes niter analysis recognize a loop with an exit
condition scanning over a STRING_CST. This is done via enhancing
the force evaluation code rather than recognizing for example
strlen (s) as number of iterations because it allows handling
some more cases.
STRING_CSTs are easy to h
On Tue, 14 Jan 2025, Christoph Müllner wrote:
> On Tue, Jan 14, 2025 at 1:46 PM Richard Biener wrote:
> >
> > On Tue, 14 Jan 2025, Christoph Müllner wrote:
> >
> > > As reported in PR117079, commit ab18785840d7b8 broke the test pr105493.c.
> > > When lo
t8_t *pix1, int i_pix1, uint8_t *pix2, int i_pix2 )
>
> /* The first loop should be vectorized, which will eliminate redundant stores
> and loads. */
> -/* { dg-final { scan-tree-dump-times " MEM
> \\\[\[\^\]\]\*\\\] = " 4 "slp1" } } */
> +/
When we have the situation of an external SLP node that is
permuted the scalar stmts recorded in the permute node do not
mean the scalar computation can be removed. We are removing
those stmts from the vectorized_scalar_stmts for this reason
but we fail to check this set when we cost scalar stmts.
On Tue, Jan 14, 2025 at 6:05 AM Alexandre Oliva wrote:
>
>
> Arrange for decode_field_reference to use local variables throughout,
> to modify the out parms only when we're about to return non-NULL, and
> to drop the unused case of NULL pand_mask, that had a latent failure
> to detect signbit mask
.dg/field-merge-22.c
> new file mode 100644
> index 0..45b29c0bccaff
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/field-merge-22.c
> @@ -0,0 +1,31 @@
> +/* { dg-do run } */
> +/* { dg-options "-O2" } */
> +
> +/* PR tree-optimization/118456 */
> +/* Check that compares with constants take into account sign/zero extension
> of
> + both the bitfield and of the shifting type. */
> +
> +#define shift (__CHAR_BIT__ - 4)
> +
> +struct S {
> + signed char a : shift + 2;
> + signed char b : shift + 2;
> + short ignore[0];
> +} s;
> +
> +__attribute__((noipa)) int
> +foo (void)
> +{
> + return ((unsigned char) s.a) >> shift == 15
> +&& ((unsigned char) s.b) >> shift == 0;
> +}
> +
> +int
> +main ()
> +{
> + s.a = -1;
> + s.b = 1;
> + if (foo () != 1)
> +__builtin_abort ();
> + return 0;
> +}
>
>
>
On Mon, Jan 13, 2025 at 10:47 PM Eugene Rozenfeld
wrote:
>
> We are initializing both the call graph node count and
>
> the entry block count of the function with the head_count value
>
> from the profile.
>
>
>
> Count propagation algorithm may refine the entry block count
>
> and we may end up w
On Mon, 13 Jan 2025, Richard Biener wrote:
> The following makes niter analysis recognize a loop with an exit
> condition scanning over a STRING_CST. This is done via enhancing
> the force evaluation code rather than recognizing for example
> strlen (s) as number of iterations becau
When vectorizing a load we are now checking alignment before emitting
a vector(1) T load instead of blindly assuming it's OK when we had
a scalar T load. For reasons we're not handling alignment computation
optimally here but we shouldn't ICE when we fall back to loads of T.
The following ensures
cost: %u; "
> - "signed cost: %u\n", was_tie ? "(needed tie breaker)" : "",
> - uns_cost, sgn_cost);
> + fprintf (dump_file, ";; positive division:%s unsigned cost: %u; "
> + "signed cost: %u\
On Fri, Jan 10, 2025 at 9:43 PM Jeff Law wrote:
>
>
>
> On 1/10/25 1:00 AM, Richard Biener wrote:
> >
> > It's a problem we're never going to fully solve. Some of the
> > testcases show missed optimizations which we can work on. Some show
> > we
Here's another fix for a missing check that an IV value fits in a
HWI. It's originally from Stefan.
Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
PR tree-optimization/117119
* tree-data-ref.cc (initialize_matrix_A): Check whether
an INTEGER_CST fits in HWI,
i]);
>
>free_2 ();
> return LDPS_OK;
> @@ -1615,6 +1617,9 @@ onload (struct ld_plugin_tv *tv)
>if (strstr (collect_gcc_options, "'-save-temps'"))
> save_temps = true;
>
> + if (strstr (collect_gcc_options, "'-flto-incremental="))
> + flto_incremental = true;
> +
>if (strstr (collect_gcc_options, "'-v'")
>|| strstr (collect_gcc_options, "'--verbose'"))
> verbose = true;
>
Status
======
The GCC development branch which will become GCC 15 is now in
stage4, open for regression and documentation fixes only.
Quality Data
============
Priority          #   Change from last report
--------        ---   -----------------------
P1               32   +   6
P2
> +}
> +
> +__attribute__((noipa))
> +int f3 ()
> +{
> + if (c.d == a.d
> + && c.e == a.e)
> +return 0;
> + return -1;
> +}
> +
> +__attribute__((noipa))
> +int f4 ()
> +{
> + if (c.d != a.d
> + || c.e != a.e)
> +return -1;
> + return 0;
> +}
> +
> +int main() {
> + if (f1 () < 0
> + || f2 () < 0
> + || f3 () < 0
> + || f4 () < 0)
> +__builtin_abort();
> + return 0;
> +}
>
>
On Fri, 10 Jan 2025, Richard Biener wrote:
> The following were found compiling SPEC CPU 2017 with valgrind.
>
> Bootstrap and regtest pending on x86_64-unknown-linux-gnu.
>
> * tree-vect-slp.cc (vect_analyze_slp): Release saved_stmts
> vector.
> (
> On 11.01.2025 at 11:29, Andrew Pinski wrote:
>
> On Sat, Jan 11, 2025 at 2:10 AM Richard Biener
> wrote:
>>
>>
>>
>>>> On 11.01.2025 at 09:49, Andrew Pinski wrote:
>>>
>>> In this case, early phiopt would get rid of the
> On 11.01.2025 at 09:49, Andrew Pinski wrote:
>
> In this case, early phiopt would get rid of the user-provided predictor
> for hot/cold as it would remove the basic blocks. The easiest and best option
> is for early phi-opt not to do phi-opt if the middle basic-block(s) have either
> a
> On 10.01.2025 at 22:17, Jeff Law wrote:
>
>
>
>> On 1/10/25 4:41 AM, Richard Biener wrote:
>> The following puts in a hard limit on ext-dce because it might end
>> up requiring memory on the order of the number of basic blocks
>> times the numbe
The following were found compiling SPEC CPU 2017 with valgrind.
Bootstrap and regtest pending on x86_64-unknown-linux-gnu.
* tree-vect-slp.cc (vect_analyze_slp): Release saved_stmts
vector.
(vect_build_slp_tree_2): Release new_oprnds_info when not
used.
---
gcc/tr
The following fixes memory leaks found compiling SPEC CPU 2017 with
valgrind.
Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
* df-core.cc (rest_of_handle_df_finish): Release dflow for
problems without free function (like LR).
* gimple-crc-optimization.cc (c
On Fri, Jan 10, 2025 at 3:27 PM Qing Zhao wrote:
>
>
>
> > On Jan 10, 2025, at 03:00, Richard Biener
> > wrote:
> >
> > On Thu, Jan 9, 2025 at 9:39 PM Qing Zhao wrote:
> >>
> >>
> >>
> >>> On Jan 9, 2025, at 14:10, Jeff
Pushed as obvious.
* gcse.cc (pass_hardreg_pre::gate): Wrap possibly unused
fun argument.
---
gcc/gcse.cc | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/gcc/gcse.cc b/gcc/gcse.cc
index 3f3f7fe15b0..4ae19f28430 100644
--- a/gcc/gcse.cc
+++ b/gcc/gcse.cc
@@ -435
The following puts in a hard limit on ext-dce because it might end
up requiring memory on the order of the number of basic blocks
times the number of pseudo registers. The limiting follows what
GCSE based passes do and thus I re-use --param max-gcse-memory here.
This doesn't in any way address th
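To make the order of magnitude concrete, here is a minimal sketch of
such a limit check. It is not the code from the patch: the function
name, the bytes-per-entry parameter and the use of kilobytes for the
limit are assumptions made for illustration only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical check: does a dataflow problem that needs one entry of
   BYTES_PER_ENTRY bytes per (basic block, pseudo register) pair stay
   within LIMIT_KB kilobytes?  Overflow is treated as "too expensive".  */
static bool
dataflow_fits_memory_limit (uint64_t n_blocks, uint64_t n_pseudos,
                            uint64_t bytes_per_entry, uint64_t limit_kb)
{
  /* Guard each multiplication against wrapping around.  */
  if (n_pseudos != 0 && n_blocks > UINT64_MAX / n_pseudos)
    return false;
  uint64_t cells = n_blocks * n_pseudos;

  if (bytes_per_entry != 0 && cells > UINT64_MAX / bytes_per_entry)
    return false;
  uint64_t bytes = cells * bytes_per_entry;

  return bytes / 1024 <= limit_kb;
}
```

A pass using a check of this shape would simply skip its work for a
function when the check fails, which is the "hard limit" behaviour the
description talks about.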
(get_best_mode (end_bit - first_bit, first_bit, 0, ll_end_region,
>ll_align, BITS_PER_WORD, volatilep, &lnmode))
> @@ -8585,7 +8585,7 @@ fold_truth_andor_for_ifcombine (enum tree_code code,
> tree truth_type,
>HOST_WIDE_INT lr_align = TYPE_ALIGN (TREE_TYP
On Thu, Jan 9, 2025 at 9:39 PM Qing Zhao wrote:
>
>
>
> > On Jan 9, 2025, at 14:10, Jeff Law wrote:
> >
> >
> >
> > On 1/9/25 10:48 AM, Qing Zhao wrote:
> >
> >>>
> >>> I think Jeff's patch is not reasonable since it boils down to not diagnose
> >>> -Warray-bounds but instead remove those stmts.
s) {
> + unsigned r;
> + if (s >= -1)
> +return 1;
> + r = 1000;
> + while (s > 1 / r)
> +r /= 2;
> + return g ? 2 : 0;
> +}
> +void y() {
> + for (;;) {
> +b[w(8, *p)] = h;
> +for (; a + k; j = o)
> + i &= c = x(6) < 0;
> + }
> +}
>
>
r_arg,
> + &rl_and_mask, &rl_signbit,
> + &r_xor, &rr_arg, &rr_and_mask,
>&rl_load, rl_loc);
>rr_inner = decode_field_reference (&rr_arg, &rr_bitsize, &rr_bitpos,
>&rr_unsignedp, &rr_reversep, &volatilep,
> - &rr_and_mask, &rr_signbit, &r_xor, 0,
> + &rr_and_mask, &rr_signbit, &r_xor, 0, 0,
>&rr_load, rr_loc);
>
>/* It must be true that the inner operation on the lhs of each
>
>
>
unsigned short b;
> + __builtin_memcpy (&b, x, sizeof (short));
> + if ((b & 15) != 8)
> +return 1;
> + if ((((unsigned char) b) >> 4) > 7)
> +return 1;
> + return 0;
> +}
> +
> +__attribute__((noipa)) int
> +bar (const void *x)
> +{
>
On Thu, 9 Jan 2025, Robert Dubner wrote:
> I am going to trim back some of the older stuff.
>
> > -----Original Message-----
> > From: Richard Biener
> > Sent: Tuesday, January 7, 2025 08:32
> > To: Robert Dubner
> > Cc: jklow...@symas.com; Joseph
On Sat, Dec 21, 2024 at 6:05 AM Alexandre Oliva wrote:
>
> On Dec 20, 2024, Jakub Jelinek wrote:
>
> > On Wed, Dec 18, 2024 at 12:59:11AM -0300, Alexandre Oliva wrote:
> >> * gcc.dg/field-merge-16.c: New.
>
> > Note the test FAILs on i686-linux or on x86_64-linux with -m32.
>
> Indeed, thanks. H
On Mon, Jan 6, 2025 at 2:12 PM Richard Sandiford
wrote:
>
> g:d882fe5150fbbeb4e44d007bb4964e5b22373021, posted at
> https://gcc.gnu.org/pipermail/gcc-patches/2000-July/033786.html ,
> added code to treat:
>
> (set (reg:CC cc) (compare:CC (gt:M (reg:CC cc) 0) (lt:M (reg:CC cc) 0)))
>
> as a nop.
duplicate_loop_body_to_header_edge redirects the original loop entry
edge to the loop copy header and the copied loop exit to the old
loop header. But it does so in the order that requires temporary
space for an extra edge on the original loop header, causing
unnecessary re-allocations. The follo
On Thu, 9 Jan 2025, Zhou Zhao wrote:
>
> On 2025/1/9 3:33 PM, Richard Biener wrote:
> > On Thu, 9 Jan 2025, Zhou Zhao wrote:
> >
> >> On 2025/1/8 6:30 PM, Richard Biener wrote:
> >>> On Wed, 8 Jan 2025, Zhou Zhao wrote:
> >>>
> >>>> On 20
On Thu, Jan 9, 2025 at 9:08 AM Richard Biener
wrote:
>
> On Wed, Jan 8, 2025 at 5:34 PM Qing Zhao wrote:
> >
> >
> >
> > > On Jan 7, 2025, at 07:29, Richard Biener
> > > wrote:
> > >
> > > On Mon, Jan 6, 2025 at 5:40 PM Qing Zhao
On Wed, Jan 8, 2025 at 5:34 PM Qing Zhao wrote:
>
>
>
> > On Jan 7, 2025, at 07:29, Richard Biener wrote:
> >
> > On Mon, Jan 6, 2025 at 5:40 PM Qing Zhao wrote:
> >>
> >>
> >>
> >>> On Jan 6, 2025, at 11:01, Richard Biener
On Thu, 9 Jan 2025, Zhou Zhao wrote:
>
> On 2025/1/8 6:30 PM, Richard Biener wrote:
> > On Wed, 8 Jan 2025, Zhou Zhao wrote:
> >
> >> On 2025/1/8 5:04 PM, Richard Biener wrote:
> >>> On Wed, 8 Jan 2025, Zhou Zhao wrote:
> >>>
> >>>> On 2
&& *inv_expr != NULL)).
>
> > The patch you posted instead of just adjusting complexity seems to
> > change the way we distribute the invariant - in particular we now
> > distribute it to parts.offset even when that is not supported
> > (!(ok_with_ratio_p || ok_without_ra
On Wed, 8 Jan 2025, Jan Hubicka wrote:
> > On Tue, 10 Dec 2024, Jan Hubicka wrote:
> >
> > > Hi,
> > > int:
> > > struct foo
> > > {
> > > int a;
> > > void bar() const;
> > > ~foo()
> > > {
> > > if (a != 42)
> > > __builtin_abort ();
> > > }
> > > };
> > > __attribute__ ((no
On Mon, Dec 9, 2024 at 11:15 PM Lewis Hyatt wrote:
>
> On Mon, Dec 09, 2024 at 02:07:07PM +0100, Richard Biener wrote:
> > On Tue, Dec 3, 2024 at 2:42 PM Richard Biener
> > wrote:
> > >
> > > On Tue, Dec 3, 2024 at 2:07 PM Lewis Hyatt wrote:
> > >
On Wed, 8 Jan 2025, Jakub Jelinek wrote:
> On Wed, Jan 08, 2025 at 10:17:59AM +0100, Richard Biener wrote:
> > > As mentioned in the PR, the a r<< (bitsize-b) to a r>> b and similar
> > > match.pd optimization which has been introduced in GCC 15 can introduce
>
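For reference, the identity behind this transform can be checked with a
small stand-alone sketch. The helpers rotl32 and rotr32 below are
hypothetical and mask the count so that every input is well defined in
C; they are not the lrotate32/rrotate32 helpers used by the testcase.
For t in 1..31, rotating left by 32 - t equals rotating right by t. The
boundary is t == 0, where 32 - t equals the bit width, the same point at
which a plain C shift would be undefined.

```c
#include <stdint.h>

/* Rotate X left by N bits; the count is reduced modulo 32 so that no
   shift below ever uses a count outside 0..31.  */
static uint32_t
rotl32 (uint32_t x, unsigned n)
{
  n &= 31;
  return (uint32_t) ((x << n) | (x >> (-n & 31)));
}

/* Rotate X right by N bits, with the same masking of the count.  */
static uint32_t
rotr32 (uint32_t x, unsigned n)
{
  n &= 31;
  return (uint32_t) ((x >> n) | (x << (-n & 31)));
}
```

Because of the masking, rotl32 (x, 32) behaves like rotl32 (x, 0) here;
code that does not mask has to treat that count separately.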
When CD-DCE creates forwarders to reduce false control dependences
it fails to update the irreducible state of the edge and the forwarder
block in case the forwarder groups both normal (entry) edges and edges
from an irreducible region (necessarily backedges). This is because
when we split the first edge, if
On Wed, Jan 8, 2025 at 10:45 AM Jakub Jelinek wrote:
>
> Hi!
>
> Based on the comments in the PR, I've tried to write a patch which would
> try to keep backwards compatibility with the GCC 11-14 *.mod files.
>
> Testcase was
> module a
> use, intrinsic :: iso_c_binding
> end module a
> module b
On Wed, 8 Jan 2025, Jakub Jelinek wrote:
> On Wed, Jan 08, 2025 at 09:14:59AM +0100, Eric Botcazou wrote:
> > > So, this patch is an alternative to the
> > > https://gcc.gnu.org/pipermail/gcc-patches/2024-November/669671.html
> > > patch, which had the major problem that it required changing all t
> +f1 (unsigned x, int t)
> +{
> + return lrotate32 (x, 32 - t);
> +}
> +
> +unsigned long long
> +f2 (unsigned long long x, int t)
> +{
> + return lrotate64 (x, 64 - t);
> +}
> +
> +unsigned
> +f3 (unsigned x, int t)
> +{
> + if (t == 32)
> +__builtin_unreachable ();
> + return lrotate32 (x, 32 - t);
> +}
> +
> +unsigned long long
> +f4 (unsigned long long x, int t)
> +{
> + if (t == 64)
> +__builtin_unreachable ();
> + return lrotate64 (x, 64 - t);
> +}
> +
> +unsigned
> +f5 (unsigned x, int t)
> +{
> + return rrotate32 (x, 32 - t);
> +}
> +
> +unsigned long long
> +f6 (unsigned long long x, int t)
> +{
> + return rrotate64 (x, 64 - t);
> +}
> +
> +unsigned
> +f7 (unsigned x, int t)
> +{
> + if (t == 32)
> +__builtin_unreachable ();
> + return rrotate32 (x, 32 - t);
> +}
> +
> +unsigned long long
> +f8 (unsigned long long x, int t)
> +{
> + if (t == 64)
> +__builtin_unreachable ();
> + return rrotate64 (x, 64 - t);
> +}
>
> Jakub
>
>
On Wed, 8 Jan 2025, Zhou Zhao wrote:
>
> On 2025/1/7 10:45 PM, Richard Biener wrote:
> > On Thu, 2 Jan 2025, 赵洲 wrote:
> >
> >> Add Reviewer Richard Biener.
> >>
> >>
> >>> -----Original Message-----
> >>> From: "Zhou Zhao"
> >
=c++20 -gdwarf-5 -dA" }
> +// { dg-options "-O -std=c++20 -gdwarf-5 -dA -gno-strict-dwarf" }
> // { dg-skip-if "AIX DWARF5" { powerpc-ibm-aix* } }
> -// For -gdwarf-6 hopefully DW_LANG_C_plus_plus_20
> // DW_LANG_C_plus_plus_14 = 0x0021
> +// DW_LNAME_C_plus_plus = 0x0004 202002
> // { dg-final { scan-assembler "0x21\[^\n\r]* DW_AT_language" } } */
> +// { dg-final { scan-assembler "0x4\[^\n\r]* DW_AT_language_name" } } */
> +// { dg-final { scan-assembler "0x31512\[^\n\r]* DW_AT_language_version" } }
> */
>
> int version;
> --- gcc/testsuite/g++.dg/debug/dwarf2/lang-cpp23.C.jj 2025-01-07
> 10:07:54.926007612 +0100
> +++ gcc/testsuite/g++.dg/debug/dwarf2/lang-cpp23.C 2025-01-07
> 10:08:19.206669497 +0100
> @@ -0,0 +1,10 @@
> +// { dg-do compile }
> +// { dg-options "-O -std=c++23 -gdwarf-5 -dA -gno-strict-dwarf" }
> +// { dg-skip-if "AIX DWARF5" { powerpc-ibm-aix* } }
> +// DW_LANG_C_plus_plus_14 = 0x0021
> +// DW_LNAME_C_plus_plus = 0x0004 202302
> +// { dg-final { scan-assembler "0x21\[^\n\r]* DW_AT_language" } } */
> +// { dg-final { scan-assembler "0x4\[^\n\r]* DW_AT_language_name" } } */
> +// { dg-final { scan-assembler "0x3163e\[^\n\r]* DW_AT_language_version" } }
> */
> +
> +int version;
>
> Jakub
>
>
When nonlocal goto lowering creates an artificial label it fails
to adjust its context.
Bootstrapped on x86_64-unknown-linux-gnu, testing in progress
(I doubt good test coverage is present for non-local gotos)
OK when testing succeeds?
Thanks,
Richard.
PR middle-end/118325
* tre
@@
> +// { dg-do compile }
> +// { dg-options "-O1 -fdump-tree-optimized" }
> +struct foo
> +{
> + int a;
> + void bar() const;
> + ~foo()
> + {
> +if (a != 42)
> + __builtin_abort ();
> + }
> +};
> +__attribute__ ((noinline))
> +void test(const struct foo a)
> +{
> +int b = a.a;
> +a.bar();
> +if (a.a != b)
> + __builtin_printf ("optimize me away");
> +}
> +
> +/* { dg-final { scan-tree-dump-not "optimize me away" "optimized" } } */
>
On Thu, 2 Jan 2025, 赵洲 wrote:
> Add Reviewer Richard Biener.
>
>
> > -----Original Message-----
> > From: "Zhou Zhao"
> > Sent: 2025-01-02 19:37:07 (Thursday)
> > To: gcc-patches@gcc.gnu.org
> > Cc: xry...@xry111.site, i...@xen0n.name, chengl...@loongson.cn,
When the RTL unroller handles constant iteration loops it bails out
prematurely when heuristics wouldn't apply any unrolling before
checking #pragma unroll.
Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
PR rtl-optimization/118298
* loop-unroll.cc (decide_unroll_cons
The testcases use -save-temps which doesn't play nice with -flto
and multilib testing resulting in spurious UNRESOLVED like
/usr/lib64/gcc/x86_64-suse-linux/14/../../../../x86_64-suse-linux/bin/ld:
i386:x86-64 architecture of input file `./convert-dfp-2.ltrans0.ltrans.o' is
incompatible with i38
ave been three years ago.
>
> > -----Original Message-----
> > From: Richard Biener
> > Sent: Sunday, December 22, 2024 07:18
> > To: jklow...@symas.com
> > Cc: gcc-patches@gcc.gnu.org
> > Subject: Re: [PATCH] COBOL 3/8 gen: GENERIC interface
> >
>
On Mon, Jan 6, 2025 at 5:40 PM Qing Zhao wrote:
>
>
>
> > On Jan 6, 2025, at 11:01, Richard Biener wrote:
> >
> > On Mon, Jan 6, 2025 at 3:43 PM Qing Zhao wrote:
> >>
> >>
> >>
> >>> On Jan 6, 2025, at 09:21, Jeff Law
When we create the SLP reduction chain epilogue for the PHIs for
the early exit we fail to properly classify the reduction as SLP
reduction chain. The following fixes the corresponding checks.
Bootstrapped and tested on x86_64-unknown-linux-gnu.
Richard.
PR tree-optimization/118269
On Mon, Jan 6, 2025 at 3:43 PM Qing Zhao wrote:
>
>
>
> > On Jan 6, 2025, at 09:21, Jeff Law wrote:
> >
> >
> >
> > On 1/6/25 7:11 AM, Qing Zhao wrote:
> >>>
> >>> Given it doesn't cause user visible UB, we could insert the trap *before*
> >>> the UB inducing statement. That would then make the
On Tue, Dec 31, 2024 at 2:04 PM Tamar Christina wrote:
>
> > -----Original Message-----
> > From: Richard Biener
> > Sent: Wednesday, November 20, 2024 11:28 AM
> > To: Andrew Pinski
> > Cc: gcc-patches@gcc.gnu.org
> > Subject: Re: [PATCH v2 2/3] cfgexpan
> On 06.01.2025 at 06:48, Andi Kleen wrote:
>
> Mark Wielaard writes:
>
>> commit 56946c801a7c ("gimple: Add limit after which slower switchlower
>> algs are used [PR117091] [PR117352]") introduced a limit on the number
>> of cases of a switch. It also bails out on finding jump tables if t
> On 03.01.2025 at 22:48, Jakub Jelinek wrote:
>
> Hi!
>
> As suggested by Richi in the PR, the following patch will fail to DCE
> allocation calls if they have constant size which is too large (over
> PTRDIFF_MAX), or for the case of calloc, if either of the arguments
> is too large (in th
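A stand-alone sketch of the size test described here, assuming nothing
about the actual patch beyond the quoted text: the helper names are
hypothetical, and the real check presumably looks at constant call
arguments in the IL rather than at size_t values.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: a constant allocation size is only considered
   plausible when it does not exceed PTRDIFF_MAX.  */
static bool
alloc_size_plausible (size_t size)
{
  return size <= (size_t) PTRDIFF_MAX;
}

/* Hypothetical calloc variant: reject the call as soon as either
   argument on its own is too large.  */
static bool
calloc_args_plausible (size_t nmemb, size_t size)
{
  return alloc_size_plausible (nmemb) && alloc_size_plausible (size);
}
```

An allocation whose arguments fail such a test would be left in place
instead of being removed as dead code.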
> On 03.01.2025 at 09:49, Jakub Jelinek wrote:
>
> Hi!
>
> As the following testcases show (the latter only if I revert the
> temporary reversion of the C++ large array speedup), the FEs aren't
> really consistent in the type of array CONSTRUCTOR_ELTS indexes,
> it can be bitsizetype, but c
> On 03.01.2025 at 09:44, Jakub Jelinek wrote:
>
> Hi!
>
> When touching the function yesterday, I was surprised to see just
> TREE_CODE (something) != INTEGER_CST checks followed by tree_to_shwi.
> That would ICE if the INTEGER_CST doesn't fit.
>
> I have actually not been able to reprodu
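The general pattern, checking that a constant fits before converting
it, can be sketched outside of GCC as follows. The struct wide_cst
model and both helpers are invented for the example; inside GCC the
corresponding predicate is tree_fits_shwi_p, which is the usual guard
in front of a tree_to_shwi call.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model of an integer constant that may be wider than int64_t:
   a sign flag plus a 64-bit magnitude.  */
struct wide_cst
{
  bool negative;
  uint64_t magnitude;
};

/* True if the constant is representable as int64_t.  */
static bool
fits_int64 (const struct wide_cst *c)
{
  if (c->negative)
    return c->magnitude <= (uint64_t) INT64_MAX + 1;
  return c->magnitude <= (uint64_t) INT64_MAX;
}

/* Convert only after the fit check; returns false instead of
   producing a wrong value (or, in GCC terms, instead of ICEing).  */
static bool
to_int64_checked (const struct wide_cst *c, int64_t *out)
{
  if (!fits_int64 (c))
    return false;
  if (!c->negative)
    *out = (int64_t) c->magnitude;
  else if (c->magnitude == (uint64_t) INT64_MAX + 1)
    *out = INT64_MIN;
  else
    *out = -(int64_t) c->magnitude;
  return true;
}
```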
> On 03.01.2025 at 10:04, Richard Sandiford wrote:
>
> This PR was about a case in which late-combine moved a stack
> deallocation across an earlier stack access. This was possible
> because the deallocation was missing the RTL-SSA equivalent of
> a vop, which in turn was because rtl_proper
> On 03.01.2025 at 16:22, Jeff Law wrote:
>
> So this is an implementation of an idea I had a few years back and
> prototyped last spring to fix pr92539.
>
> pr92539 is a false positive Warray-bounds warning triggered by loop
> unrolling. The warning is in code that will never execute, b
On Wed, Jan 1, 2025 at 1:44 AM Fangrui Song wrote:
>
> so that
> `gcc -c a.cc --coverage -fprofile-prefix-map=$PWD=.`
> does not emit $PWD in the generated a.gcno file.
This looks OK to me. Please leave a few days for others to comment though.
Thanks,
Richard.
> PR gcov-profile/96092
>
> On 02.01.2025 at 10:36, Jakub Jelinek wrote:
>
> Hi!
>
> In order to stress test RAW_DATA_CST handling, I've tested trunk gcc with
> r15-6339 reapplied and a hack where I've changed
> const unsigned int raw_data_min_len = 128;
> to
> const unsigned int raw_data_min_len = 2;
> in cp_lexe
PRE applies GENERIC folding to some component ref components which
might result in invalid GIMPLE, like a VIEW_CONVERT_EXPR wrapping
a REALPART_EXPR as in the PR. The following removes all GENERIC
folding in the code re-constructing a GENERIC component-ref from
the PRE VN IL.
Bootstrap and regtes
The following avoids applying TER to direct internal functions that
are tailcall since the involved expansion code path doesn't honor
TER constraints.
Bootstrap and regtest running on x86_64-unknown-linux-gnu.
PR middle-end/118174
* tree-outof-ssa.cc (ssa_is_replaceable_p): Exclud
On Fri, Dec 27, 2024 at 2:27 AM Lewis Hyatt wrote:
>
> Hello-
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118205
>
> The PR shows that on some code involving indexing into a zero-length array
> in a loop, we try to look up in reduction_phi() a statement that is not a
> PHI. Since r15-6001, th
> On 28.12.2024 at 09:13, Jakub Jelinek wrote:
>
> Hi!
>
> The following testcases ICE because fold_array_ctor_reference in the
> RAW_DATA_CST handling just return build_int_cst without actually checking
> that if type is non-NULL, TREE_TYPE (val) is uselessly convertible to it.
>
> By fal
> On 23.12.2024 at 10:57, Robin Dapp wrote:
>
>
>>
>> I don't quite understand - you are checking loop_vinfo->vector_mode, but
>> how can you be sure no chosen vector uses a !VECTOR_MODE_P? It seems
>> fragile to rely on (it might work in this case), instead when any
>> !VECTOR_MODE_P nee
> On 22.12.2024 at 23:10, Christoph Müllner wrote:
>
> Recently two test cases for PR118149 have been added.
> While pr118149-2.c works well for AArch64, pr118149.c fails
> because the expected optimization in forwprop4 cannot be applied
> as SLP vectorization does not happen.
> This patc
(sorry for breaking threading, but quoting the whole mail made my MUA
unbearably slow)
>From 64bcb34e12371f61a8958645e1668e0ac2704391gen.patch 4 Oct 2024 12:01:22
-0400
From: "James K. Lowden"
Date: Thu 12 Dec 2024 06:28:07 PM EST
Subject: [PATCH] Add 'cobol' to 4 files
gcc/cobol/ChangeLog
On Tue, Dec 10, 2024 at 7:30 AM wrote:
>
> From: Pan Li
>
> This patch would like to refactor all the signed SAT_ADD patterns,
> aka:
> * Extract type check outside.
> * Re-arrange the related match pattern forms together.
OK
> The below test suites are passed for this patch.
> * The rv64gcv fu
On Thu, 19 Dec 2024, Robin Dapp wrote:
> > I wonder if LOOP_VINFO_LENS is really empty here? If not, who recorded
> > the len and why did that not disable partial vectors?
>
> It's not empty. vectorizable_operation fills it for a vectype of vector short
> (4). Before (in vector_type_mode), we