On Thu, Sep 28, 2023 at 3:37 PM Richard Sandiford
wrote:
>
> c_readstr only operated on integer modes. It worked by reading
> the source string into an array of HOST_WIDE_INTs, converting
> that array into a wide_int, and from there to an rtx.
>
> It's simpler to do this by building a target memo
On Thu, Sep 28, 2023 at 9:10 PM Jeff Law wrote:
>
>
>
> On 9/28/23 11:26, Jason Merrill wrote:
> > On 9/28/23 05:55, Richard Sandiford wrote:
> >> poly_int was written before the switch to C++11 and so couldn't
> >> use explicit default constructors. This led to an awkward split
> >> between poly
se, 0, rti->stride * rti->nelt);
>
>for (rt = gt_ggc_rtab; *rt; rt++)
> -for (rti = *rt; rti->base != NULL; rti++)
> - memset (rti->base, 0, rti->stride * rti->nelt);
> +ggc_zero_rtab_roots (*rt);
>
>for (rt = gt_pch_scalar_rtab; *rt; rt++)
> for (rti = *rt; rti->base != NULL; rti++)
>
--
Richard Biener
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
On Fri, 29 Sep 2023, Jakub Jelinek wrote:
> On Wed, Sep 27, 2023 at 11:15:26AM +0000, Richard Biener wrote:
> > > tree-vect-patterns.cc:2947 unprom.quick_grow (nops);
> > > T = vect_unpromoted_value
> > > Go for quick_grow_cleared? Something else?
> >
>
t; +gt_pointer_operator, void *)
> +{
> +}
> +
> +template
> void
> -gt_ggc_mx (generic_wide_int *)
> +gt_ggc_mx (generic_wide_int > *)
> {
> }
>
> -template
> +template
> void
> -gt_pch_nx (generic_wide_int *)
> +gt_pch_nx (generic_wide_int
On Thu, 28 Sep 2023, Jakub Jelinek wrote:
> Hi!
>
> On Tue, Aug 29, 2023 at 05:09:52PM +0200, Jakub Jelinek via Gcc-patches wrote:
> > On Tue, Aug 29, 2023 at 11:42:48AM +0100, Richard Sandiford wrote:
> > > > I'll note tree-ssa-loop-niter.cc also uses GMP in some cases, widest_int
> > > > is rea
The following conservatively fixes loop distribution to only
recognize memset/memcpy and friends when at least one element
is going to be processed. This avoids having an unconditional
builtin call in the IL that might imply the source and destination
pointers are non-NULL when originally pointers
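The hazard can be sketched in plain C (an illustrative shape, not the actual testcase): a loop that may run zero times accepts NULL pointers, but an unconditional library call synthesized from it would let later passes assume the pointers are non-NULL.

```c
#include <stddef.h>

/* If loop distribution emitted an unconditional memcpy (dst, src, ...)
   for this loop, the IL would imply dst and src are non-NULL even for
   n == 0, where NULL arguments are perfectly valid.  */
void
copy_ints (int *dst, const int *src, size_t n)
{
  for (size_t i = 0; i < n; i++)   /* may execute zero times */
    dst[i] = src[i];
}
```

Guarding the synthesized call so it is only reached when at least one element is processed keeps the NULL-with-zero-count case correct.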
void *, const void *,
>void *), void *data)
> {
> +#if GCC_VERSION >= 5000
> static_assert (vec_detail::is_trivially_copyable_or_pair ::value, "");
> +#endif
>if (length () > 1)
> gcc_stablesort_r (address (), length (), sizeof (T), cmp, data);
> }
> @@ -1396,7 +1415,9 @@ inline void
> vec::quick_grow (unsigned len)
> {
>gcc_checking_assert (length () <= len && len <= m_vecpfx.m_alloc);
> +#if GCC_VERSION >= 5000
> // static_assert (std::is_trivially_default_constructible ::value, "");
> +#endif
>m_vecpfx.m_num = len;
> }
>
>
> Jakub
>
>
--
Richard Biener
ype @0)) (convert:type @2)
> #endif
>
> /* Simplify pointer equality compares using PTA. */
>
> Jakub
>
>
--
Richard Biener
er_type (prec, unsign);
> -}
> -(convert (bit_xor (negate (convert:inttype @0)) (convert:inttype
> @2)))
> + && (!wascmp || element_precision (type) == 1)
> + && (!TYPE_OVERFLOW_WRAPS (type) || element_precision (type) > 1))
> + (bit_xor (negate (conver
_nonstandard_integer_type (prec, unsign);
> -}
> -(convert (bit_xor (negate (convert:inttype @0)) (convert:inttype
> @2)))
> + && (!wascmp || TYPE_PRECISION (type) == 1))
> + (if ((!TYPE_UNSIGNED (type) && TREE_CODE (type) == BOOLEAN_TYPE)
> + || TYPE_PRECISION (type) == 1)
> +(bit_xor (convert:type @0) @2)
> +(bit_xor (negate (convert:type @0)) @2)
> #endif
>
> /* Simplify pointer equality compares using PTA. */
>
>
> Jakub
>
>
--
Richard Biener
to check the TYPE_PRECISION not being 1
> - here as the powerof2cst case above will handle that case correctly.
> */
> -(if (INTEGRAL_TYPE_P (type) && integer_all_onesp (@2))
> - (negate (convert:type (bit_xor (convert:boolean_type_node @0)
> - { boolean_true_node; }
> + { boolean_true_node; })) { shift; })))
>
> /* (a > 1) ? 0 : (cast)a is the same as (cast)(a == 1)
> for unsigned types. */
>
> Jakub
>
>
--
Richard Biener
On Mon, Oct 2, 2023 at 2:06 PM Sergei Trofimovich wrote:
>
> From: Sergei Trofimovich
>
> Without the change profiled bootstrap fails for various warnings on
> master branch as:
>
> $ ../gcc/configure
> $ make profiledbootstrap
> ...
> gcc/genmodes.cc: In function ‘int main(int, c
The following clarifies the flatten attribute documentation to mention
that the inlining also applies to calls formed as part of inlining
earlier calls, but not to calls to the function itself.
Will push this tomorrow or so if there are no better suggestions
on the wording.
PR ipa/111643
*
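A small sketch of the clarified semantics (function names invented for illustration):

```c
int leaf (int x) { return x + 1; }
int helper (int x) { return leaf (x) * 2; }

/* flatten inlines the call to helper, and also the call to leaf that
   becomes visible as part of inlining helper.  A call to flat_fn
   itself inside the body would not be inlined.  */
__attribute__ ((flatten)) int
flat_fn (int x)
{
  return helper (x);
}
```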
On Wed, 4 Oct 2023, Andre Vieira (lists) wrote:
>
>
> On 30/08/2023 14:04, Richard Biener wrote:
> > On Wed, 30 Aug 2023, Andre Vieira (lists) wrote:
> >
> >> This patch adds a new target hook to enable us to adapt the types of return
> >> and paramet
The following makes sure to treat values whose definition we didn't
visit as available, since those by definition must dominate the entry
of the region. That avoids unpropagated copies after if-conversion
and the resulting SLP discovery failures (SLP discovery doesn't handle
plain copies).
Bootstrapped and teste
When we do SLP discovery of SIMD calls we run into the issue that
when the call is neither a builtin nor an internal function we have
cfn == CFN_LAST, but internal_fn_p of that returns true. Since
IFN_LAST isn't vectorizable we fail spuriously.
Fixed by checking for cfn != CFN_LAST && internal_fn_p (cf
;m_value_range.bottom_p ()
> && !plats->m_value_range.top_p ()
> && dbg_cnt (ipa_cp_vr))
> {
> - ipa_vr vr (plats->m_value_range.m_vr);
> + if (bits)
> + {
> + Value_Range tmp = plats->m_value_range.m_vr;
> + tree type = ipa_get_type (info, i);
> + irange &r = as_a (tmp);
> + irange_bitmask bm (wide_int::from (bits->get_value (),
> + TYPE_PRECISION (type),
> + TYPE_SIGN (type)),
> + wide_int::from (bits->get_mask (),
> + TYPE_PRECISION (type),
> + TYPE_SIGN (type)));
> + r.update_bitmask (bm);
> + ipa_vr vr (tmp);
> + ts->m_vr->quick_push (vr);
> + }
> + else
> + {
> + ipa_vr vr (plats->m_value_range.m_vr);
> + ts->m_vr->quick_push (vr);
> + }
> + }
> + else if (bits)
> + {
> + tree type = ipa_get_type (info, i);
> + Value_Range tmp;
> + tmp.set_varying (type);
> + irange &r = as_a (tmp);
> + irange_bitmask bm (wide_int::from (bits->get_value (),
> + TYPE_PRECISION (type),
> + TYPE_SIGN (type)),
> + wide_int::from (bits->get_mask (),
> + TYPE_PRECISION (type),
> + TYPE_SIGN (type)));
> + r.update_bitmask (bm);
> + ipa_vr vr (tmp);
> ts->m_vr->quick_push (vr);
> }
> else
> @@ -6664,6 +6654,21 @@ ipcp_store_vr_results (void)
> ipa_vr vr;
> ts->m_vr->quick_push (vr);
> }
> +
> + if (!dump_file || !bits)
> + continue;
> +
> + if (!dumped_sth)
> + {
> + fprintf (dump_file, "Propagated bits info for function %s:\n",
> +node->dump_name ());
> + dumped_sth = true;
> + }
> + fprintf (dump_file, " param %i: value = ", i);
> + print_hex (bits->get_value (), dump_file);
> + fprintf (dump_file, ", mask = ");
> + print_hex (bits->get_mask (), dump_file);
> + fprintf (dump_file, "\n");
> }
> }
> }
> @@ -6696,9 +6701,7 @@ ipcp_driver (void)
>ipcp_propagate_stage (&topo);
>/* Decide what constant propagation and cloning should be performed. */
>ipcp_decision_stage (&topo);
> - /* Store results of bits propagation. */
> - ipcp_store_bits_results ();
> - /* Store results of value range propagation. */
> + /* Store results of value range and bits propagation. */
>ipcp_store_vr_results ();
>
>/* Free all IPCP structures. */
> --- gcc/ipa-sra.cc.jj 2023-10-05 11:32:40.233739151 +0200
> +++ gcc/ipa-sra.cc 2023-10-05 11:36:45.408378045 +0200
> @@ -4134,22 +4134,8 @@ zap_useless_ipcp_results (const isra_fun
>else if (removed_item)
> ts->m_agg_values->truncate (dst_index);
>
> - bool useful_bits = false;
> - unsigned count = vec_safe_length (ts->bits);
> - for (unsigned i = 0; i < count; i++)
> -if ((*ts->bits)[i])
> -{
> - const isra_param_desc *desc = &(*ifs->m_parameters)[i];
> - if (desc->locally_unused)
> - (*ts->bits)[i] = NULL;
> - else
> - useful_bits = true;
> -}
> - if (!useful_bits)
> -ts->bits = NULL;
> -
>bool useful_vr = false;
> - count = vec_safe_length (ts->m_vr);
> + unsigned count = vec_safe_length (ts->m_vr);
>for (unsigned i = 0; i < count; i++)
> if ((*ts->m_vr)[i].known_p ())
>{
>
> Jakub
>
>
--
Richard Biener
+** f2:
> +** mov x0, -9223372036854775808
> +** fmovd[0-9]+, x0
> +** orr v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b
> +** ret
> +*/
> +float64_t f2 (float64_t a)
> +{
> + return -fabs (a);
> +}
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_3.c
> b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_3.c
> new file mode 100644
> index
> ..1bf34328d8841de8e6b0a5458562a9f00e31c275
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_3.c
> @@ -0,0 +1,34 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3" } */
> +/* { dg-final { check-function-bodies "**" "" "" { target lp64 } } } */
> +
> +#include
> +#include
> +
> +/*
> +** f1:
> +** ...
> +** ld1wz[0-9]+.s, p[0-9]+/z, \[x0, x2, lsl 2\]
> +** orr z[0-9]+.s, z[0-9]+.s, #0x8000
> +** st1wz[0-9]+.s, p[0-9]+, \[x0, x2, lsl 2\]
> +** ...
> +*/
> +void f1 (float32_t *a, int n)
> +{
> + for (int i = 0; i < (n & -8); i++)
> + a[i] = -fabsf (a[i]);
> +}
> +
> +/*
> +** f2:
> +** ...
> +** ld1dz[0-9]+.d, p[0-9]+/z, \[x0, x2, lsl 3\]
> +** orr z[0-9]+.d, z[0-9]+.d, #0x8000
> +** st1dz[0-9]+.d, p[0-9]+, \[x0, x2, lsl 3\]
> +** ...
> +*/
> +void f2 (float64_t *a, int n)
> +{
> + for (int i = 0; i < (n & -8); i++)
> + a[i] = -fabs (a[i]);
> +}
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_4.c
> b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_4.c
> new file mode 100644
> index
> ..21f2a8da2a5d44e3d01f6604ca7be87e3744d494
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_4.c
> @@ -0,0 +1,37 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3" } */
> +/* { dg-final { check-function-bodies "**" "" "" { target lp64 } } } */
> +
> +#include
> +
> +/*
> +** negabs:
> +** mov x0, -9223372036854775808
> +** fmovd[0-9]+, x0
> +** orr v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b
> +** ret
> +*/
> +double negabs (double x)
> +{
> + unsigned long long y;
> + memcpy (&y, &x, sizeof(double));
> + y = y | (1UL << 63);
> + memcpy (&x, &y, sizeof(double));
> + return x;
> +}
> +
> +/*
> +** negabsf:
> +** moviv[0-9]+.2s, 0x80, lsl 24
> +** orr v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b
> +** ret
> +*/
> +float negabsf (float x)
> +{
> + unsigned int y;
> + memcpy (&y, &x, sizeof(float));
> + y = y | (1U << 31);
> + memcpy (&x, &y, sizeof(float));
> + return x;
> +}
> +
>
--
Richard Biener
On Thu, Oct 5, 2023 at 5:49 PM Andrea Corallo wrote:
>
> Hello all,
>
> this patch checks in mdcompact, the tool written in elisp that I used
> to mass-convert all the multi-choice patterns in the aarch64 back-end to
> the new compact syntax.
>
> I tested it on Emacs 29 (might run on older versions
!= CFN_LAST)
> + return true;
> +
Can you instead move the check inside the if (fndecl) right before
it, changing it to check gimple_call_combined_fn?
OK with that change.
Richard.
> return false;
>}
>
>
>
>
>
>
--
Richard Biener
optab, "cond_len_fnma$a")
> OPTAB_D (cond_len_fnms_optab, "cond_len_fnms$a")
> OPTAB_D (cond_len_neg_optab, "cond_len_neg$a")
> +OPTAB_D (cond_len_copysign_optab, "cond_len_copysign$F$a")
> OPTAB_D (cond_len_one_cmpl_optab, "cond_len_one_cmpl$a")
> OPTAB_D (cmov_optab, "cmov$a6")
> OPTAB_D (cstore_optab, "cstore$a4")
>
>
>
>
>
--
Richard Biener
On Thu, Oct 5, 2023 at 10:46 PM Tamar Christina wrote:
>
> > -----Original Message-----
> > From: Richard Sandiford
> > Sent: Thursday, October 5, 2023 9:26 PM
> > To: Tamar Christina
> > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw
> > ; Marcus Shawcroft
> > ; Kyrylo Tkachov
> > Subject:
On Fri, Oct 6, 2023 at 1:15 AM Andrew Pinski wrote:
>
> Match has a pattern which converts `vec_cond(vec_cond(a,b,0), c, d)`
> into `vec_cond(a & b, c, d)` but since in this case a is a comparison
> fold will change `a & b` back into `vec_cond(a,b,0)` which causes an
> infinite loop.
> The best way
tion. */
>vect_get_vec_defs (loop_vinfo, stmt_info, slp_node, ncopies,
>single_defuse_cycle && reduc_index == 0
>? NULL_TREE : op.ops[0], &vec_oprnds0,
>single_defuse_cycle && reduc_index == 1
>? NULL_TREE : op.ops[1], &vec_oprnds1,
> - op.num_ops == 3
> - && !(single_defuse_cycle && reduc_index == 2)
> + op.num_ops == 4
> + || (op.num_ops == 3
> + && !(single_defuse_cycle && reduc_index == 2))
>? op.ops[2] : NULL_TREE, &vec_oprnds2);
> +
> + /* For single def-use cycles get one copy of the vectorized reduction
> + definition. */
>if (single_defuse_cycle)
> {
>gcc_assert (!slp_node);
> @@ -8301,7 +8387,7 @@ vect_transform_reduction (loop_vec_info loop_vinfo,
> }
>else
> {
> - if (op.num_ops == 3)
> + if (op.num_ops >= 3)
> vop[2] = vec_oprnds2[i];
>
> if (masked_loop_p && mask_by_cond_expr)
> @@ -8314,10 +8400,16 @@ vect_transform_reduction (loop_vec_info loop_vinfo,
> if (emulated_mixed_dot_prod)
> new_stmt = vect_emulate_mixed_dot_prod (loop_vinfo, stmt_info, gsi,
> vec_dest, vop);
> - else if (code.is_internal_fn ())
> +
> + else if (code.is_internal_fn () && !cond_fn_p)
> new_stmt = gimple_build_call_internal (internal_fn (code),
> op.num_ops,
> vop[0], vop[1], vop[2]);
> + else if (code.is_internal_fn () && cond_fn_p)
> + new_stmt = gimple_build_call_internal (internal_fn (code),
> +op.num_ops,
> +vop[0], vop[1], vop[2],
> +vop[1]);
> else
> new_stmt = gimple_build_assign (vec_dest, tree_code (op.code),
> vop[0], vop[1], vop[2]);
> diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
> index f1d0cd79961..e22067400af 100644
> --- a/gcc/tree-vectorizer.h
> +++ b/gcc/tree-vectorizer.h
> @@ -2319,7 +2319,7 @@ extern tree vect_create_addr_base_for_vector_ref
> (vec_info *,
> tree);
>
> /* In tree-vect-loop.cc. */
> -extern tree neutral_op_for_reduction (tree, code_helper, tree);
> +extern tree neutral_op_for_reduction (tree, code_helper, tree, bool = true);
> extern widest_int vect_iv_limit_for_partial_vectors (loop_vec_info
> loop_vinfo);
> bool vect_rgroup_iv_might_wrap_p (loop_vec_info, rgroup_controls *);
> /* Used in tree-vect-loop-manip.cc */
>
--
Richard Biener
On Thu, Sep 14, 2023 at 2:43 PM Di Zhao OS
wrote:
>
> This is a new version of the patch on "nested FMA".
> Sorry for updating this after so long, I've been studying and
> writing micro cases to sort out the cause of the regression.
Sorry for taking so long to reply.
> First, following previous
On Thu, 5 Oct 2023, Jan Hubicka wrote:
[...]
> Richi, can you please look at the gimple matching part?
What did you have in mind? I couldn't find anything obvious in the
patch counting as gimple matching - do you have a pointer?
Thanks,
Richard.
On Fri, 6 Oct 2023, Robin Dapp wrote:
> > We might need a similar assert
> >
> > gcc_assert (HONOR_SIGNED_ZEROS (vectype_out)
> > && !HONOR_SIGN_DEPENDENT_ROUNDING (vectype_out));?
>
> erm, obviously not that exact assert but more something like
>
> if (HONOR_SIGNED_
> On 07.10.2023 at 11:23, Richard Sandiford wrote:
>
> Richard Biener writes:
>> On Thu, 5 Oct 2023, Tamar Christina wrote:
>>
>>>> I suppose the idea is that -abs(x) might be easier to optimize with other
>>>> patterns (consider
nt main (int argc, char **argv)
> }
>
>
> -/* { dg-final { scan-tree-dump "vectorized 1 loops" "vect" { xfail { !
> aarch64_sve } } } } */
> +/* { dg-final { scan-tree-dump "vectorized 1 loops" "vect" { xfail { { !
> aarch64_sve } && { ! riscv_v } } } } } */
> diff --git a/gcc/testsuite/gcc.dg/vect/tsvc/vect-tsvc-s353.c
> b/gcc/testsuite/gcc.dg/vect/tsvc/vect-tsvc-s353.c
> index 58898583c26..98ba7522471 100644
> --- a/gcc/testsuite/gcc.dg/vect/tsvc/vect-tsvc-s353.c
> +++ b/gcc/testsuite/gcc.dg/vect/tsvc/vect-tsvc-s353.c
> @@ -44,4 +44,4 @@ int main (int argc, char **argv)
>return 0;
> }
>
> -/* { dg-final { scan-tree-dump "vectorized 1 loops" "vect" { xfail *-*-* } }
> } */
> +/* { dg-final { scan-tree-dump "vectorized 1 loops" "vect" { xfail { !
> riscv_v } } } } */
> diff --git a/gcc/testsuite/gcc.dg/vect/tsvc/vect-tsvc-s441.c
> b/gcc/testsuite/gcc.dg/vect/tsvc/vect-tsvc-s441.c
> index e73f782ba01..480e5975a36 100644
> --- a/gcc/testsuite/gcc.dg/vect/tsvc/vect-tsvc-s441.c
> +++ b/gcc/testsuite/gcc.dg/vect/tsvc/vect-tsvc-s441.c
> @@ -42,4 +42,4 @@ int main (int argc, char **argv)
>return 0;
> }
>
> -/* { dg-final { scan-tree-dump "vectorized 1 loops" "vect" { xfail { !
> aarch64_sve } } } } */
> +/* { dg-final { scan-tree-dump "vectorized 1 loops" "vect" { xfail { { !
> aarch64_sve } && { ! riscv_v } } } } } */
> diff --git a/gcc/testsuite/gcc.dg/vect/tsvc/vect-tsvc-s443.c
> b/gcc/testsuite/gcc.dg/vect/tsvc/vect-tsvc-s443.c
> index a07800b7c95..709413fa6f8 100644
> --- a/gcc/testsuite/gcc.dg/vect/tsvc/vect-tsvc-s443.c
> +++ b/gcc/testsuite/gcc.dg/vect/tsvc/vect-tsvc-s443.c
> @@ -47,4 +47,4 @@ int main (int argc, char **argv)
>return 0;
> }
>
> -/* { dg-final { scan-tree-dump "vectorized 1 loops" "vect" { xfail { !
> aarch64_sve } } } } */
> +/* { dg-final { scan-tree-dump "vectorized 1 loops" "vect" { xfail { { !
> aarch64_sve } && { ! riscv_v } } } } } */
> diff --git a/gcc/testsuite/gcc.dg/vect/tsvc/vect-tsvc-vif.c
> b/gcc/testsuite/gcc.dg/vect/tsvc/vect-tsvc-vif.c
> index 48e1c141977..6eba46403b4 100644
> --- a/gcc/testsuite/gcc.dg/vect/tsvc/vect-tsvc-vif.c
> +++ b/gcc/testsuite/gcc.dg/vect/tsvc/vect-tsvc-vif.c
> @@ -38,4 +38,4 @@ int main (int argc, char **argv)
>return 0;
> }
>
> -/* { dg-final { scan-tree-dump "vectorized 1 loops" "vect" { xfail { !
> aarch64_sve } } } } */
> +/* { dg-final { scan-tree-dump "vectorized 1 loops" "vect" { xfail { { !
> aarch64_sve } && { ! riscv_v } } } } } */
>
--
Richard Biener
On Sat, 7 Oct 2023, Richard Sandiford wrote:
> Richard Biener writes:
> >> On 07.10.2023 at 11:23, Richard Sandiford wrote:
> >> >> Richard Biener writes:
> >>> On Thu, 5 Oct 2023, Tamar Christina wrote:
> >>>
> >>>>>
d-arith-6.c
> index 2aeebd44f83..c3257890735 100644
> --- a/gcc/testsuite/gcc.dg/vect/vect-cond-arith-6.c
> +++ b/gcc/testsuite/gcc.dg/vect/vect-cond-arith-6.c
> @@ -56,8 +56,8 @@ main (void)
> }
> /* { dg-final { scan-tree-dump-times {vectorizing stmts using SLP} 4 "vect"
multi-step conversion)
For non-VLA and with the single vector size restriction we'd need
unpacking.
So it might be better
{ target { vect_unpack || { vect_vla && vect_sext_char_longlong } } }
where I think neither vect_vla nor vect_sext_char_longlong exists.
Richard - didn
to provide vect_.. dg targets
because then it's at least obvious what is meant. Or group
things as vect_float vect_int.
Richard.
> Regards
> Robin
>
--
Richard Biener
On Sun, Oct 8, 2023 at 9:22 AM Juzhe-Zhong wrote:
>
> Previously, I removed the movmisalign pattern to fix the execution FAILs in
> this commit:
> https://github.com/gcc-mirror/gcc/commit/f7bff24905a6959f85f866390db2fff1d6f95520
>
> I was thinking that RVV doesn't allow misaligned at the beginnin
> +/* { dg-final { scan-tree-dump-times "optimizing condition reduction with
> FOLD_EXTRACT_LAST" 4 "vect" { target { { vect_fold_extract_last } && { !
> vect_pack_trunc } } } } } */
> /* { dg-final { scan-tree-dump-times "condition expression based on integer
>
4,5 @@ main ()
>return 0;
> }
>
> -/* { dg-final { scan-tree-dump-times "vectorized 1 loop" 3 "vect" { target
> vect_pack_trunc } } } */
> -/* { dg-final { scan-tree-dump-times "vectorized 1 loop" 2 "vect" { target {
> ! vect_pack_trunc } } } } */
c.dg/vect/no-scevccp-outer-21.c
> index 72e53c2bfb0..b30a5d78819 100644
> --- a/gcc/testsuite/gcc.dg/vect/no-scevccp-outer-21.c
> +++ b/gcc/testsuite/gcc.dg/vect/no-scevccp-outer-21.c
> @@ -59,4 +59,4 @@ int main (void)
>return 0;
> }
>
> -/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED." 1 "vect" {
> xfail { ! { vect_pack_trunc } } } } } */
> +/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED." 1 "vect" {
> xfail { { ! {vect_pack_trunc } } && { ! {riscv_v } } } } } } */
>
--
Richard Biener
duction, not only
during transform (there the assert is proper, if we can distinguish
the loop mask vs. the COND_ADD here, otherwise just remove it).
Richard.
> Thanks for the pointers.
>
> Regards
> Robin
>
>
--
Richard Biener
On Mon, 9 Oct 2023, Andrew Pinski wrote:
> On Mon, Oct 9, 2023 at 12:20 AM Richard Biener wrote:
> >
> > On Sat, 7 Oct 2023, Richard Sandiford wrote:
> >
> > > Richard Biener writes:
> > > >> On 07.10.2023 at 11:23 Richard Sandiford wrote
On Mon, Oct 9, 2023 at 11:39 AM Tamar Christina wrote:
>
> > -----Original Message-----
> > From: Richard Sandiford
> > Sent: Saturday, October 7, 2023 10:58 AM
> > To: Richard Biener
> > Cc: Tamar Christina ; gcc-patches@gcc.gnu.org;
> > nd ; Richard E
The following improves basic TBAA for access paths formed by
C++ abstraction where we are able to combine a path from an
address-taking operation with a path based on that access using
a pun to avoid memory access semantics on the address-taking part.
The trick is to identify the point the semanti
h } } } */
> -/* { dg-final { scan-tree-dump-times { = \.COND_RDIV} 1 "optimized" { target
> vect_double_cond_arith } } } */
> +/* { dg-final { scan-tree-dump { = \.COND_(LEN_)?ADD} "optimized" { target
> vect_double_cond_arith } } } */
> +/* { dg-final { scan-tree-dump { = \.COND_(LEN_)?SUB} "optimized" { target
> vect_double_cond_arith } } } */
> +/* { dg-final { scan-tree-dump { = \.COND_(LEN_)?MUL} "optimized" { target
> vect_double_cond_arith } } } */
> +/* { dg-final { scan-tree-dump { = \.COND_(LEN_)?RDIV} "optimized" { target
> vect_double_cond_arith } } } */
> /* { dg-final { scan-tree-dump-not {VEC_COND_EXPR} "optimized" { target
> vect_double_cond_arith } } } */
>
--
Richard Biener
On Mon, Oct 9, 2023 at 12:17 PM Richard Sandiford
wrote:
>
> Tamar Christina writes:
> >> -----Original Message-----
> >> From: Richard Sandiford
> >> Sent: Monday, October 9, 2023 10:56 AM
> >> To: Tamar Christina
> >> Cc: Richard Biener ;
> +/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect"
> { target { ! vect_strided6 } } } } */
>
--
Richard Biener
On Mon, 9 Oct 2023, Robin Dapp wrote:
> > Hmm, the function is called at transform time so this shouldn't help
> > avoiding the ICE. I expected we refuse to vectorize _any_ reduction
> > when sign dependent rounding is in effect? OTOH maybe sign-dependent
> > rounding is OK but only when we use
r epilogue loop" 0
> "vect" } } */
> -/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect"
> } } */
> +/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect"
> {target { ! { vect_load
On Mon, 9 Oct 2023, Richard Biener wrote:
> The following improves basic TBAA for access paths formed by
> C++ abstraction where we are able to combine a path from an
> address-taking operation with a path based on that access using
> a pun to avoid memory access semantics on the ad
ontrols the natural exits of the loop. */
> + edge scalar_loop_iv;
All of the above sound as if they were IVs; the access macros have
_EXIT at the end, can you name the fields that way as well?
Otherwise looks good to me.
Feel free to push approved patches of the series, no need to wait
until everything is approv
hecks (gimple *stmt,
>
> int_range_max r;
> if (!ranger->gori ().outgoing_edge_range_p (r, e, idx,
> - *get_global_range_query
> ()))
> + *get_range_query (cfun)))
> continue;
unswitching has a ranger instance but it does perform IL modification.
Did you check whether the use of the global ranger was intentional here?
Specifically we do have the 'ranger' object here and IIRC using global
ranges was intentional. So please leave this change out.
Thanks,
Richard.
> r.intersect (path_range);
> if (r.undefined_p ())
>
--
Richard Biener
iptor (rtx rtl, machine_mod
> mem_loc_result->dw_loc_oprnd1.v.val_die_ref.external = 0;
> mem_loc_result->dw_loc_oprnd2.val_class
> = dw_val_class_wide_int;
> - mem_loc_result->dw_loc_oprnd2.v.val_wide = ggc_alloc ();
> - *mem_loc_result->dw_loc_oprnd2.v.val_wide = rtx_mo
On Mon, Oct 9, 2023 at 11:28 PM Andrew Pinski wrote:
>
> So currently we have a simplification for `a | ~(a ^ b)` but
> that does not match the case where we had originally `(~a) | (a ^ b)`
> so we need to add a new pattern that matches that and uses
> bitwise_inverted_equal_p
> that also catches
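The single-bit identities behind the two shapes can be checked exhaustively (a standalone sanity sketch, not GCC code):

```c
/* Verify, over all single-bit values, the identities behind the
   existing pattern and the newly matched shape:
     a | ~(a ^ b)  ==  a | ~b
     ~a | (a ^ b)  ==  ~(a & b)  */
int
check_or_xor_identities (void)
{
  for (int a = 0; a <= 1; a++)
    for (int b = 0; b <= 1; b++)
      {
        if ((a | (~(a ^ b) & 1)) != (a | !b))
          return 0;
        if ((!a | (a ^ b)) != !(a & b))
          return 0;
      }
  return 1;
}
```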
bits.
> +
> +proc check_effective_target_vect1024 { } {
> +return [expr { [lsearch -exact [available_vector_sizes] 1024] >= 0 }]
> +}
> +
> # Return 1 if the target supports vectors of 512 bits.
>
> proc check_effective_target_vect512 { } {
>
--
Richard Biener
The following fixes fallout of r10-7145-g1dc00a8ec9aeba which made
us cautious about CSEing a load to an object that has padding bits.
The added check also triggers for BLKmode entities like STRING_CSTs
but by definition a BLKmode entity does not have padding bits.
Bootstrapped and tested on x86
The following fixes a mistake in count_nonzero_bytes which happily
skips over stores that clobber the memory we load the stored value
from, and then performs its analysis on the memory state from before
the intermediate store.
The patch implements the most simple fix - guarantee that there are
no intervening
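A minimal C shape of the hazard (an assumed illustration, not the actual testcase): the value being stored must be analyzed against the memory state that includes the intervening store.

```c
#include <string.h>

/* The store to buf[0] copies the value of s[0] *after* the
   intervening clobber, so any string-length analysis of buf must not
   look at the state of s from before that store.  */
char
load_after_clobber (void)
{
  char s[4];
  char buf[4];
  memcpy (s, "abc", 4);
  s[0] = 0;        /* intervening store clobbering the byte */
  buf[0] = s[0];   /* loads 0, not 'a' */
  return buf[0];
}
```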
nfo.conds.release ();
> + loop_form_info.alt_loop_conds.release ();
> +
>return first_loop_vinfo;
> }
>
> diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
> index
> afa7a8e30891c782a0e5e3740ecc4377f5a31e54..55b6771b271d5072fa1327d595e1dddb112cfdf6
> 10
On Tue, 10 Oct 2023, Jakub Jelinek wrote:
> On Tue, Oct 10, 2023 at 10:49:04AM +0000, Richard Biener wrote:
> > The following fixes a mistake in count_nonzero_bytes which happily
> > skips over stores clobbering the memory we load a value we store
> > from and then performs a
The following ups the limit in fold_view_convert_expr to handle
1024bit vectors as used by GCN and RVV. It also robustifies
the handling in visit_reference_op_load to properly give up when
constants cannot be re-interpreted.
Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
PR
edge main_iv = LOOP_VINFO_IV_EXIT (loop_vinfo);
> + slpeel_update_phi_nodes_for_guard2 (loop, epilog, main_iv, guard_e,
> + epilog_e);
> /* Only need to handle basic block before epilog loop if it's not
>the guard_bb, which is the case when skip_vector is true. */
> if (guard_bb != bb_before_epilog)
> @@ -3441,8 +3396,6 @@ vect_do_peeling (loop_vec_info loop_vinfo, tree niters,
> tree nitersm1,
> }
> scale_loop_profile (epilog, prob_epilog, -1);
> }
> - else
> - slpeel_update_phi_nodes_for_lcssa (epilog);
>
>unsigned HOST_WIDE_INT bound;
>if (bound_scalar.is_constant (&bound))
> diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
> index
> f1caa5f207d3b13da58c3a313b11d1ef98374349..327cab0f736da7f1bd3e024d666df46ef9208107
> 100644
> --- a/gcc/tree-vect-loop.cc
> +++ b/gcc/tree-vect-loop.cc
> @@ -5877,7 +5877,7 @@ vect_create_epilog_for_reduction (loop_vec_info
> loop_vinfo,
>basic_block exit_bb;
>tree scalar_dest;
>tree scalar_type;
> - gimple *new_phi = NULL, *phi;
> + gimple *new_phi = NULL, *phi = NULL;
>gimple_stmt_iterator exit_gsi;
>tree new_temp = NULL_TREE, new_name, new_scalar_dest;
>gimple *epilog_stmt = NULL;
> diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
> index
> 55b6771b271d5072fa1327d595e1dddb112cfdf6..25ceb6600673d71fd6012443403997e921066483
> 100644
> --- a/gcc/tree-vectorizer.h
> +++ b/gcc/tree-vectorizer.h
> @@ -2183,7 +2183,7 @@ extern bool slpeel_can_duplicate_loop_p (const class
> loop *, const_edge,
>const_edge);
> class loop *slpeel_tree_duplicate_loop_to_edge_cfg (class loop *, edge,
> class loop *, edge,
> - edge, edge *);
> + edge, edge *, bool = true);
> class loop *vect_loop_versioning (loop_vec_info, gimple *);
> extern class loop *vect_do_peeling (loop_vec_info, tree, tree,
> tree *, tree *, tree *, int, bool, bool,
>
>
>
>
>
--
Richard Biener
On Tue, 10 Oct 2023, Jakub Jelinek wrote:
> On Tue, Oct 10, 2023 at 11:59:28AM +0000, Richard Biener wrote:
> > > I don't see why the CONSTRUCTOR case couldn't be fine regardless of the
> > > vuse. Though, am not really sure when a CONSTRUCTOR would appear, th
On Tue, 10 Oct 2023, Jakub Jelinek wrote:
> Hi!
>
> > On Tue, Oct 10, 2023 at 09:30:31AM +0000, Richard Biener wrote:
> > On Mon, 9 Oct 2023, Jakub Jelinek wrote:
> > > > This makes wide_int unusable in GC structures, so for dwarf2out
> > > > which was the on
-dfa.cc b/gcc/tree-dfa.cc
> index af8e9243947..5355af2c869 100644
> --- a/gcc/tree-dfa.cc
> +++ b/gcc/tree-dfa.cc
> @@ -531,10 +531,7 @@ get_ref_base_and_extent (tree exp, poly_int64 *poffset,
>
> value_range vr;
> range_query *query;
> - if (cfun)
&
On Wed, Oct 11, 2023 at 2:46 AM Andrew Pinski wrote:
>
> While `a & (b ^ ~a)` is optimized to `a & b` on the rtl level,
> it is always good to optimize this at the gimple level and allows
> us to match a few extra things including where a is a comparison.
>
> Note I had to update/change the testca
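The identity can be verified exhaustively on single bits (a standalone sketch, not GCC code):

```c
/* Verify  a & (b ^ ~a)  ==  a & b  over all single-bit values.  */
int
check_and_xor_not (void)
{
  for (int a = 0; a <= 1; a++)
    for (int b = 0; b <= 1; b++)
      if ((a & (b ^ !a)) != (a & b))
        return 0;
  return 1;
}
```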
IV
> > > + controls the natural exits of the loop. */ edge
> > > + vec_epilogue_loop_iv;
> > > +
> > > + /* The controlling loop IV for the scalar loop being vectorized. This
> > > IV
> > > + controls the natural exits of the loop.
On Wed, 11 Oct 2023, Tamar Christina wrote:
> > -Original Message-
> > From: Richard Biener
> > Sent: Tuesday, October 10, 2023 12:14 PM
> > To: Tamar Christina
> > Cc: gcc-patches@gcc.gnu.org; nd ; j...@ventanamicro.com
> > Subject: Re: [PATCH 2/3]mi
gt; > > }
> > >
> > > -/* EPILOG loop is duplicated from the original loop for vectorizing,
> > > - the arg of its loop closed ssa PHI needs to be updated. */
> > > -
> > > -static void
> > > -slpeel_update_phi_nodes_for_lcssa (class l
On Wed, 11 Oct 2023, Juzhe-Zhong wrote:
> This patch fixes this following FAILs in RISC-V regression:
>
> FAIL: gcc.dg/vect/vect-gather-1.c -flto -ffat-lto-objects scan-tree-dump
> vect "Loop contains only SLP stmts"
> FAIL: gcc.dg/vect/vect-gather-1.c scan-tree-dump vect "Loop contains only SL
The following removes a misguided attempt to allow x + x in a reduction
path, which also allowed x * x, which isn't valid. x + x actually never
arrives this way but instead is canonicalized to 2 * x. This makes
reduction path handling consistent with how we handle the single-stmt
reduction case.
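A minimal sketch of the canonicalization mentioned above, with illustrative names: in a reduction body, `sum + sum` never survives to the vectorizer in that form because GIMPLE canonicalizes `t + t` to `2 * t`, so the reduction-path code need not special-case it.

```c
/* Illustrative only: a reduction whose body contains sum + sum.
   GIMPLE canonicalizes t + t to 2 * t, so the vectorizer's
   reduction-path handling never actually sees the x + x form.  */
int
doubling_reduction (const int *a, int n)
{
  int sum = 0;
  for (int i = 0; i < n; i++)
    {
      sum = sum + a[i];  /* ordinary reduction step */
      sum = sum + sum;   /* canonicalized to sum = 2 * sum */
    }
  return sum;
}
```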
Boots
The support to elide calls to allocation functions in DCE runs into
the issue that, when implementations are discovered to be noreturn, we
end up DCEing the calls anyway, leaving blocks without termination and
without outgoing edges, which is both invalid IL and wrong-code when
as in the example the noretur
scale, 0, condition) -> 5 arguments same
> as MASK_GATHER_LOAD.
> In this situation, MASK_LEN_GATHER_LOAD can reuse the MASK_GATHER_LOAD SLP
> flow naturally.
>
> Is it reasonable ?
What's wrong with handling MASK_LEN_GATHER_LOAD with all arguments
even when the mask is
e cases we put this code in for (we should be able to
materialize all constants?). At least uniform boolean constants
should be fine.
>
>
>
> juzhe.zh...@rivai.ai
>
> From: Richard Biener
> Date: 2023-10-12 17:44
> To: ???
> CC: gcc-patches; richard.sandiford
>
The following handles byte-aligned, power-of-two and byte-multiple
sized BIT_FIELD_REF reads in SRA. In particular this should cover
BIT_FIELD_REFs created by optimize_bit_field_compare.
For gcc.dg/tree-ssa/ssa-dse-26.c we now SRA the BIT_FIELD_REF
appearing there leading to more DSE, fully elidi
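A hypothetical source-level shape that produces such a read, with illustrative names: optimize_bit_field_compare can fuse adjacent bit-field comparisons into a single byte-aligned, power-of-two sized BIT_FIELD_REF, which SRA can now scalarize.

```c
/* Illustrative example: two adjacent 4-bit comparisons that
   optimize_bit_field_compare can fuse into one byte-aligned, 8-bit
   BIT_FIELD_REF read and compare, now handled by SRA.  */
struct flags
{
  unsigned a : 4;
  unsigned b : 4;
};

int
both_equal (struct flags x, struct flags y)
{
  /* The two 4-bit tests can become a single 8-bit load + compare.  */
  return x.a == y.a && x.b == y.b;
}
```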
nfo,
>tree scalar_dest = gimple_get_lhs (stmt_info->stmt);
>tree vec_dest = vect_create_destination_var (scalar_dest, vectype_out);
>
> + /* Get NCOPIES vector definitions for all operands except the reduction
> + definition. */
>vect_get_vec_defs (loop_vinfo, stmt_info, slp_node, ncopies,
>single_defuse_cycle && reduc_index == 0
>? NULL_TREE : op.ops[0], &vec_oprnds0,
>single_defuse_cycle && reduc_index == 1
>? NULL_TREE : op.ops[1], &vec_oprnds1,
> - op.num_ops == 3
> - && !(single_defuse_cycle && reduc_index == 2)
> + op.num_ops == 4
> + || (op.num_ops == 3
> + && !(single_defuse_cycle && reduc_index == 2))
>? op.ops[2] : NULL_TREE, &vec_oprnds2);
> +
> + /* For single def-use cycles get one copy of the vectorized reduction
> + definition. */
>if (single_defuse_cycle)
> {
>gcc_assert (!slp_node);
> @@ -8301,7 +8389,7 @@ vect_transform_reduction (loop_vec_info loop_vinfo,
> }
>else
> {
> - if (op.num_ops == 3)
> + if (op.num_ops >= 3)
> vop[2] = vec_oprnds2[i];
>
> if (masked_loop_p && mask_by_cond_expr)
> @@ -8314,10 +8402,16 @@ vect_transform_reduction (loop_vec_info loop_vinfo,
> if (emulated_mixed_dot_prod)
> new_stmt = vect_emulate_mixed_dot_prod (loop_vinfo, stmt_info, gsi,
> vec_dest, vop);
> - else if (code.is_internal_fn ())
> +
> + else if (code.is_internal_fn () && !cond_fn_p)
> new_stmt = gimple_build_call_internal (internal_fn (code),
> op.num_ops,
> vop[0], vop[1], vop[2]);
> + else if (code.is_internal_fn () && cond_fn_p)
> + new_stmt = gimple_build_call_internal (internal_fn (code),
> +op.num_ops,
> +vop[0], vop[1], vop[2],
> +vop[1]);
> else
> new_stmt = gimple_build_assign (vec_dest, tree_code (op.code),
> vop[0], vop[1], vop[2]);
> diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
> index f1d0cd79961..e22067400af 100644
> --- a/gcc/tree-vectorizer.h
> +++ b/gcc/tree-vectorizer.h
> @@ -2319,7 +2319,7 @@ extern tree vect_create_addr_base_for_vector_ref
> (vec_info *,
> tree);
>
> /* In tree-vect-loop.cc. */
> -extern tree neutral_op_for_reduction (tree, code_helper, tree);
> +extern tree neutral_op_for_reduction (tree, code_helper, tree, bool = true);
> extern widest_int vect_iv_limit_for_partial_vectors (loop_vec_info
> loop_vinfo);
> bool vect_rgroup_iv_might_wrap_p (loop_vec_info, rgroup_controls *);
> /* Used in tree-vect-loop-manip.cc */
>
((STMT_VINFO_TYPE (SLP_TREE_REPRESENTATIVE (node))
> > assert FAILed.
> == shift_vec_info_type)
> && j == 1);
> continue;
> }
>
> Could you help me with that?
>
>
> juzhe.zh...@rivai.ai
>
>
nue;
> }
>
> It seems that we handle vect_constant_def same as vect_external_def.
> So failed to SLP ?
Why? We _should_ see a SLP node for the all-true mask operand.
>
>
>
> juzhe.zh...@rivai.ai
>
> From: Richard Biener
> Date: 2023-10-12 17:55
is not vectorized.\n");
> return false;
> }
>
> If we allow vect_constant_def, we should adjust constant SLP mask ? in the
> caller "vectorizable_load" ?
>
> But I don't know how to adjust that.
>
>
>
> juzhe.zh...@rivai.ai
>
&g
On Thu, 12 Oct 2023, Richard Biener wrote:
> The following handles byte-aligned, power-of-two and byte-multiple
> sized BIT_FIELD_REF reads in SRA. In particular this should cover
> BIT_FIELD_REFs created by optimize_bit_field_compare.
>
> For gcc.dg/tree-ssa/ssa-dse-26.c
This adds support for SLP vectorization of OpenMP SIMD clone calls.
There's a complication when vectorizing calls involving virtual
operands, since for the first time these are not only leaves (loads
or stores). With SLP this runs into the issue that placement of
the vectorized stmts is not necess
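A minimal sketch of the kind of call this enables, with illustrative names: a function annotated with `declare simd` gets vector "simd clone" variants, and calls to it in a loop can now be vectorized via SLP.

```c
/* Illustrative OpenMP SIMD clone example: with -fopenmp-simd the
   compiler emits vector clones of scale, and the calls in the loop
   can now be SLP-vectorized.  Names are hypothetical.  */
#pragma omp declare simd
int
scale (int x)
{
  return x * 3;
}

void
apply (int *restrict out, const int *restrict in, int n)
{
  for (int i = 0; i < n; i++)
    out[i] = scale (in[i]);  /* call vectorized via the simd clone */
}
```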
arget { { ! aarch64*-*-* } && { !
> amdgcn*-*-* } } } } } */
> + Likewise for AMD GCN and RVV. */
> +/* { dg-final { scan-tree-dump "BB vectorization with gaps at the end of a
> load is not supported" "slp1" { target { { ! aarch64*-*-* } && { {
; -/* { dg-final { scan-tree-dump-not "from scalars" "slp2" { xfail amdgcn-*-*
> } } } */
> +/* { dg-final { scan-tree-dump-not "from scalars" "slp2" { xfail vect512 } }
> } */
>
The following teaches vectorizable_simd_clone_call to handle
integer mode masks. The tricky bit is to second-guess the
number of lanes represented by a single mask argument - the following
uses simdlen and the number of mask arguments to calculate that,
assuming ABIs have them uniform.
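The lane-count guess described above can be sketched as simple arithmetic, assuming (as the text says) that the ABI spreads lanes uniformly over the mask arguments; the function name is illustrative, not GCC's.

```c
/* Sketch of the lane-count guess: with integer-mode mask arguments,
   the number of lanes covered by a single mask argument is inferred
   from the clone's simdlen and the number of mask arguments,
   assuming the ABI distributes lanes uniformly across them.  */
unsigned
mask_lanes_per_arg (unsigned simdlen, unsigned num_mask_args)
{
  /* Each mask argument covers an equal share of the simdlen lanes.  */
  return simdlen / num_mask_args;
}
```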
Similar to
> Am 14.10.2023 um 10:21 schrieb Jakub Jelinek :
>
> Hi!
>
> As mentioned in the PR, my estimations on needed buffer size for wide_int
> and especially widest_int printing were incorrect, I've used get_len ()
> in the estimations, but that is true only for !wi::neg_p (x) values.
> Under the h
> Am 14.10.2023 um 11:50 schrieb Jakub Jelinek :
>
> Hi!
>
>> On Sat, Oct 14, 2023 at 10:41:28AM +0200, Richard Biener wrote:
>> Can we somehow abstract this common pattern?
>
> So like this? With room for the future tweaks like printing decimal
> in
On Mon, 16 Oct 2023, Tatsuyuki Ishi wrote:
> lld and mold are platform-agnostic and not prefixed with the target triple.
> Prepending the target triple makes it less likely to find the intended
> linker executable.
>
> A potential breaking change is that we no longer try to search for
> triple-prefix
On Mon, 16 Oct 2023, Tatsuyuki Ishi wrote:
>
>
> > On Oct 16, 2023, at 17:39, Richard Biener wrote:
> >
> > On Mon, 16 Oct 2023, Tatsuyuki Ishi wrote:
> >
> >> lld and mold are platform-agnostic and not prefixed with target triple.
> >> Prep
On Mon, Oct 16, 2023 at 12:00 AM Andrew Pinski wrote:
>
> This improves the `A CMP 0 ? A : -A` set of match patterns to use
> bitwise_equal_p which allows an nop cast between signed and unsigned.
> This allows catching a few extra cases which were not being caught before.
>
> OK? Bootstrapped and
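An illustrative source-level form of the `A CMP 0 ? A : -A` pattern being discussed, with a no-op signed/unsigned cast of the kind bitwise_equal_p now tolerates; the function is a sketch, not code from the patch.

```c
/* Illustrative: A CMP 0 ? A : -A with a nop signed/unsigned cast in
   between, now recognized as an absolute-value operation.  */
int
my_abs (int a)
{
  unsigned ua = (unsigned) a;          /* nop cast: same precision */
  return a < 0 ? -(int) ua : (int) ua; /* folds to ABS_EXPR <a> */
}
```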
On Mon, 16 Oct 2023, Tatsuyuki Ishi wrote:
>
>
> > On Oct 16, 2023, at 17:55, Richard Biener wrote:
> >
> > On Mon, 16 Oct 2023, Tatsuyuki Ishi wrote:
> >
> >>
> >>
> >>> On Oct 16, 2023, at 17:39, Richard Biener w
On Mon, Oct 16, 2023 at 2:02 AM Andrew Pinski wrote:
>
> In the case of a NOP conversion (precisions of the 2 types are equal),
> factoring out the conversion can be done even if int_fits_type_p returns
> false and even when the conversion is defined by a statement inside the
> conditional. Since
On Mon, Oct 16, 2023 at 4:34 AM Andrew Pinski wrote:
>
> Currently we are able to simplify `~a CMP ~b` to `b CMP a`, but we should
> allow a nop conversion in between the `~` and the `a`, which can show up.
> A similar thing should
> be done for `~a CMP CST`.
>
> I had originally submitted the `~a
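The identity behind the fold can be illustrated with a small sketch; the function name is hypothetical, and the fold equally applies when a nop conversion sits between the `~` and its operand.

```c
/* Illustrative: ~a CMP ~b folds to b CMP a.  For unsigned values
   ~a == UINT_MAX - a, so ~a < ~b holds exactly when b < a.  */
int
lt_after_not (unsigned a, unsigned b)
{
  return ~a < ~b;  /* equivalent to b < a */
}
```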
The following addresses build_reconstructed_reference failing to
build references with a different offset than the models and thus
the caller conditional being off. This manifests when attempting
to build a ref with offset 160 from the model BIT_FIELD_REF
onto the same base l_4827 but the models
The following addresses a missed DECL_NOT_GIMPLE_REG_P setting of
a volatile-declared parameter, which causes inlining to substitute
a constant parameter into a context where its address is required.
The main issue is in update_address_taken which clears
DECL_NOT_GIMPLE_REG_P from the parameter but
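A minimal sketch of the problematic shape, with hypothetical names: a volatile parameter whose address is taken, where inlining must not substitute the constant argument directly into the address-taken use.

```c
/* Illustrative: a volatile parameter whose address is required.
   When load_it is inlined with the constant 42, p must remain an
   addressable object rather than being replaced by the constant.  */
static int
load_it (volatile int p)
{
  volatile int *addr = &p;  /* address of the volatile parameter */
  return *addr;
}

int
caller (void)
{
  return load_it (42);
}
```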
On Thu, Oct 12, 2023 at 10:42 AM Ajit Agarwal wrote:
>
> This patch improves the code sinking pass to sink statements before a call to reduce
> register pressure.
> Review comments are incorporated. Synced and modified with latest trunk
> sources.
>
> For example :
>
> void bar();
> int j;
> void foo(i
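The quoted example is truncated above; as an illustrative reconstruction of the general shape (names hypothetical, not necessarily the patch's own example): a value computed before a call but only used after it is a sinking candidate, since moving the computation past the call means it need not live in a register across the call.

```c
/* Illustrative: x is defined before the call to bar but only used
   after it.  Sinking the addition past the call shortens x's live
   range and reduces register pressure across the call.  */
void
bar (void)
{
  /* Stand-in for a call that clobbers call-clobbered registers.  */
}

int j;

void
foo (int a, int b)
{
  int x = a + b;  /* candidate for sinking below the call */
  bar ();
  j = x;
}
```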
On Mon, Oct 16, 2023 at 9:27 PM Jeff Law wrote:
>
>
>
> On 10/15/23 03:49, Roger Sayle wrote:
> >
> > Hi Jeff,
> > Thanks for the speedy review(s).
> >
> >> From: Jeff Law
> >> Sent: 15 October 2023 00:03
> >> To: Roger Sayle ; gcc-patches@gcc.gnu.org
> >> Subject: Re: [PATCH] PR 91865: Avoid ZER
On Mon, Oct 16, 2023 at 11:59 PM Richard Sandiford
wrote:
>
> Robin Dapp writes:
> >> Why are the contents of this if statement wrong for COND_LEN?
> >> If the "else" value doesn't matter, then the masked form can use
> >> the "then" value for all elements. I would have expected the same
> >> th
On Tue, Oct 17, 2023 at 10:53 AM Ajit Agarwal wrote:
>
> Hello Richard:
>
> On 17/10/23 2:03 pm, Richard Biener wrote:
> > On Thu, Oct 12, 2023 at 10:42 AM Ajit Agarwal
> > wrote:
> >>
> >> This patch improves code sinking pass to sink statements
On Thu, Oct 12, 2023 at 10:15 AM Richard Sandiford
wrote:
>
> Richard Biener writes:
> > On Tue, Aug 22, 2023 at 12:42 PM Szabolcs Nagy via Gcc-patches
> > wrote:
> >>
> >> From: Richard Sandiford
> >>
> >> The prologue/epilogue pass al
On Sat, Oct 14, 2023 at 2:57 AM Andrew Pinski wrote:
>
> This adds the simplification `a & (x | CST)` to a when we know that
> `(a & ~CST) == 0`, in a similar fashion to how `a & CST` is handled.
>
> I looked into handling `a | (x & CST)` but that I don't see any decent
> simplifications happening.
>
>
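The identity behind the proposed fold can be sketched directly; the constant and names here are illustrative.

```c
/* Illustrative: when (a & ~CST) == 0, every set bit of a lies inside
   CST, so (x | CST) covers all of a's bits and a & (x | CST) == a.
   Here CST = 0xff and the caller guarantees (a & ~0xffu) == 0.  */
unsigned
fold_form (unsigned a, unsigned x)
{
  return a & (x | 0xffu);  /* simplifies to just a */
}
```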
; {
> --- gcc/tree-pretty-print.cc.jj 2023-09-21 20:02:53.467522151 +0200
> +++ gcc/tree-pretty-print.cc 2023-10-16 11:05:51.131997367 +0200
> @@ -2248,10 +2248,11 @@ dump_generic_node (pretty_printer *pp, t
> pp_minus (pp);
> val = -val;
> }
> - unsigned int prec = val.get_precision ();
> - if ((prec + 3) / 4 > sizeof (pp_buffer (pp)->digit_buffer) - 3)
> + unsigned int len;
> + print_hex_buf_size (val, &len);
> + if (UNLIKELY (len > sizeof (pp_buffer (pp)->digit_buffer)))
> {
> - char *buf = XALLOCAVEC (char, (prec + 3) / 4 + 3);
> + char *buf = XALLOCAVEC (char, len);
> print_hex (val, buf);
> pp_string (pp, buf);
> }
>
> Jakub
>
>
The following avoids bogusly re-using the simd-clone-info we
currently hang off stmt_info from two different SLP contexts where
a different number of lanes should have chosen a different best
simdclone.
Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
PR tree-optimization/111
On Tue, Oct 17, 2023 at 9:51 PM Jason Merrill wrote:
>
> Ping?
OK.
Thanks,
Richard.
> On 10/3/23 17:09, Jason Merrill wrote:
> > This revision changes from using DK_PEDWARN for permerror-with-option to
> > using
> > DK_PERMERROR.
> >
> > Tested x86_64-pc-linux-gnu. OK for trunk?
> >
> > -- 8<