Hi,
Thanks a lot for the review. Sorry for the very late reply.
The following are my comments on the feedback.
> The main thing that worries me is:
>
> #if _GLIBCXX_SIMD_HAVE_SVE
> constexpr inline int __sve_vectorized_size_bytes =
> __ARM_FEATURE_SVE_BITS / 8;
> #else
> constexpr inlin
Pushed to r14-6911.
On 2023/12/29 at 3:48 PM, chenxiaolong wrote:
In the LoongArch architecture, GCC supports the vectorization function tested
by vect/slp-26.c, but there is no detection of loongarch in dg-finals. Add
loongarch to the appropriate dg-finals.
gcc/testsuite/ChangeLog:
* gcc.dg/
Noticed a case that has "Maximum lmul = 16", which is incorrect.
Correct LMUL estimation for MASK_LEN_LOAD/MASK_LEN_STORE.
Committed.
gcc/ChangeLog:
* config/riscv/riscv-vector-costs.cc (variable_vectorized_p): New
function.
(compute_nregs_for_mode): Refine LMUL.
(max_number_of
Pushed to r14-6909.
On 2023/12/29 at 9:45 AM, chenxiaolong wrote:
After implementing the cost model on the LoongArch architecture, the GCC
compiler code has this feature turned on by default, which causes the
lasx-xvstelm.c file test to fail. Through analysis, this test case can
generate vectorization i
Pushed to r14-6908.
On 2023/12/28 at 8:26 PM, Li Wei wrote:
There are currently two versions of the implementations of constant
vector permutation: loongarch_expand_vec_perm_const_1 and
loongarch_expand_vec_perm_const_2. The implementations of the two
versions are different. Currently, only the impleme
Here is an updated version.
libstdc++: [_Hashtable] Avoid redundant usage of rehash policy
Bypass call to __detail::__distance_fwd and the check if rehash is
needed when instantiating from an iterator range or assigning an
initializer_list to an unordered_multimap or unordered
On Wed, 3 Jan 2024, Jacob Bachmeyer wrote:
> Comments before I start on an implementation?
I'd suggest to await the conclusion of the debate: I *think*
I've proved that dg-timeout-factor is already active as intended
(all parts of a test), specifically when the compilation result
is executed (f
On Wed, 3 Jan 2024, Maciej W. Rozycki wrote:
> On Wed, 3 Jan 2024, Hans-Peter Nilsson wrote:
>
> > > The test execution timeout is different from the tool execution timeout
> > > where it is GCC execution that is being guarded against taking excessive
> > > amount of time on the test host r
gcc/ChangeLog
* omp-general.cc: Fix comment typos and misplaced/confusing
comments. Delete redundant include of omp-general.h.
---
gcc/omp-general.cc | 21 +
1 file changed, 9 insertions(+), 12 deletions(-)
diff --git a/gcc/omp-general.cc b/gcc/omp-general.cc
On Thu, 2024-01-04 at 11:58 +0800, chenglulu wrote:
>
> On 2024/1/4 at 11:51 AM, Xi Ruoyao wrote:
> > On Wed, 2023-12-27 at 16:46 +0800, Lulu Cheng wrote:
> > > +(define_insn "movdi_pcrel64"
> > > + [(set (match_operand:DI 0 "register_operand" "=&r")
> > > + (match_operand:DI 1 "symbolic_pcrel64_ope
On 2024/1/4 at 11:51 AM, Xi Ruoyao wrote:
On Wed, 2023-12-27 at 16:46 +0800, Lulu Cheng wrote:
+(define_insn "movdi_pcrel64"
+ [(set (match_operand:DI 0 "register_operand" "=&r")
+ (match_operand:DI 1 "symbolic_pcrel64_operand"))
+ (unspec:DI [(const_int 0)]
+ UNSPEC_MOV_PCREL64)
+ (use (re
On Wed, 2023-12-27 at 16:46 +0800, Lulu Cheng wrote:
> +(define_insn "movdi_pcrel64"
> + [(set (match_operand:DI 0 "register_operand" "=&r")
> + (match_operand:DI 1 "symbolic_pcrel64_operand"))
> + (unspec:DI [(const_int 0)]
> + UNSPEC_MOV_PCREL64)
> + (use (reg:DI T3_REGNUM))
> + (clob
Maciej W. Rozycki wrote:
On Wed, 3 Jan 2024, Hans-Peter Nilsson wrote:
The test execution timeout is different from the tool execution timeout
where it is GCC execution that is being guarded against taking excessive
amount of time on the test host rather than the resulting test case
execu
The [x]vld/[x]vst directive is defined as follows:
[x]vld/[x]vst {x/v}d, rj, si12
When not modified, the immediate field of [x]vld/[x]vst is between 10 and
14 bits depending on the type. However, in loongarch_valid_offset_p, the
immediate field is restricted first, so there is no error. However,
This patch only involves the generation of xtheadvector
special load/store instructions and vext instructions.
gcc/ChangeLog:
* config/riscv/riscv-vector-builtins-bases.cc
(class th_loadstore_width): Define new builtin bases.
(BASE): Define new builtin bases.
* con
This patch is to handle the differences in instruction generation
between Vector and XTheadVector. In this version, we only support the
subset of xtheadvector instructions that can be reused directly from the
current RVV 1.0 by simply adding the "th." prefix. For xtheadvector
instructions that have a different name but share sa
On 2024/1/3 19:12, Tobias Burnus wrote:
On 22.12.23 03:36, Lipeng Zhu wrote:
This patch tries to fix the bug when HAVE_ATOMIC_FETCH_ADD is
not defined in the dec_waiting_unlocked function.
libgfortran/ChangeLog:
* io/io.h (dec_waiting_unlocked): Use
__gthread_rwlock_wrlock/__gthread_r
> From: Patrick Palka
> Date: Tue, 2 Jan 2024 12:48:26 -0500
> Tested on x86_64-pc-linux-gnu, does this look OK for trunk and release
> branches (r14-205 was backported everywhere)?
>
> -- >8 --
>
> The adjustment to max_size_type.cc in r14-205-g83470a5cd4c3d2
> inadvertently increased the exe
gcc/testsuite
* gcc.c-torture/compile/mipscop-1.c: Include stdio.h.
* gcc.c-torture/compile/mipscop-2.c: Ditto.
* gcc.c-torture/compile/mipscop-3.c: Ditto.
* gcc.c-torture/compile/mipscop-4.c: Ditto.
---
gcc/testsuite/gcc.c-torture/compile/mipscop-1.c | 1 +
gcc/te
This match pattern allows combining (zero_extract:DI 8, 24, QI)
with a sign-extend into a 32-bit INS instruction on TARGET_64BIT.
For SI mode, if the sign bit is modified by bitops, we will need a
sign-extend operation. Since the 32-bit INS instruction guarantees that
the result is sign-extended, and the Q
When combining some instructions, the generic `rtx_cost`
may overestimate the cost of the resulting RTL, because
the RTL may be quite complex and `rtx_cost` has no
information that this RTL can be converted to simple
hardware instruction(s).
In this case, let's use `insn_count * perf_ratio` to
estimate
The accurate cost of a pattern can be obtained with
insn_count * perf_ratio
The default value is set to 0 instead of 1, since
we need to distinguish the default value from a value that is
really set for a pattern. Since it is not set for most
patterns yet, to use it, we will need to be sure tha
YunQiang Su writes:
> On platforms where TRULY_NOOP_TRUNCATION_MODES_P (DImode, SImode) is true,
> if bit 31 or above is polluted by a bitop, we will need a
> truncate. Let's emit one and use the same hardreg
> as input and output; the RTL may look like:
>
> (insn 21 20 24 2 (set (subreg/s/u:SI (r
Hello-
May I please ping this one? Thanks...
https://gcc.gnu.org/pipermail/gcc-patches/2023-December/640386.html
-Lewis
On Tue, Dec 12, 2023 at 6:18 PM Lewis Hyatt wrote:
>
> Hello-
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110558
>
> This is a small fix for the libcpp issue noted in th
"Maciej W. Rozycki" writes:
> On Wed, 3 Jan 2024, Hans-Peter Nilsson wrote:
>
>> > The test execution timeout is different from the tool execution timeout
>> > where it is GCC execution that is being guarded against taking excessive
>> > amount of time on the test host rather than the resulting
Fix the indentation of some code so that it aligns to 8 spaces.
Committed.
gcc/ChangeLog:
* config/riscv/vector.md: Fix indent.
---
gcc/config/riscv/vector.md | 10 +-
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
in
As described in PR113206 and PR113209, the bug happens in the following situation:
li a4,32
...
vsetvli zero,a4,e8,m8,ta,ma
...
slliw a4,a3,24
sraiw a4,a4,24
bge a3,a1,.L8
sb a4,%lo(e)(a0)
vsetvli zero,a4,e8,m8,ta,ma --
On 1/3/24 11:31, Tobias Burnus wrote:
Another small step in my side project of documenting all OpenMP routines in
libgomp.texi
Here, only 'omp_display_env' is added. (New since OpenMP 5.1 but in GCC for a
long time; some fine print in both the implementation and in the documentation
is based on
On 12/19/23 15:17, Jason Merrill wrote:
Tested x86_64-pc-linux-gnu, OK for trunk?
-- 8< --
-Werror=foo implying -Wfoo wasn't working for -Wdeprecated-copy-dtor,
because it is specified as the value 2 of warn_deprecated_copy, which shows
up as CLVC_EQUAL, which is not one of the three var_typ
Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look
OK for trunk?
-- >8 --
Since partial template specializations can't be named directly, access
control (when declared at class scope) doesn't apply to them, so we
shouldn't have to set their TREE_PRIVATE / TREE_PROTECTED. This code
Dear all,
I've committed the attached, simple & obvious patch for a
gmp memory leak in gfc_get_nodesc_array_type that shows
up when running f951 under valgrind e.g. on testcase
gfortran.dg/class_optional_2.f90, after regtesting on
x86_64-pc-linux-gnu.
(Note that this does not address the underlyi
Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK
for trunk and perhaps 13?
-- >8 --
Here we neglect to emit the definitions of A::f2 and A::f4
despite the explicit instantiations ultimately because TREE_PUBLIC isn't
set on the corresponding partial specializations, the declara
Another small step in my side project of documenting all OpenMP routines in
libgomp.texi
Here, only 'omp_display_env' is added. (New since OpenMP 5.1 but in GCC for a
long time; some fine print in both the implementation and in the documentation
is based on TR11.)
* * *
RFC - regarding print
On 21/12/2023 23:07, Jonathan Wakely wrote:
On Thu, 23 Nov 2023 at 21:59, François Dumont wrote:
libstdc++: [_Hashtable] Fix some implementation inconsistencies
Get rid of the different usages of the mutable keyword. For
_Prime_rehash_policy methods are exported from the lib
Richard Sandiford writes:
> Jeff Law writes:
>> [...]
>> + if (GET_CODE (x) == ZERO_EXTRACT)
>> +{
>> + /* If either the size or the start position is unknown,
>> + then assume we know nothing about what is overwritten.
>> + This is overly conservativ
On Wed, 3 Jan 2024, Hans-Peter Nilsson wrote:
> > The test execution timeout is different from the tool execution timeout
> > where it is GCC execution that is being guarded against taking excessive
> > amount of time on the test host rather than the resulting test case
> > executable run on t
On 09/11/2023 12:24 pm, Thomas Schwinge wrote:
--- a/gcc/tree-core.h
+++ b/gcc/tree-core.h
@@ -350,6 +350,9 @@ enum omp_clause_code {
/* OpenMP clause: doacross ({source,sink}:vec). */
OMP_CLAUSE_DOACROSS,
+ /* OpenMP clause: indirect [(constant-integer-expression)]. */
+ OMP_CLAUSE_
On Wed, Jan 03, 2024 at 11:42:58PM +0800, xndcn wrote:
> Hi, I am new to this, and I really need your advice, thanks.
>
> I noticed PR71716 and I want to enable the ATOMIC_COMPARE_EXCHANGE
> internal-fn optimization
> for floating-point types or types that contain padding (e.g., long double).
> Please correct
Hi, I am new to this, and I really need your advice, thanks.
I noticed PR71716 and I want to enable the ATOMIC_COMPARE_EXCHANGE
internal-fn optimization for floating-point types or types that contain
padding (e.g., long double). Please correct me if I happen to make any
mistakes. Thanks!
Firstly, about the con
On Mon, Nov 20, 2023 at 11:22:56 -0500, Ben Boeckel wrote:
> ---
> htdocs/gcc-14/changes.html | 11 +++
> 1 file changed, 11 insertions(+)
>
> diff --git a/htdocs/gcc-14/changes.html b/htdocs/gcc-14/changes.html
> index 7278f753..b506eeb1 100644
> --- a/htdocs/gcc-14/changes.html
> +++ b/
LGTM, with only a few comment suggestions.
On Wed, Jan 3, 2024 at 18:50, Juzhe-Zhong wrote:
> As described in PR113206 and PR113209, the bug happens in the following situation:
>
> li a4,32
> ...
> vsetvli zero,a4,e8,m8,ta,ma
> ...
> slliw a4,a3,24
> sraiw a4,a4,24
>
Hello
I have committed the following trivial patch to emit FUNC_MAP or
IND_FUNC_MAP in separate branches of an if statement.
Kwok
On 09/11/2023 12:24 pm, Thomas Schwinge wrote:
Similar to how you have it here:
--- a/gcc/config/nvptx/mkoffload.cc
+++ b/gcc/config/nvptx/mkoffload.cc
@@ -51,6
Ping David. :)
On Mon, Dec 18, 2023 at 23:27, Guillaume Gomez
wrote:
>
> Ping David. :)
>
> On Sat, Dec 9, 2023 at 12:12, Guillaume Gomez
> wrote:
> >
> > Added it.
> >
> > On Thu, Dec 7, 2023 at 18:13, Antoni Boucher wrote:
> > >
> > > It seems like you forgot to prefix the commit messag
Linaro CI tells me that this patch caused regressions on ARM. I don't
have an ARM machine available to test on, but it appears to have been
caused by attempting to stream vtables as static data members, and ARM
having different behaviour with regards to when DECL_INTERFACE_KNOWN is
marked on vtable
Jeff Law writes:
> I know we're deep into stage3 and about to transition to stage4. So if
> the consensus is for this to wait, I'll understand
>
> This is the V3 of the ext-dce patch based on Joern's work from last year.
>
> Changes since V2:
>Handle MINUS
>Minor logic cleanup for SU
Hi!
update-copyright.py --this-year FAILs on two spots in the modula2
directories.
One is gpl_v3_without_node.texi, I think that is similar to other
license files which we already exclude from updates.
And the other is GmcOptions.cc, which has lines like
mcPrintf_printf0 ((const char *) "Copyrig
On 22.12.23 03:36, Lipeng Zhu wrote:
This patch tries to fix the bug when HAVE_ATOMIC_FETCH_ADD is
not defined in the dec_waiting_unlocked function.
libgfortran/ChangeLog:
* io/io.h (dec_waiting_unlocked): Use
__gthread_rwlock_wrlock/__gthread_rwlock_unlock or
__gthread_mutex_lock/_
While working on PR113209, I noticed it is the same issue, so this patch
fixes not only PR113206 but also PR113209.
Sent V2, adding a PR113209 test and PR target/113209:
https://gcc.gnu.org/pipermail/gcc-patches/2024-January/641740.html
juzhe.zh...@rivai.ai
From: Juzhe-Zhong
Date:
As described in PR113206 and PR113209, the bug happens in the following situation:
li a4,32
...
vsetvli zero,a4,e8,m8,ta,ma
...
slliw a4,a3,24
sraiw a4,a4,24
bge a3,a1,.L8
sb a4,%lo(e)(a0)
vsetvli zero,a4,e8,m8,ta,ma --
On Wed, 2024-01-03 at 16:24 +0800, chenglulu wrote:
> LGTM!
>
> Thanks!
Pushed r14-6890.
FWIW, sometimes the tree optimizer still fails to emit .reduc_f{max,min} or
it emits them sub-optimally. I've commented in PR112457 but maybe I
should've created a new ticket...
> On 2024/1/1 at 3:15 AM, Xi Ruoyao wrote
As described in PR113206, the bug happens in the following situation:
li a4,32
...
vsetvli zero,a4,e8,m8,ta,ma
...
slliw a4,a3,24
sraiw a4,a4,24
bge a3,a1,.L8
sb a4,%lo(e)(a0)
vsetvli zero,a4,e8,m8,ta,ma --> a4 is pollu
Bootstrapped & regtested on x86_64-pc-linux-gnu, OK for trunk?
-- >8 --
This patch stops 'add_binding_entity' from ignoring all names in the
global module fragment, since they should still be exported if named
in an exported using-declaration.
PR c++/109679
gcc/cp/ChangeLog:
*
Sergey Bugaev writes:
> Since it's not i386-specific; this makes it possible to reuse it for other
> architectures.
>
> Also, add a warning for the case gnu.h is specified before gnu-user.h, which
> would cause gnu-user's version of the spec to override gnu's, and not the
> other way around as
On 2023/12/21 19:42, Thomas Schwinge wrote:
Hi!
On 2023-12-13T21:52:29+0100, I wrote:
On 2023-12-12T02:05:26+, "Zhu, Lipeng" wrote:
On 2023/12/12 1:45, H.J. Lu wrote:
On Sat, Dec 9, 2023 at 7:25 PM Zhu, Lipeng wrote:
On 2023/12/9 23:23, Jakub Jelinek wrote:
On Sat, Dec 09, 2023 at
LGTM!
Thanks!
On 2024/1/1 at 3:15 AM, Xi Ruoyao wrote:
We already had smin/smax RTL pattern using vfmin/vfmax instructions.
But for smin/smax, it's unspecified what will happen if either operand
contains any NaN operands. So we would not vectorize the loop with
-fno-finite-math-only (the default for a