Hi Andrew!
On 2024-08-08T13:50:17+, Andrew Stubbs wrote:
> Previously, trampolines worked on GCN3 devices, but the newer GCN5
> devices had different permissions on the stack memory space we were
> using.
>
> That changed when we added the reverse-offload features because we
> switched from u
Hi all,
The constraint Dm is intended to match vectors of minus 1, but actually
checks for CONST1_RTX. This doesn't have a bad effect in practice as its
only use is in the aarch64_wrffr pattern for the setffr instruction which
is a VNx16BI operation and -1 and 1 are the same there. That pattern
can o
I'm not quite that sure about the general applicability of these, as
these depend somewhat on code size. Although there might be something
we can prove about a minimum frame size for one test or the other at -O0.
I also tried to add
/* { dg-skip-if "memory tight" { !size20plus } { "-O3" } } */
This fixes problems with tests that exceed a data type or the maximum
stack frame size on 16 bit targets. Note: GCC has a limitation that
a stack frame cannot exceed half the address space.
For two tests the decision to modify or skip them seems not so clear-cut;
I choose to modify gcc.dg/pr4789
From: Pan Li
For QI/HImode of .SAT_ADD, the operands may be sign-extended and the
high bits of Xmode may be all 1 which is not expected. For example as
below code.
signed char b[1];
unsigned short c;
signed char *d = b;
int main() {
b[0] = -40;
c = ({ (unsigned short)d[0] < 0xFFF6 ? (unsig
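As a rough illustration of the issue described above (not the code from the patch), the sketch below shows an unsigned 16-bit saturating add and the value a negative signed char takes once it is converted to unsigned short. The helper names sat_add_u16 and widen_schar are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: branch form of an unsigned 16-bit saturating add.
   The sum wraps modulo 2^16; if it wrapped, it is smaller than x.  */
uint16_t sat_add_u16(uint16_t x, uint16_t y)
{
    uint16_t sum = (uint16_t)(x + y);
    return sum < x ? UINT16_MAX : sum;
}

/* Hypothetical helper: converting a negative signed char to a 16-bit
   unsigned value goes through int, so the high byte becomes all ones.  */
uint16_t widen_schar(signed char v)
{
    return (uint16_t)v;
}
```

With these definitions, widen_schar(-40) is 0xFFD8, which is why an operand that was sign-extended rather than zero-extended into a wider register can carry unexpected high bits.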
Hi Richard S,
Please feel free to let me know if there are any further comments in v2. Thanks
a lot.
Pan
-----Original Message-----
From: Li, Pan2
Sent: Thursday, August 1, 2024 8:11 PM
To: Richard Biener
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@gmail.com;
tamar.christ..
When we emit .p2align to align BB_HEAD, we must update BB_HEAD. Otherwise
ENDBR will be inserted at the wrong place.
gcc/
PR target/116174
* config/i386/i386.cc (ix86_align_loops): Update BB_HEAD when
aligning BB_HEAD
gcc/testsuite/
PR target/116174
* gc
Just simulating a 32 bit CPU, these tests take about a minute, and
simulating a 16 bit target with size-optimized multi-word divide /
modulus functions, it takes way too long, so I made the tests not
run on targets without int32plus unless run_expensive_test is true.
Even for a simulated 32 bit
On Fri, Aug 09, 2024 at 11:03:24AM +1000, Nathaniel Shead wrote:
> On Thu, Aug 08, 2024 at 03:16:24PM -0400, Marek Polacek wrote:
> > On Thu, Aug 08, 2024 at 09:13:05AM +1000, Nathaniel Shead wrote:
> > > diff --git a/gcc/cp/error.cc b/gcc/cp/error.cc
> > > index 6c22ff55b46..03c19e4a7e4 100644
> >
On Thu, Aug 08, 2024 at 03:16:24PM -0400, Marek Polacek wrote:
> On Thu, Aug 08, 2024 at 09:13:05AM +1000, Nathaniel Shead wrote:
> > diff --git a/gcc/cp/error.cc b/gcc/cp/error.cc
> > index 6c22ff55b46..03c19e4a7e4 100644
> > --- a/gcc/cp/error.cc
> > +++ b/gcc/cp/error.cc
> > @@ -4782,12 +4782,14
I also noticed this test always timing out on a simulator.
Reduce iteration count for
23_containers/unordered_set/hash_policy/load_factor.cc for !run_expensive_tests
2024-07-19 Joern Rennecke
libstdc++-v3/
* testsuite/23_containers/unordered_set/hash_policy/load_factor.cc:
[!r
Thanks for the comments.
> On 2 Aug 2024, at 8:36 pm, Richard Biener wrote:
>
> External email: Use caution opening links or attachments
>
>
> On Fri, Aug 2, 2024 at 11:20 AM Kugan Vivekanandarajah
> wrote:
>>
>>
>>
>>> On 1 Aug 2024, at 10:46 pm, Richard Biener
>>> wrote:
>>>
>>> Exter
In the previous patch to reduce iteration counts, I have overlooked
that, in the inner loop of s176, the array index i+m-j-1
turns negative for higher iterations of the middle loop for small m.
m and the iteration end of the middle loop should stay the same.
Fix PR testsuite/116271, gcc.dg/vect/
On Thu, 2024-08-08 at 22:29 +0200, Arsen Arsenović wrote:
> Tested on x86_64-pc-linux-gnu. I have blinking tsan test results
> again,
> but I think they're bogus (I'll re-test on physical hardware before
> pushing if needed).
>
> I'm somewhat curious if we should do a similar change WRT
> RETURN_
On 8/8/24 2:38 PM, Patrick Palka wrote:
Bootstrap and regtest in progress, does this look OK for the 13 branch
if successful?
OK.
-- >8 --
This is essentially a narrow backport of r14-6724-gfced59166f95e9
that uses cp_evaluated instead of maybe_push_to_top_level to clear
cp_unevaluated_opera
On 8/8/24 4:29 PM, Arsen Arsenović wrote:
Tested on x86_64-pc-linux-gnu. I have blinking tsan test results again,
but I think they're bogus (I'll re-test on physical hardware before
pushing if needed).
I'm somewhat curious if we should do a similar change WRT RETURN_EXPRs
in the C FE (currently
On Thu, Aug 08, 2024 at 10:01:14PM +0200, Alejandro Colomar wrote:
> ./libstdc++-v3/testsuite/21_strings/basic_string/cons/char/constexpr.cc:62:
> const auto len = (sizeof(cs) - 1)/sizeof(C);
> ./libstdc++-v3/testsuite/21_strings/basic_string/cons/wchar_t/constexpr.cc:62:
> const au
On Thu, Aug 08, 2024 at 10:01:14PM +0200, Alejandro Colomar wrote:
> Hi Joseph,
>
> On Thu, Aug 08, 2024 at 05:31:05PM GMT, Joseph Myers wrote:
> > Actual bugs should of course be fixed.
>
> Here are the suspects:
>
> ./gcc/testsuite/gcc.target/powerpc/sse3-addsubps.c:80:
> for (i = 0; i
Hi!
While running `make check -j24`, I've seen an internal compiler error.
I've tried to reproduce it, but it only triggered once that time.
This is the only log I've been able to collect. I hope it helps.
lto1: internal compiler error: in lto_read_decls, at
lto/lto-common.cc:1970
On Fri, Aug 2, 2024 at 7:30 AM Jeff Law wrote:
>
>
>
> On 8/1/24 4:12 AM, Surya Kumari Jangala wrote:
> > lra: emit caller-save register spills before call insn [PR116028]
> >
> > LRA emits insns to save caller-save registers in the
> > inheritance/splitting pass. In this pass, LRA builds EBBs (Ex
On Thu, Aug 08, 2024 at 08:36:36PM GMT, Joseph Myers wrote:
> On Thu, 8 Aug 2024, Alejandro Colomar wrote:
>
> > Here are the suspects:
> >
> > ./gcc/testsuite/gcc.target/powerpc/sse3-addsubps.c:80:
> > for (i = 0; i < sizeof (vals) / sizeof (vals); i += 8)
>
> The key question for testcas
Tested on x86_64-pc-linux-gnu. I have blinking tsan test results again,
but I think they're bogus (I'll re-test on physical hardware before
pushing if needed).
I'm somewhat curious if we should do a similar change WRT RETURN_EXPRs
in the C FE (currently, the C FE uses the operand location for its
On Thu, 8 Aug 2024, Alejandro Colomar wrote:
> Here are the suspects:
>
> ./gcc/testsuite/gcc.target/powerpc/sse3-addsubps.c:80:
> for (i = 0; i < sizeof (vals) / sizeof (vals); i += 8)
The key question for testcases is *does the test actually test what was
intended*? We never want to
Hi Joseph,
On Thu, Aug 08, 2024 at 05:31:05PM GMT, Joseph Myers wrote:
> Actual bugs should of course be fixed.
Here are the suspects:
./gcc/testsuite/gcc.target/powerpc/sse3-addsubps.c:80:
for (i = 0; i < sizeof (vals) / sizeof (vals); i += 8)
./gcc/c-family/c-pragma.cc:1811:
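The first suspect above divides sizeof (vals) by itself, which always yields 1 regardless of the array length. A minimal sketch of the difference, using a made-up array and the macro names NELEMS_BUGGY and NELEMS, neither of which comes from the GCC tree:

```c
#include <stddef.h>

/* Illustrative macros only; the names are invented for this sketch.  */
#define NELEMS_BUGGY(a) (sizeof (a) / sizeof (a))      /* always 1 */
#define NELEMS(a)       (sizeof (a) / sizeof ((a)[0])) /* element count */

/* Evaluates the buggy form on an 80-element array.  */
size_t buggy_count(void)
{
    float vals[80];
    return NELEMS_BUGGY(vals);
}

/* Evaluates the usual element-count idiom on the same array.  */
size_t element_count(void)
{
    float vals[80];
    return NELEMS(vals);
}
```

A loop bounded by the buggy form with a step of 8 would run only once, so whether a test still exercises what it was meant to is worth checking case by case.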
Hi Prathamesh!
On 2024-08-08T06:46:25-0700, Andrew Pinski wrote:
> On Thu, Aug 8, 2024 at 6:11 AM Prathamesh Kulkarni
> wrote:
>> After differing NUM_POLY_INT_COEFFS fix for AArch64/nvptx offloading, the
>> following minimal test:
First, thanks for your work on enabling this! I will say that
On Thu, Aug 08, 2024 at 09:13:05AM +1000, Nathaniel Shead wrote:
> diff --git a/gcc/cp/error.cc b/gcc/cp/error.cc
> index 6c22ff55b46..03c19e4a7e4 100644
> --- a/gcc/cp/error.cc
> +++ b/gcc/cp/error.cc
> @@ -4782,12 +4782,14 @@ qualified_name_lookup_error (tree scope, tree name,
> s
Bootstrap and regtest in progress, does this look OK for the 13 branch
if successful?
-- >8 --
This is essentially a narrow backport of r14-6724-gfced59166f95e9
that uses cp_evaluated instead of maybe_push_to_top_level to clear
cp_unevaluated_operand within synthesize_method, which turns out is
s
On 08.08.24 at 19:13, Thomas Koenig wrote:
On 08.08.24 at 11:09, Mikael Morin wrote:
As we are not planning to remove the library implementation (-Os!),
this is also the best way to compare library to inline code.
This makes perfect sense, but why reuse the -ffrontend-optimize option?
The m
Hi Martin,
On Thu, Aug 08, 2024 at 08:16:50PM GMT, Martin Uecker wrote:
> > It will serve me as a huge test suite anyway; so it's worth it even if
> > just for myself. And it will uncover bugs. :)
>
> Did you implement a C++ version? Or are you talking about the C parts
> of the code.
I'll sta
The following intrinsics are not implemented. Thus, remove them.
Ok for mainline?
gcc/ChangeLog:
* config/s390/vecintrin.h (vec_vstbrh): Remove.
(vec_vstbrf): Remove.
(vec_vstbrg): Remove.
(vec_vstbrq): Remove.
(vec_vstbrf_flt): Remove.
(vec_vstbr
Starting with r14-9449-g9f2b16ce1efef0 builtins were streamlined with
those in LLVM. In particular s390_vgfm{,a}g have been changed from
UV16QI to UINT128 in order to match those in LLVM. However, these
low-level builtins are directly used by the high-level builtins
vec_gfmsum{,_accum}_128 which
On 8 August 2024 19:21:23 CEST, David Brown wrote:
>
>
> On 08/08/2024 11:13, Jens Gustedt wrote:
> > Hi
> >
> > On 8 August 2024 10:26:14 CEST, Alejandro Colomar wrote:
> >> Hello Jens,
> >>
> >> On Thu, Aug 08, 2024 at 07:35:12AM GMT, Jₑₙₛ Gustedt wrote:
> >>> Hello Alejandro,
> >>>
>
On Thursday, 08.08.2024 at 20:04 +0200, Alejandro Colomar wrote:
>
...
> >
> > *If* the feature were adopted into C++26, we could then consider if
> > existing macros should be renamed to look more like the future language
> > feature.
> >
> > Target code is at least always compiled wi
Hi Joseph,
On Thu, Aug 08, 2024 at 05:31:05PM GMT, Joseph Myers wrote:
> On Thu, 8 Aug 2024, Alejandro Colomar wrote:
>
> > How about having __lengthof__ behave like sizeof, but deprecate it in
> > sizeof too?
>
> Deprecation would be a matter for WG14.
Yep; I wouldn't add it to -Wall unless WG
> On Wed, 7 Aug 2024, Richard Biener wrote:
>
> > OK with that change.
> >
> > Did you think about a AVX512 version (possibly with 32 byte vectors)?
> > In case there's a more efficient variant of pshufb/pmovmskb available
> > there - possibly
> > the load on the branch unit could be lessened w
Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
-- >8 --
The problem in this PR is that we ended up with
{.rows=(&)->n,
.outer_stride=(&)->rows}
that is, two PLACEHOLDER_EXPRs for different types on the same level
in one { }. That should not happen; we may, for instance, negle
Andrew Pinski writes:
> For bar1 and bar2, we currently expect to use the bsl instruction but
> with slightly different register allocation inside the loop (which happens
> after
> the removal of the vcond{,u,eq} patterns), we get the bit instruction. The
> pattern that
> outputs bsl inst
On Thu, 8 Aug 2024, Alejandro Colomar wrote:
> How about having __lengthof__ behave like sizeof, but deprecate it in
> sizeof too?
Deprecation would be a matter for WG14.
> We could start by adding a -Wall warning for sizeof without parens, and
> promote it to an error a few versions later.
Thi
On 08/08/2024 11:13, Jens Gustedt wrote:
Hi
On 8 August 2024 10:26:14 CEST, Alejandro Colomar wrote:
Hello Jens,
On Thu, Aug 08, 2024 at 07:35:12AM GMT, Jₑₙₛ Gustedt wrote:
Hello Alejandro,
On Thu, 8 Aug 2024 00:44:02 +0200, Alejandro Colomar wrote:
+Its syntax is similar to @code{si
On 08.08.24 at 11:09, Mikael Morin wrote:
As we are not planning to remove the library implementation (-Os!),
this is also the best way to compare library to inline code.
This makes perfect sense, but why reuse the -ffrontend-optimize option?
The manual describes it as:
This option performs f
Improve handling of constants where the high half can be constructed
by shifting the low half.
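One possible reading of "the high half can be constructed by shifting the low half" is sketched below. This is an assumption for illustration, not the logic in riscv_build_integer, which may accept other shift directions or extra conditions; the function name high_half_shift is hypothetical.

```c
#include <stdint.h>

/* Hypothetical check: return a shift amount s in 1..31 such that the
   upper 32 bits of value equal the lower 32 bits shifted left by s
   (truncated to 32 bits), or -1 if no such s exists.  */
int high_half_shift(uint64_t value)
{
    uint32_t lo = (uint32_t)value;
    uint32_t hi = (uint32_t)(value >> 32);

    for (int s = 1; s < 32; s++)
        if ((uint32_t)(lo << s) == hi)
            return s;
    return -1;
}
```

When such a shift exists, a constant could in principle be built from its low half plus a shift and a combine step instead of materialising both halves separately.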
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_build_integer): Detect constants
where the higher half is a shift of the lower half.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/synt
From: Raphael Zinsly
Improve handling of constants where its upper and lower 32-bit
halves are the same and Zbkb is not available in riscv_move_integer.
riscv_split_integer already handles this but the changes in
riscv_build_integer make it possible to improve code generation for
negative values
On Linux/x86_64,
ab18785840d7b8afd9f716bab9d1eab415bc4fe9 is the first bad commit
commit ab18785840d7b8afd9f716bab9d1eab415bc4fe9
Author: Manolis Tsamis
Date: Tue Jun 25 08:00:04 2024 -0700
Rearrange SLP nodes with duplicate statements [PR98138]
caused
FAIL: gcc.target/i386/pr105493.c sc
Hi Martin, Jens, Joseph,
On Thu, Aug 08, 2024 at 06:30:42PM GMT, Martin Uecker wrote:
> On Thursday, 08.08.2024 at 18:23 +0200, Jens Gustedt wrote:
> > As said, even if we don't consider this problematic because we are used to
> > the mildly complex case distinction that you just exposed o
Bootstrapped and regtested on x86_64-pc-linux-gnu, does this
look OK for trunk/14?
-- >8 --
This implements the inherited vs non-inherited guide tiebreaker
specified by P2582R1. In order to track inherited-ness of a guide
it seems natural to reuse the lang_decl_fn::context field that already
tra
Applied this fix to trunk and v14 branch.
Johann
--
AVR: target/116295 - Fix unrecognizable insn with __flash read.
Some loads from non-generic address-spaces are performed by
libgcc calls, and they don't have a POST_INC form. Don't consider
such insns when running -mfuse-add.
PR tar
Some post-inc address adjustments can be skipped when the
address register is unused after.
Johann
--
AVR: Improve POST_INC output in some rare cases.
gcc/
* config/avr/avr.cc (avr_insn_has_reg_unused_note_p): New function.
(_reg_unused_after): Use it to recognize more cases.
On Thu, 8 Aug 2024, Jens Gustedt wrote:
> No, the ambiguity is there because the first ( after the keyword could
> start either a type in parenthesis or an expression, and among these a
> compound literal. If that first parenthesis would be part of the
> construct (as for the typeof or offsetof
Applied as obvious.
Johann
--
AVR: Fix a typo in __builtin_avr_mask1 documentation.
gcc/
* doc/extend.texi (AVR Built-in Functions) : Fix a typo.
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 89fe5db7aed..ae1ada3cdf8 100644
--- a/gcc/doc/extend.texi
+++ b/g
On Thu, Aug 08, 2024 at 11:09:10AM +0200, Mikael Morin wrote:
>
> These patches are about inlining, there is no manipulation of the parse
> tree. So I would rather use a separate option (-finline-intrinsics?).
I've only followed the discussion from afar, but gcc already supports
a -finline and -
Andrew Carlotti writes:
> The availability of ls64 intrinsics and data types was determined
> solely by the globally specified architecture features, which did not
> reflect any changes specified in target pragmas or attributes.
>
> This patch removes the initialisation-time guards for the intrin
As said, even if we don't consider this problematic because we are used to the
mildly complex case distinction that you just exposed over several paragraphs,
it doesn't mean that we should do it, nor does it mean that it would be
beneficial for our users or for other implementations that would l
Andrew Carlotti writes:
> The availability of memtag intrinsics and data types was determined
> solely by the globally specified architecture features, which did not
> reflect any changes specified in target pragmas or attributes.
>
> This patch removes the initialisation-time guards for the intr
On Thursday, 08.08.2024 at 18:23 +0200, Jens Gustedt wrote:
> As said, even if we don't consider this problematic because we are used to
> the mildly complex case distinction that you just exposed over several
> paragraphs, it doesn't mean that we should
> do it, nor does it mean that it w
The metadata for RDNA3 kernels allocates VGPRs in blocks of 12, which means the
maximum usable number of registers is 252. This patch prevents the compiler
from exceeding this artificial limit.
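The 252 figure follows from rounding the register file down to a whole number of blocks. A small sketch of that arithmetic, assuming a 256-entry VGPR file (the 256 is an assumption here, not stated in the message) and a hypothetical helper name:

```c
/* Hypothetical helper: largest register count that is a whole number
   of allocation blocks.  */
unsigned max_usable_regs(unsigned total_regs, unsigned block_size)
{
    return (total_regs / block_size) * block_size;
}
```

Under that assumption, max_usable_regs(256, 12) gives 21 full blocks, that is 252 registers, leaving the last 4 unusable.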
gcc/ChangeLog:
* config/gcn/gcn.cc (gcn_conditional_register_usage): Fix registers
rema
On 8 August 2024 17:42:54 CEST, Martin Uecker wrote:
> On Thursday, 08.08.2024 at 16:56 +0200, Jens Gustedt wrote:
> > On 8 August 2024 13:28:57 CEST, Joseph Myers wrote:
> > > On Thu, 8 Aug 2024, Alejandro Colomar wrote:
> > >
> > > > Hi Jens,
> > > >
> > > > On Thu, Aug 08, 2024 at
Andrew Carlotti writes:
> The availability of tme intrinsics was previously gated at both
> initialisation time (using global target options) and usage time
> (accounting for function-specific target options). This patch removes
> the check at initialisation time, and also moves the intrinsics ou
From: Andi Kleen
It is using a class now with a different name.
I will commit as obvious unless someone complains
Also I included this patch by mistake in my earlier if conversion v2
patch. Please ignore that hunk there.
gcc/ChangeLog:
* doc/cfg.texi: Fix references to dom_walker.
---
The gimple-if-to-switch pass converts if statements with
multiple equal checks on the same value to a switch. This breaks
vectorization which cannot handle switches.
Teach the tree-if-conv pass used by the vectorizer to handle
simple switch statements, like those created by if-to-switch earlier.
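For illustration only, the two functions below show the kind of rewrite described: a chain of equality tests on one value, and an equivalent switch of the sort if-to-switch might produce. The function names and constants are invented.

```c
/* Chain of equality checks on the same value.  */
int classify_if(int x)
{
    if (x == 1 || x == 3 || x == 7)
        return 10;
    return 20;
}

/* Equivalent switch form: the three case labels share one result and
   nothing falls through into a different result.  */
int classify_switch(int x)
{
    switch (x) {
    case 1:
    case 3:
    case 7:
        return 10;
    default:
        return 20;
    }
}
```

Both forms compute the same result for every input; the difference that matters to the vectorizer is only the shape of the control flow it has to convert.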
T
Andrew Carlotti writes:
> Move SVE extension checking functionality to aarch64-builtins.cc, so
> that it can be shared by non-SVE intrinsics.
>
> gcc/ChangeLog:
>
> * config/aarch64/aarch64-sve-builtins.cc (check_builtin_call)
> (expand_builtin): Update calls to the below.
> (rep
On Thursday, 08.08.2024 at 16:56 +0200, Jens Gustedt wrote:
> On 8 August 2024 13:28:57 CEST, Joseph Myers wrote:
> > On Thu, 8 Aug 2024, Alejandro Colomar wrote:
> >
> > > Hi Jens,
> > >
> > > On Thu, Aug 08, 2024 at 11:13:02AM GMT, Jens Gustedt wrote:
> > > > > but to maintain expecta
The predicates for checking an IDENTIFIER node's cp_identifier_kind
currently directly test the three flag bits that encode the kind. This
patch instead makes the checks first reconstruct the cp_identifier_kind
in its entirety and then compare that.
gcc/cp/ChangeLog:
* cp-tree.h (get_ide
DECL_UNINSTANTIATED_TEMPLATE_FRIEND_P templates can only appear as part
of a template friend declaration, and in turn get partially instantiated
only from tsubst_friend_function or tsubst_friend_class. So rather than
having tsubst_template_decl clear the flag, let's leave it up to the
tsubst frien
Richard Biener writes:
>> On 08.08.2024 at 15:12, Richard Sandiford wrote:
>>>PR tree-optimization/116274
>>>* tree-vect-slp.cc (vect_bb_slp_scalar_cost): Cost scalar loads
>>>and stores as simple scalar stmts when they access a non-global,
>>>not address-taken variable that does
Hi Saurabh,
> On 7 Aug 2024, at 17:11, saurabh@arm.com wrote:
>
> External email: Use caution opening links or attachments
>
>
> The AArch64 FEAT_FAMINMAX extension is optional from Armv9.2-a and
> mandatory from Armv9.5-a. It introduces instructions for computing the
> floating point absolute
> On 08.08.2024 at 15:12, Richard Sandiford wrote:
>
> Richard Biener writes:
>> The following tries to address that the vectorizer fails to have
>> precise knowledge of argument and return calling conventions and
>> views some accesses as loads and stores that are not.
>> This is mainly im
On 8 August 2024 13:28:57 CEST, Joseph Myers wrote:
> On Thu, 8 Aug 2024, Alejandro Colomar wrote:
>
> > Hi Jens,
> >
> > On Thu, Aug 08, 2024 at 11:13:02AM GMT, Jens Gustedt wrote:
> > > > but to maintain expectations, I think it would be better to do
> > > > the same here.
> > >
> > > Just
On 8/8/24 8:34 AM, Christoph Müllner wrote:
On Wed, Aug 7, 2024 at 4:48 PM Jeff Law wrote:
On 8/7/24 12:27 AM, Christoph Müllner wrote:
Test file xtheadfmemidx-medany.c has been added in b79cd204c780 as a
test case that provoked an ICE when loading DFmode registers via two
SImode registe
On Wed, Aug 7, 2024 at 4:48 PM Jeff Law wrote:
>
>
>
> On 8/7/24 12:27 AM, Christoph Müllner wrote:
> > Test file xtheadfmemidx-medany.c has been added in b79cd204c780 as a
> > test case that provoked an ICE when loading DFmode registers via two
> > SImode register loads followed by a SI->DF[63:32
The availability of memtag intrinsics and data types was determined
solely by the globally specified architecture features, which did not
reflect any changes specified in target pragmas or attributes.
This patch removes the initialisation-time guards for the intrinsics,
and replaces them with che
The availability of ls64 intrinsics and data types was determined
solely by the globally specified architecture features, which did not
reflect any changes specified in target pragmas or attributes.
This patch removes the initialisation-time guards for the intrinsics,
and replaces them with check
From: Justin Squirek
This patch further enhances the mutably tagged type implementation by fixing
several oversights relating to generic instantiations, attributes, and
type conversions.
gcc/ada/
* exp_put_image.adb (Append_Component_Attr): Obtain the mutably
tagged type for the
From: Steve Baird
If the primitive equality operator of the component type of an array type is
abstract, then a call to that abstract function raises Program_Error (when
such a call is legal). The FE generates a raise expression to implement this.
That raise expression is an expression so it shou
From: Steve Baird
An access discriminant is allowed to have a default value only if the
discriminated type is immutably limited. In the case of a discriminated
limited private type declaration, this rule needs to be checked when
the completion of the type is seen.
gcc/ada/
* sem_ch6.adb
From: Justin Squirek
This patch fixes an issue in the compiler whereby disabling style checks via
pragma Style_Checks ("-L") resulted in the minimum nesting level being zero
but the style still being enabled - leading to spurious maximum nesting level
exceeded warnings.
gcc/ada/
* style
The availability of tme intrinsics was previously gated at both
initialisation time (using global target options) and usage time
(accounting for function-specific target options). This patch removes
the check at initialisation time, and also moves the intrinsics out of
the header file to allow for
From: Gary Dismukes
When unnesting is enabled, the compiler was failing to copy the At_End_Proc
field from a block statement to the procedure created to replace it when
unnesting of top-level blocks is done. At run time this could lead to
exceptions due to missing finalization calls.
gcc/ada/
From: Javier Miranda
When the attribute Finalization_Size is applied to an interface type
object, the compiler-generated code fails at runtime, raising a
Constraint_Error exception.
gcc/ada/
* exp_attr.adb (Expand_N_Attribute_Reference) :
If the prefix is an interface type, gene
Move SVE extension checking functionality to aarch64-builtins.cc, so
that it can be shared by non-SVE intrinsics.
gcc/ChangeLog:
* config/aarch64/aarch64-sve-builtins.cc (check_builtin_call)
(expand_builtin): Update calls to the below.
(report_missing_extension, check_requ
This series of patches fixes issues with some intrinsics being incorrectly
gated by global target options, instead of just using function-specific target
options. These issues have been present since the +tme, +memtag and +ls64
intrinsics were introduced.
This series is a rebased and fixed versio
On 8/8/24 06:21, Tobias Burnus wrote:
Update for the very recently released TR13. Unsurprisingly, most items
are still unimplemented.
→ https://www.openmp.org/specifications/ → Technical Report 13
Comments, suggestions, typo fixes? — If not, I will commit it later today.
I've got a few things
> > But your comment made me realize there is a major bug.
> >
> > if_convertible_switch_p also needs to check that the labels don't fall
> > through, so the flow graph is diamond shape. Need some easy way to
> > verify that.
>
> Do we verify this for if()s? That is,
No we do not. Afte
Paul-Antoine Arras wrote:
This patch introduces the OMP_DISPATCH tree node, as well as two new clauses
`nocontext` and `novariants`. It defines/exposes interfaces that will be
used in subsequent patches that add front-end and middle-end support, but
nothing generates these nodes yet.
LGTM - tha
On 8/8/24 06:20, Jakub Jelinek wrote:
On Thu, Aug 08, 2024 at 02:18:48PM +0200, Tobias Burnus wrote:
Document -fno-builtin-omp_is_initial_device as discussed:
Jakub Jelinek wrote:
RFC: Should we document this new built-in somewhere? If so, where? As part
of the routine description in libgomp
Previously, trampolines worked on GCN3 devices, but the newer GCN5
devices had different permissions on the stack memory space we were
using.
That changed when we added the reverse-offload features because we
switched from using the "private" memory space to using a regular memory
allocation.
The
On Thu, Aug 8, 2024 at 6:11 AM Prathamesh Kulkarni
wrote:
>
> Hi Richard,
> After differing NUM_POLY_INT_COEFFS fix for AArch64/nvptx offloading, the
> following minimal test:
>
> int main()
> {
> int x;
> #pragma omp target map(x)
> x = 5;
> return x;
> }
>
> compiled with -fopenmp -fo
Richard Biener writes:
> The following tries to address that the vectorizer fails to have
> precise knowledge of argument and return calling conventions and
> views some accesses as loads and stores that are not.
> This is mainly important when doing basic-block vectorization as
> otherwise loop i
Hi Richard,
After differing NUM_POLY_INT_COEFFS fix for AArch64/nvptx offloading, the
following minimal test:
int main()
{
int x;
#pragma omp target map(x)
x = 5;
return x;
}
compiled with -fopenmp -foffload=nvptx-none now fails with:
gcc: error: unrecognized command-line option '-m64'
On 8/8/24 7:59 AM, Nathaniel Shead wrote:
Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?
OK.
The change to 'finish_struct_bits' is not required for this PR but I
felt it was a nice cleanup; happy to commit without it though if
preferred.
-- >8 --
This has caused issues wit
On 8/8/24 8:06 AM, Nathaniel Shead wrote:
Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?
OK.
-- >8 --
While stepping through some code I noticed that we do some extra work
(finding the originating module decl, stripping the template, and
inspecting the attached-ness) for ev
Update for the very recently released TR13. Unsurprisingly, most items
are still unimplemented.
→ https://www.openmp.org/specifications/ → Technical Report 13
Comments, suggestions, typo fixes? — If not, I will commit it later today.
Tobias
libgomp.texi: Update implementation status table for O
On Thu, Aug 08, 2024 at 02:18:48PM +0200, Tobias Burnus wrote:
> Document -fno-builtin-omp_is_initial_device as discussed:
>
> Jakub Jelinek wrote:
> > > RFC: Should we document this new built-in somewhere? If so, where? As
> > > part
> > > of the routine description in libgomp.texi? Or in exte
Document -fno-builtin-omp_is_initial_device as discussed:
Jakub Jelinek wrote:
RFC: Should we document this new built-in somewhere? If so, where? As part
of the routine description in libgomp.texi? Or in extend.texi (or even
invoke.texi)?
I think libgomp.texi in the omp_is_initial_device desc
On Tue, Aug 6, 2024 at 12:38 PM Manolis Tsamis wrote:
>
> Pinging this for a review and/or further feedback.
>
> Thanks,
> Manolis
>
> On Wed, Jun 26, 2024 at 3:06 PM Manolis Tsamis
> wrote:
> >
> > This change checks when a two_operators SLP node has multiple occurrences of
> > the same stateme
On Mon, Aug 5, 2024 at 4:02 PM Juergen Christ wrote:
>
> On Mon, Aug 05, 2024 at 01:00:31PM +0200, Richard Biener wrote:
> > On Fri, Aug 2, 2024 at 2:43 PM Juergen Christ wrote:
> > >
> > > Do not convert floats to ints in multiple step if trapping math is
> > > enabled. This might hide some in
Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?
-- >8 --
While stepping through some code I noticed that we do some extra work
(finding the originating module decl, stripping the template, and
inspecting the attached-ness) for every declaration taken from a header
unit. This doe
On Sat, Aug 3, 2024 at 2:42 PM Feng Xue OS wrote:
>
> >> 1. Background
> >>
> >> For loop reduction of accumulating result of a widening operation, the
> >> preferred pattern is lane-reducing operation, if supported by target.
> >> Because
> >> this kind of operation need not preserve intermediat
Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?
The change to 'finish_struct_bits' is not required for this PR but I
felt it was a nice cleanup; happy to commit without it though if
preferred.
-- >8 --
This has caused issues with modules when an import fills in the
definition of
On 2024-08-08 09:04, Richard Biener wrote:
On Thu, Aug 8, 2024 at 4:55 AM Peter Damianov
wrote:
Currently, if a warning references a cloned function, the name of the
cloned
function will be emitted in the "In function 'xyz'" part of the
diagnostic,
which users aren't supposed to see. This pa
On Thu, 8 Aug 2024, Alejandro Colomar wrote:
> Hi Jens,
>
> On Thu, Aug 08, 2024 at 11:13:02AM GMT, Jens Gustedt wrote:
> > > but to maintain expectations, I think it would be better to do
> > > the same here.
> >
> > Just to compare, the recent additions in C23 typeof etc. only have the
> > par