Pointer semantics in GIMPLE

2025-04-05 Thread Krister Walfridsson via Gcc
I'm working on ensuring that the GIMPLE semantics used by smtgcc are 
correct, and I have a lot of questions about the details. I'll be sending 
a series of emails with these questions. This first one is about pointers 
in general.


Each section starts with a description of the semantics I've implemented 
(or plan to implement), followed by concrete questions if relevant. Let me 
know if the described semantics are incorrect or incomplete.



Pointer values
---
C imposes restrictions on pointer values (alignment, must point to valid 
memory, etc.), but GIMPLE seems more permissive. For example, when 
vectorizing a loop:


  void foo(int *p, int n)
  {
    for (int i = 0; i < n; i++)
      p[i] = 1;
  }

the vectorized loop updates the pointer vectp_p.8 for the next iteration 
at the end of the loop, causing it to point up to a vector-length out of 
range after the last iteration:


   :
  # vectp_p.8_34 = PHI 
  # ivtmp_37 = PHI 
  MEM  [(int *)vectp_p.8_34] = { 1, 1, 1, 1 };
  vectp_p.8_35 = vectp_p.8_34 + 16;
  ivtmp_38 = ivtmp_37 + 1;
  if (ivtmp_38 < bnd.5_31)
goto ;
  else
goto ;

   :
  goto ;

Because of this, I allow all values -- essentially treating pointers as 
unsigned integers. Things like ensuring pointers point to valid memory or 
have correct alignment are only checked when dereferencing the pointer 
(I'll discuss pointer dereferencing semantics in the next email).



Pointer arithmetic -- POINTER_PLUS_EXPR
---
POINTER_PLUS_EXPR calculates p + x:
 * It is UB if the calculation overflows when
   TYPE_OVERFLOW_WRAPS(ptr_type) is false.
 * If flag_delete_null_pointer_checks is true, it is UB if p + x == 0,
   unless p == 0 and x == 0.

Since the pointer type is unsigned, it's not entirely clear what overflow 
means. For example, p - 1 is represented as p + 0x, which 
overflows when treated as an unsigned addition. I check for overflow in 
p + x as follows:

  is_overflow = ((intptr_t)x < 0) ? (p + x) > p : (p + x) < p;
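
A minimal sketch of the check above as a standalone C function, assuming 
pointers are modelled as uintptr_t values and that the addition wraps 
modulo 2^N. The name ptr_plus_overflows is hypothetical and is not part of 
smtgcc or GCC.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: models the overflow check for p + x, where both
   operands are pointer-sized unsigned integers.  */
static bool ptr_plus_overflows(uintptr_t p, uintptr_t x)
{
    uintptr_t sum = p + x;  /* unsigned addition wraps modulo 2^N */

    /* A "negative" offset must move the pointer down, a non-negative
       offset must not move it down.  */
    return ((intptr_t)x < 0) ? sum > p : sum < p;
}
```

Under this model, p - 1 (x with all bits set) is accepted for any nonzero 
p, while adding 1 to the highest representable address is flagged.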


Pointer arithmetic -- MEM_REF and TARGET_MEM_REF
---
A calculation p + x can also be done using MEM_REF as &MEM[p + x].

Question: smtgcc does not check for overflow in address calculations for 
MEM_REF and TARGET_MEM_REF. Should it treat overflow as UB in the same way 
as POINTER_PLUS_EXPR?



Pointer arithmetic -- COMPONENT_REF
---
COMPONENT_REF adds the offset of a structure element to the pointer to the 
structure.


Question: smtgcc does not check for overflow in address calculation for 
COMPONENT_REF. Should it treat overflow as UB in the same way as 
POINTER_PLUS_EXPR?



Pointer arithmetic -- ARRAY_REF
---
ARRAY_REF adds the offset of the indexed element to the pointer to the 
array.


 * It is UB if the index is negative.
 * If the TYPE_DOMAIN for the array type has a TYPE_MAX_VALUE and the
   array is not a flexible array, it is UB if the index exceeds "one
   after" this maximum value.
 * If it is a flexible array or does not have a maximum value, it is
   considered an unsized array, so all non-negative indices are valid.
   But it is UB if the calculation of
  offset = index * element_size
   overflows.
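
The three ARRAY_REF rules above can be sketched as a small C predicate. 
This is only an illustration: struct array_model and array_ref_is_ub are 
hypothetical names, not GCC data structures, and the sketch assumes the 
offset multiplication is checked in a 64-bit unsigned type.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the properties of an array type that matter here. */
struct array_model {
    bool     has_max;      /* TYPE_DOMAIN has a TYPE_MAX_VALUE */
    bool     is_flexible;  /* array is treated as a flexible array */
    int64_t  max_index;    /* TYPE_MAX_VALUE, only meaningful if has_max */
    uint64_t elem_size;    /* element size in bytes */
};

/* Returns true if indexing with INDEX is UB under the rules listed above. */
static bool array_ref_is_ub(const struct array_model *a, int64_t index)
{
    if (index < 0)
        return true;

    /* Sized, non-flexible array: indices up to "one after" the maximum
       are allowed.  index - 1 cannot overflow because index >= 0.  */
    if (a->has_max && !a->is_flexible)
        return index - 1 > a->max_index;

    /* Unsized array: only overflow of index * element_size is UB
       (assumption: checked in 64-bit unsigned arithmetic).  */
    if (a->elem_size != 0 && (uint64_t)index > UINT64_MAX / a->elem_size)
        return true;

    return false;
}
```

For an int[4] model (max_index 3, elem_size 4), index 4 is accepted as the 
"one after" position and index 5 is rejected.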

Question: smtgcc does not check for overflow in the calculation of p + 
offset. Should it treat overflow as UB in the same way as 
POINTER_PLUS_EXPR?


Question: What is the correct way to check for flexible arrays? I am 
currently using array_ref_flexible_size_p and flag_strict_flex_arrays, but 
I noticed that middle-end passes do not seem to use 
flag_strict_flex_arrays. Is there a more canonical way to determine how to 
interpret flexible arrays?



Pointer arithmetic -- POINTER_DIFF_EXPR
---
Subtracting a pointer q from a pointer p is done using POINTER_DIFF_EXPR.
 * It is UB if the difference does not fit in a signed integer with the
   same precision as the pointers.

This implies that an object's size must be less than half the address 
space; otherwise, POINTER_DIFF_EXPR cannot be used to compute sizes in C. 
But there may be additional restrictions. For example, compiling the 
function:


  void foo(int *p, int *q, int n)
  {
    for (int i = 0; i < n; i++)
      p[i] = q[i] + 1;
  }

causes the vectorized code to perform overlap checks like:

  _7 = q_11(D) + 4;
  _25 = p_12(D) - _7;
  _26 = (sizetype) _25;
  _27 = _26 > 8;
  _28 = _27;
  if (_28 != 0)
goto ;
  else
goto ;

which takes the difference between two pointers that may point to 
different objects. This suggests that all objects must fit within half the 
address space.
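
The rule stated for POINTER_DIFF_EXPR (the difference must fit in a signed 
integer of pointer precision) can be sketched in C as follows. Pointers 
are modelled as uintptr_t values; ptr_diff_is_ub is a hypothetical name 
used only for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if the mathematical value p - q does not fit
   in a signed integer with the same precision as the pointers.  */
static bool ptr_diff_is_ub(uintptr_t p, uintptr_t q)
{
    if (p >= q)
        return p - q > (uintptr_t)INTPTR_MAX;       /* positive side */

    /* Negative side: magnitudes up to INTPTR_MAX + 1 are representable. */
    return q - p > (uintptr_t)INTPTR_MAX + 1;
}
```

Under this model, two addresses more than half the address space apart 
always make the difference UB, which is why the overlap check above is 
only safe if every pair of objects it may be applied to satisfies the 
size restriction.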


Question: What are the restrictions on valid address ranges and object 
sizes?



Issues
--
The semantics I've described above result in many reports of 
miscompilations that I haven't reported yet.


As mentioned earlier, the vectorizer can use POINTER_PLUS_EXPR to generate 
pointers that extend

Re: GSoC 2025: In-Memory Filesystem for GPU Offloading Tests

2025-04-05 Thread Arijit Kumar Das via Gcc
Hi,

> Let us know if you need further help; we understand it's not trivial to
> get this set up at first.

Sure! To be honest, I haven't had time to completely set up the toolchain
yet (due to classes and ongoing mid-semester examinations). I plan to
finish it as soon as I get some time. I have set up the development
environment though. I have installed Debian 12 (multiboot) onto my laptop
and installed all necessary packages. Newlib and GCC sources are at my hand.

> Do you have access to a system with an AMD GPU supported by GCC, or any
> Nvidia GPU?

Yes, my laptop has an Nvidia RTX 4050 GPU, which I believe should work for
nvptx. The only thing that concerns me here, though, is that I was unable
to get the Nvidia drivers to work. I installed and reinstalled them a couple
of times, but they still won't load. I currently do not have them installed,
but I suppose I'll do something about it soon. Or will the preinstalled
nouveau driver work just fine?

> (I assume, by now you've found the newlib source code?)

Yes! I have browsed through the newlib sources. The sources that we may be
concerned with are apparently at newlib/libc/machine/nvptx. Since I do not
own an AMD GPU, I guess I should probably focus on nvptx. Right?


> Actually, only for GCN: 'gcc/config/gcn/gcn-run.cc'.  For nvptx, it's
> part of : 'nvptx-run.cc'.

Fine, I'll check this out!


I also wanted to ask: I'm currently preparing a draft proposal. Would it
be alright if I send it here in this thread for your review?


Best regards,

Arijit

On Wed, 2 Apr, 2025, 8:01 pm Thomas Schwinge, 
wrote:

> Hi Arijit, Andrew!
>
> Arijit, welcome to GCC!
>
> On 2025-03-11T03:26:44+0530, Arijit Kumar Das via Gcc 
> wrote:
> > Thank you for the detailed response! This gives me a much clearer picture
> > of how things work.
> >
> > Regarding the two possible approaches:
> >
> >- I personally find *Option A (self-contained in-memory FS)* more
> >interesting, and I'd like to work on it first.
>
> Sure, that's fine.  ..., and we could then still put Option B on top, in
> case that just Option A should turn out to be too easy for you.  ;-)
>
> >- However, if *Option B (RPC-based host FS access)* is the preferred
> >approach for GSoC, I’d be happy to work on that as well.
>
> > For now, I’ll begin setting up the toolchain and running simple OpenMP
> > target kernels as suggested. Thanks again for your guidance!
>
> Let us know if you need further help; we understand it's not trivial to
> get this set up at first.
>
> Do you have access to a system with an AMD GPU supported by GCC, or any
> Nvidia GPU?
>
> Just a few more comments in addition to Andrew's very useful remarks.
> (Thank you, Andrew!)
>
> > On Mon, 10 Mar, 2025, 10:55 pm Andrew Stubbs,  wrote:
> >> On 10/03/2025 15:37, Arijit Kumar Das via Gcc wrote:
> >> > I have carefully read the project description and understand that the
> >> > goal is to modify *newlib* and the *run tools* to redirect system calls
> >> > for file I/O operations to a virtual, volatile filesystem in host memory,
> >> > as the GPU
>
> Instead of "in host memory", it should be "in GPU memory" (for Option A).
>
> >> > lacks its own filesystem. Please correct me if I’ve misunderstood any
> >> > aspect.
>
> >> > I have set up the GCC source tree and am currently browsing relevant
> >> > files in the *gcc/testsuite* directory. However, I am unsure *where the
> >> > run tools source files are located and how they interact with newlib
> >> > system calls.*
>
> >> Newlib isn't part of the GCC repo, so if you
> >> can't find the files then that's probably why!
>
> (I assume, by now you've found the newlib source code?)
>
> >> The "run" tools are installed as part of the offload toolchain, albeit
> >> hidden under the "libexec" directory because they're really only used
> >> for testing. You can find the sources with the config/nvptx or
> >> config/gcn backend files.
>
> Actually, only for GCN: 'gcc/config/gcn/gcn-run.cc'.  For nvptx, it's
> part of : 'nvptx-run.cc'.
>
> >> Currently, system calls such as "open" simply return EACCES
> >> ("permission denied") so the stub implementations are fairly easy to
> >> understand (e.g. newlib/libc/sys/amdgcn/open.c).
>
> (I assume, by now you've found the corresponding nvptx code in newlib?)
>
> >> The task would be to
> >> insert new code there that actually does something.  You do not need to
> >> modify the compiler itself.
>
>
> Grüße
>  Thomas
>
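
For readers following the thread, here is a minimal sketch of what a stub 
of the kind described above might look like. It is not the newlib source: 
the name stub_open is hypothetical, and the only behaviour taken from the 
thread is that the call fails with "permission denied".

```c
#include <errno.h>

/* Hypothetical stub: every attempt to open a file fails with EACCES,
   because the GPU target has no filesystem behind it.  */
static int stub_open(const char *path, int flags, ...)
{
    (void)path;   /* unused: there is nothing to look up */
    (void)flags;  /* unused: no mode is ever honoured */
    errno = EACCES;
    return -1;
}
```

An in-memory filesystem would replace the body of such a stub with code 
that looks the path up in a table held in GPU memory.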


Re: Using nonzero_bits() in insn conditions?

2025-04-05 Thread Georg-Johann Lay via Gcc

Am 21.03.25 um 19:16 schrieb Georg-Johann Lay via Gcc:

Am 21.03.25 um 01:02 schrieb Jeff Law:

On 3/19/25 4:14 AM, Georg-Johann Lay wrote:

Am 16.03.25 um 14:51 schrieb Jeff Law via Gcc:

On 3/13/25 5:39 AM, Georg-Johann Lay via Gcc wrote:

There are situations where knowledge about which bits
of a value are (not) set can be used for optimization.
For example in an insn combine pattern like:

(define_insn_and_split ""
  [(set (match_operand:QI 0 "register_operand" "=d")
        (ior:QI (ashift:QI (match_operand:QI 1 "register_operand" "r")
                           (match_operand:QI 2 "const_0_to_7_operand" "n"))
                (match_operand:QI 3 "register_operand" "0")))]
  "optimize
   && !reload_completed
   && nonzero_bits (operands[1], VOIDmode) == 1"
...

This pattern is only correct when operands[1] is 0 or 1.
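
To see why a nonzero_bits() result of 1 corresponds to "operands[1] is 0 
or 1", the mask can be modelled in C as the set of bit positions that may 
be nonzero. The helper names below are hypothetical and the sketch is 
only a model of the mask semantics, not GCC code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: MAY_BE_NONZERO has a 1 in every bit position that
   might be set; all other bits are known to be zero.  */
static bool value_allowed_by_mask(uint8_t value, uint8_t may_be_nonzero)
{
    return (value & (uint8_t)~may_be_nonzero) == 0;
}

/* Counts how many 8-bit values are consistent with the mask. */
static int count_allowed_values(uint8_t may_be_nonzero)
{
    int count = 0;
    for (int v = 0; v <= 255; v++)
        if (value_allowed_by_mask((uint8_t)v, may_be_nonzero))
            count++;
    return count;
}
```

With a mask of 1 only the values 0 and 1 remain possible. If a later pass 
recomputes a weaker mask for the same register, the insn condition stops 
holding even though the value itself has not changed.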

While such patterns seem to work, it's all quite wonky,
in particular since nonzero_bits() may forget about known
properties in later passes.
While it works most of the time, it's fundamentally wrong to have a 
pattern where the conditional is dependent on state that changes 
based on pass specific data, nearby context, etc.




For the use case I have in mind, it is in order when the
pattern works until split1 which would transform it into
something else (and without nonzero_bits() in the insn
condition, asserting that the existence of the pattern
certifies the bit condition).
It's still the wrong thing to do.  You'll get away with it for a 
while, but one day it'll break.


We have similar problems in the RISC-V world where we would like to 
be able to match certain patterns based on known ranges of an 
operand. The most common case would be bset/bclr/binv on an SImode 
object on rv64 where the bit twiddled is variable.  In particular we 
need to know the bit position is not bit 31.


There's no way to really describe that in an insn's condition 
because range information like that isn't available in RTL and 
something like nonzero bits is pass specific.


As a result we're limited in our ability to use the bset/bclr/binv 
instructions.


Jeff


One way to support this is a new target hook that would run somewhere
in recog_for_combine().  The hook would allow the backend to replace
the pattern as synthesized by combine with an equivalent pattern.
Much simpler: Add a split pass immediately after combine.  Use 
define_insn_and_split to handle rewriting.  No hooks needed.


Jeff


Unfortunately, that doesn't work:

.../libgcc/config/avr/libf7/libf7.c: In function '__f7_get_float':
.../libgcc/config/avr/libf7/libf7.c:354:1: error: wrong amount of branch 
edges after unconditional jump 18

   354 | }
   | ^
during RTL pass: avr-split-after-combine
.../libgcc/config/avr/libf7/libf7.c:354:1: internal compiler error: 
verify_flow_info failed

0x1df2e91 internal_error(char const*, ...)
.../gcc/diagnostic-global-context.cc:517
0xa580fe verify_flow_info()
.../gcc/cfghooks.cc:287
0xf084c8 checking_verify_flow_info()
.../gcc/cfghooks.h:214
0xf084c8 split_all_insns()
.../gcc/recog.cc:3608
0xf084e8 execute
.../gcc/recog.cc:4507

This used a clone of pass_split_all_insns which runs
checking_verify_flow_info() at the end. passes.def reads:

   NEXT_PASS (pass_combine);
   NEXT_PASS (pass_late_combine);
   NEXT_PASS (pass_if_after_combine);
   NEXT_PASS (pass_jump_after_combine);
   NEXT_PASS (pass_partition_blocks);
   NEXT_PASS (pass_outof_cfg_layout_mode);
   NEXT_PASS (pass_split_all_insns);

So just cloning pass_split_all_insns won't work.
But using split_all_insns_noflow() instead should
do the trick then?

Johann


...unfortunately, using split_all_insns_noflow() doesn't work, either:

sreg.c:136:1: error: flow control insn inside a basic block
(jump_insn 67 66 70 2 (set (pc)
(if_then_else (eq (zero_extract:QI (reg:QI 112 [ sreg ])
(const_int 1 [0x1])
(const_int 0 [0]))
(const_int 0 [0]))
(label_ref 69)
(pc))) "sreg.c":125:20 -1
 (nil)
 -> 69)
during RTL pass: avr-split-after-combine
dump file: sreg.c.305r.avr-split-after-combine
sreg.c:136:1: internal compiler error: in rtl_verify_bb_insns, at 
cfgrtl.cc:2836


So something more sophisticated than just cloning a pass is needed.
Any ideas?

Johann



'TREE_READONLY' for 'const' array in C vs. C++

2025-04-05 Thread Thomas Schwinge
Hi!

In Nvidia PTX, "A state space is a storage area with particular
characteristics.  All variables reside in some state space.  [...]".
These include:

.const  Shared, read-only memory.
.global  Global memory, shared by all threads.

Implemented via 'TARGET_ENCODE_SECTION_INFO', GCC/nvptx then uses
special-cased instructions for accessing the respective memory regions.

Now, given a 'const' array (with whatever element type; not interesting
here), like:

extern int * const arr[];

..., for GCC/C compilation, we access this as '.const' memory: GCC/nvptx
'DATA_AREA_CONST', but for GCC/C++ compilation, we access it as
'DATA_AREA_GLOBAL', and then fault at run time due to mismatch with the
definition, which actually is '.const' for both C and C++ compilation.

The difference is, when we get to 'TARGET_ENCODE_SECTION_INFO', that for
C we've got 'TREE_READONLY(decl)', but for C++ we don't.

C:

Breakpoint 3, nvptx_encode_section_info (decl=0x77824720, 
rtl=0x77843180, first=1) at ../../source-gcc/gcc/config/nvptx/nvptx.cc:468
468 {
(gdb) call debug_tree(decl)
 
readonly unsigned DI
size 
unit-size 
align:64 warn_if_not_align:0 symtab:0 alias-set -1 
canonical-type 0x777f7d20>
BLK 
align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 
0x777f7f18>
readonly used public external read BLK 
source-gcc/gcc/testsuite/gcc.target/nvptx/const-1-2.c:8:20
align:64 warn_if_not_align:0 context 
(mem/u/c:BLK (symbol_ref:DI ("arr") ) [1 
arr+0 A64]) chain >

Note 'readonly' ('TREE_READONLY') in the third-last line,
and '/u' (RTL 'MEM_READONLY_P') in the last line.

C++:

Breakpoint 3, nvptx_encode_section_info (decl=0x7783fa18, 
rtl=0x77844a50, first=1) at ../../source-gcc/gcc/config/nvptx/nvptx.cc:468
468 {
(gdb) call debug_tree(decl)
 
readonly unsigned type_6 DI
size 
unit-size 
align:64 warn_if_not_align:0 symtab:0 alias-set -1 
canonical-type 0x7782fe70>
type_6 BLK
align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 
0x7782f738>
used public external read decl_2 BLK 
source-gcc/gcc/testsuite/g++.target/nvptx/const-1-2.C:14:20
align:64 warn_if_not_align:0 context 
(mem/c:BLK (symbol_ref:DI ("arr") ) [1 
arr+0 A64]) chain >

Note no 'readonly' ('!TREE_READONLY') in the third-last line,
and no '/u' (RTL '!MEM_READONLY_P') in the last line.

Is this difference expected?


Now, for example, in 'gcc/config/avr/avr.cc', I found code like:

tree node0 = node;

/* For C++, we have to peel arrays in order to get correct
   determination of readonlyness.  */

do
  node0 = TREE_TYPE (node0);
while (TREE_CODE (node0) == ARRAY_TYPE);

if (error_mark_node == node0)
  return;

[...]

if (!TYPE_READONLY (node0)
    && !TREE_READONLY (node))
  {

That is, in our case, instead of just looking at 'TREE_READONLY (node)',
we need:

if (TREE_READONLY (node) || TYPE_READONLY (node0))
  [DATA_AREA_CONST]
else
  [DATA_AREA_GLOBAL]
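
The array-peeling logic can be illustrated with a toy model in C. All 
names here (toy_type, toy_decl, toy_is_const_area) are hypothetical 
stand-ins for the tree nodes and macros discussed above; this is a sketch 
of the decision, not nvptx backend code.

```c
#include <stdbool.h>
#include <stddef.h>

enum toy_kind { TOY_ARRAY, TOY_OTHER };

/* Toy stand-in for a type node: 'readonly' plays the role of
   TYPE_READONLY, 'inner' the role of TREE_TYPE of an array type.  */
struct toy_type {
    enum toy_kind kind;
    bool readonly;
    const struct toy_type *inner;
};

/* Toy stand-in for a declaration: 'readonly' plays the role of
   TREE_READONLY on the decl.  */
struct toy_decl {
    bool readonly;
    const struct toy_type *type;
};

/* True if the decl would be placed in the const data area. */
static bool toy_is_const_area(const struct toy_decl *decl)
{
    const struct toy_type *t = decl->type;

    /* Peel array layers to reach the element type. */
    while (t != NULL && t->kind == TOY_ARRAY)
        t = t->inner;

    return decl->readonly || (t != NULL && t->readonly);
}
```

In this model the C++ case from above (decl not marked read-only, element 
type 'int * const' marked read-only) is classified as const, matching the 
C result.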

Is this indeed what we have to do?  (It would appear to work in the case
analyzed above, but I've not yet checked for other fallout.)


Grüße
 Thomas


gcc-12-20250403 is now available

2025-04-05 Thread GCC Administrator via Gcc
Snapshot gcc-12-20250403 is now available on
  https://gcc.gnu.org/pub/gcc/snapshots/12-20250403/
and on various mirrors, see https://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 12 git branch
with the following options: git://gcc.gnu.org/git/gcc.git branch 
releases/gcc-12 revision 3c4fbdbacd386e2bee5c826a0f75ccc2b2d34f3f

You'll find:

 gcc-12-20250403.tar.xz   Complete GCC

  SHA256=db8c673c7241fa346b4d81b5a52aa42e1f8ffec9a7c7d10b1dfdd5d710259c1e
  SHA1=422b61bb8209971e30ffba1010809963c7ab0a2e

Diffs from 12-20250327 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-12
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


Status of PDB support

2025-04-05 Thread Tom Kacvinsky via Gcc
Hi,

I know that PDB support has been worked on in binutils.  I think
the missing piece is to get GCC to emit CodeView debug information that
binutils will turn into a PDB (not sure if the work is complete in
binutils, either).

What's the status of this?  I ask because our Windows offering is
built with MSVC for C and C++, but a MinGW-w64 built GCC for Ada
compilation.  Having GCC + binutils generate PDB information would enable
us to use Visual Studio for one stop debugging.  As it is we have to use
gdb for Ada code (DWARF debug symbols) and Visual Studio for C/C++ debugging
(because PDB), which is a pain.

Thanks,

Tom


Remove duplication for the handling of attributes between different frontends

2025-04-05 Thread Antoni Boucher via Gcc

Hi.
We're trying to remove the duplication of the attributes code between 
the C and libgccjit frontend.
The attached patch shows a draft of this attempt that only supports a 
few attributes.
Would that kind of approach be acceptable (I'm not sure since this 
includes a c-family file in libgccjit)?

If not, do you have any idea of how we could do this?
Thanks.

From 320d91cd348f6bb2f2b9dbd7760d63a31f48984e Mon Sep 17 00:00:00 2001
From: Guillaume Gomez 
Date: Sun, 16 Mar 2025 23:34:02 +0100
Subject: [PATCH] Add support for `access` attribute

---
 gcc/c-family/c-attribs.cc |  3 +-
 gcc/c-family/c-common.h   |  1 +
 gcc/jit/dummy-frontend.cc | 69 ++-
 gcc/jit/jit-playback.cc   | 44 +
 gcc/jit/jit-playback.h|  4 +++
 gcc/jit/jit-recording.cc  | 35 
 gcc/jit/jit-recording.h   |  7 
 gcc/jit/libgccjit.cc  | 38 +
 gcc/jit/libgccjit.h   | 15 +
 gcc/jit/libgccjit.map |  5 +++
 10 files changed, 147 insertions(+), 74 deletions(-)

diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc
index a1c5d0c895b..29862854724 100644
--- a/gcc/c-family/c-attribs.cc
+++ b/gcc/c-family/c-attribs.cc
@@ -145,7 +145,6 @@ static tree handle_expected_throw_attribute (tree *, tree, tree, int, bool *);
 static tree handle_cleanup_attribute (tree *, tree, tree, int, bool *);
 static tree handle_warn_unused_result_attribute (tree *, tree, tree, int,
 		 bool *);
-static tree handle_access_attribute (tree *, tree, tree, int, bool *);
 
 static tree handle_sentinel_attribute (tree *, tree, tree, int, bool *);
 static tree handle_type_generic_attribute (tree *, tree, tree, int, bool *);
@@ -5399,7 +5398,7 @@ append_access_attr_idxs (tree node[3], tree attrs, const char *attrstr,
representing a VLA bound.  To speed up parsing, the handler transforms
the attribute and its arguments into a string.  */
 
-static tree
+tree
 handle_access_attribute (tree node[3], tree name, tree args, int flags,
 			 bool *no_add_attrs)
 {
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index a74530bafff..3a349ed5ae1 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -1668,6 +1668,7 @@ extern tree handle_musttail_attribute (tree *, tree, tree, int, bool *);
 extern bool has_attribute (location_t, tree, tree, tree (*)(tree));
 extern tree build_attr_access_from_parms (tree, bool);
 extern void set_musttail_on_return (tree, location_t, bool);
+extern tree handle_access_attribute (tree *, tree, tree, int, bool *);
 
 /* In c-format.cc.  */
 extern bool valid_format_string_type_p (tree);
diff --git a/gcc/jit/dummy-frontend.cc b/gcc/jit/dummy-frontend.cc
index c93df2e4796..544684356cd 100644
--- a/gcc/jit/dummy-frontend.cc
+++ b/gcc/jit/dummy-frontend.cc
@@ -21,6 +21,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "config.h"
 #include "system.h"
 #include "coretypes.h"
+#include "c-family/c-common.h"
 #include "target.h"
 #include "jit-playback.h"
 #include "stor-layout.h"
@@ -50,13 +51,10 @@ static tree handle_always_inline_attribute (tree *, tree, tree, int,
 static tree handle_cold_attribute (tree *, tree, tree, int, bool *);
 static tree handle_const_attribute (tree *, tree, tree, int, bool *);
 static tree handle_fnspec_attribute (tree *, tree, tree, int, bool *);
-static tree handle_format_arg_attribute (tree *, tree, tree, int, bool *);
-static tree handle_format_attribute (tree *, tree, tree, int, bool *);
 static tree handle_leaf_attribute (tree *, tree, tree, int, bool *);
 static tree handle_malloc_attribute (tree *, tree, tree, int, bool *);
 static tree handle_noinline_attribute (tree *, tree, tree, int, bool *);
 static tree handle_nonnull_attribute (tree *, tree, tree, int, bool *);
-static tree handle_noreturn_attribute (tree *, tree, tree, int, bool *);
 static tree handle_nothrow_attribute (tree *, tree, tree, int, bool *);
 static tree handle_novops_attribute (tree *, tree, tree, int, bool *);
 static tree handle_patchable_function_entry_attribute (tree *, tree, tree,
@@ -78,19 +76,6 @@ static tree ignore_attribute (tree *, tree, tree, int, bool *);
 #define ATTR_EXCL(name, function, type, variable)	\
   { name, function, type, variable }
 
-/* Define attributes that are mutually exclusive with one another.  */
-static const struct attribute_spec::exclusions attr_noreturn_exclusions[] =
-{
-  ATTR_EXCL ("alloc_align", true, true, true),
-  ATTR_EXCL ("alloc_size", true, true, true),
-  ATTR_EXCL ("const", true, true, true),
-  ATTR_EXCL ("malloc", true, true, true),
-  ATTR_EXCL ("pure", true, true, true),
-  ATTR_EXCL ("returns_twice", true, true, true),
-  ATTR_EXCL ("warn_unused_result", true, true, true),
-  ATTR_EXCL (NULL, false, false, false),
-};
-
 static const struct attribute_spec::exclusions attr_returns_twice_exclusions[] =
 {
   ATTR_EXCL ("noreturn", true, true, true),
@@ -158,6 +143,8 @@ static const attribute_spec jit_gnu_attribute

GSoC[Fortran Runtime argument check ] Draft of Proposal and some doubts about the needs

2025-04-05 Thread Gwen Fu via Gcc
My questions:
1. Does the compilation option only need to support Fortran versions above
9, or does it also need to support Fortran 77?
2. Regarding parameter checking, *my idea is to handle the case where the
user creates an array of a specified size and passes it into a function as
a parameter*, but the array size required by the function does not match
the size specified by the user. However, this idea seems to be impossible
to implement in Fortran 77. *In addition to implicit and explicit size
arrays, are there other data structures that require parameter type
checking?*
3. I found out that "-fcheck=*" is an option for runtime checking, but the
relevant options are commented out.
  OPT_fcheck_ = 1070,/* -fcheck= */
  /* OPT_fcheck_assert = 1071, *//* -fcheck=assert */
  /* OPT_fcheck_bounds = 1072, *//* -fcheck=bounds */
  /* OPT_fcheck_in = 1073, *//* -fcheck=in */
  /* OPT_fcheck_invariant = 1074, */ /* -fcheck=invariant */
  /* OPT_fcheck_out = 1075, */   /* -fcheck=out */
  /* OPT_fcheck_switch = 1076, *//* -fcheck=switch */
  OPT_fcheckaction_ = 1077,  /* -fcheckaction= */
  OPT_fchecking = 1078,  /* -fchecking */

And I tried:
$ gfortran -o fibonacci fabonaqi.f90 -fcheck=in
f951: Warning: command-line option ‘-fpreconditions’ is valid for D but not
for Fortran
$ gfortran --help=check
cc1: warning: unrecognized argument to ‘--help=’ option: ‘check’
So is this related to the project?

Here is my proposal draft (important part)
Parameter Mismatch Implicit Declaration Features of Fortran 77

I think this is the reason why parameter mismatch may occur

In *Fortran 77*, *Implicit Typing* is a mechanism for automatic type
inference: *the compiler determines a variable's data type from the first
letter of its name* when there is no explicit declaration.

Variable first character   Default data type
A-H, O-Z                   REAL (single-precision floating point number)
I-N                        INTEGER (integer)
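
The default implicit typing rule in the table can be sketched as a small 
C function. The enum and function names are hypothetical and only model 
the rule; they are not taken from gfortran.

```c
#include <ctype.h>

enum f77_type { F77_INTEGER, F77_REAL };

/* Hypothetical helper: returns the default implicit type for a variable
   name, based only on its first letter (I-N: INTEGER, otherwise REAL).  */
static enum f77_type f77_implicit_type(const char *name)
{
    int first = toupper((unsigned char)name[0]);

    return (first >= 'I' && first <= 'N') ? F77_INTEGER : F77_REAL;
}
```

For example, a misspelled name such as NTOTAL in place of TOTAL silently 
becomes a new INTEGER variable instead of producing a diagnostic.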

*This leads to:*

   - *Error-prone*: If you make a typo (e.g., TOTAL is mistakenly written as
     TOAL), the compiler will not report an error, but the variable type may
     change.
   - *Poor code readability*: Implicit rules are not intuitive, increasing
     maintenance difficulty.


Parameter mismatch of implicit-size or explicit-size array

*1. Assumed-size Array*

*Syntax example:*

SUBROUTINE SUB(A)
  DIMENSION A(*)  ! Assumed-size array
  INTEGER A
  ! ...
END

*Potential issues:*

   - *No bounds checking*: The compiler does not know the actual size of the
     array passed in, and relies entirely on the programmer to ensure correct
     access.
   - *Out-of-bounds risk*: If A(i) is accessed inside the function but i
     exceeds the actual array range, undefined behavior (such as memory
     corruption or a program crash) will occur.
   - *No shape checking*: Even if the dimensions of the array passed in do
     not match (such as a dummy argument declared as A(10) when the actual
     argument is A(5)), the compiler will not report an error.

*2. Explicit-size Array*

*Syntax example:*

SUBROUTINE SUB(A, N)
  DIMENSION A(N)  ! Explicit-size Array
  INTEGER A, N
  ! ...
END

*Potential Issues:*

   - *Manual Size Passing*: The array size N must be passed additionally,
     which is prone to errors (e.g. passing the wrong N).
   - *Mismatch Risk*: If N is larger than the actual array size, the function
     may access illegal memory.
   - *No Dimension Check*: The compiler will not warn even if the array
     dimensions passed in are different (e.g. A(10) vs A(5,2)).
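
The kind of runtime check this proposal is aiming at can be sketched in C. 
This is only a model under stated assumptions: the caller is assumed to 
know the element count of the actual argument, and the names actual_array 
and arg_size_ok are hypothetical, not part of gfortran or libgfortran.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record of what the caller actually passed. */
struct actual_array {
    size_t elems;  /* number of elements in the actual argument */
};

/* Hypothetical check: the dummy argument declared as A(N) needs at least
   dummy_elems elements of storage behind it.  */
static bool arg_size_ok(const struct actual_array *actual, size_t dummy_elems)
{
    return dummy_elems <= actual->elems;
}
```

A call that passes a 5-element array to a subroutine expecting A(10) would 
fail this check, while passing a larger array than required would pass.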

Technology stack mastered

   - Familiar with C++/C language
   - Familiar with Linux environment programming, proficient in three IO
     multiplexing technologies: epoll, select, and poll, and their underlying
     principles
   - Proficient in debugging tools such as gdb and valgrind
   - Understand Fortran language


Projects participated in

   - Personal project: Lightweight HTTP server based on Reactor mode

     This project is developed in C++, including a log output module, a
     connection management module, a business module, and an I/O and
     connection request processing module.

   - OSPP-2024: Kytunig-Client data output function module development

     This project is developed in Python, and its core functions include
     deserialization of JSON data and conversion of JSON data into Excel
     files.


Re: [Draft] GSoC 2025 Proposal: Implementing Clang's -ftime-trace Feature in GCC

2025-04-05 Thread Andi Kleen
On Mon, Mar 31, 2025 at 11:14:47PM +0300, Eldar Kusdavletov wrote:
> I wanted to follow up on my previous email regarding my interest in
> participating in Google Summer of Code with GCC. I saw the discussion in the
> thread, but it seems there was no final confirmation.
> 
> Could you please let me know if everything is in order and whether I can
> proceed with submitting my application? I would really appreciate any guidance
> on the next steps.

It might be useful if you write a more concrete proposal
that stands by itself, and doesn't just say "it's like clang".

I haven't found a clear spec of what clang does exactly.

I assume it's like timevars, but somehow higher level,
and it does most of its accounting per function and outputs the result as a
JSON file that can be read by Chrome.

But overall it seems like a reasonable proposal to implement
on top of the existing timevar infrastructure in the pass manager,
as long as you can define what "the higher level" means exactly 
and how it relates to the existing timevars.
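
As a rough illustration of the kind of output meant here, the sketch below 
formats one "complete" event in the Chrome Trace Event JSON format, with 
timestamps in microseconds. The function name and the fixed pid/tid values 
are assumptions made for the example; nothing here is taken from clang or 
from GCC's timevar code.

```c
#include <stdio.h>

/* Hypothetical helper: writes one trace event for a timed region into BUF.
   NAME is emitted verbatim, so it must not need JSON escaping.  */
static int format_trace_event(char *buf, size_t len, const char *name,
                              long long ts_us, long long dur_us)
{
    return snprintf(buf, len,
                    "{\"name\":\"%s\",\"ph\":\"X\",\"ts\":%lld,\"dur\":%lld,"
                    "\"pid\":1,\"tid\":1}",
                    name, ts_us, dur_us);
}
```

A proposal could state which existing timevars map to such events and what 
additional per-function events would be added on top.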

-Andi


gcc-13-20250321 is now available

2025-04-05 Thread GCC Administrator via Gcc
Snapshot gcc-13-20250321 is now available on
  https://gcc.gnu.org/pub/gcc/snapshots/13-20250321/
and on various mirrors, see https://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 13 git branch
with the following options: git://gcc.gnu.org/git/gcc.git branch 
releases/gcc-13 revision 41db4716a5603052df626a1ab911b0b3fab322b2

You'll find:

 gcc-13-20250321.tar.xz   Complete GCC

  SHA256=27d6cfba0ddd7be8676dd9bedb64cb1166cc8d59332896301a44a993b017535f
  SHA1=b4231dc5d53f646855e341fc7a636573b966bba3

Diffs from 13-20250314 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-13
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


Re: Using nonzero_bits() in insn conditions?

2025-04-05 Thread Jeff Law via Gcc




On 3/19/25 4:14 AM, Georg-Johann Lay wrote:

Am 16.03.25 um 14:51 schrieb Jeff Law via Gcc:

On 3/13/25 5:39 AM, Georg-Johann Lay via Gcc wrote:

There are situations where knowledge about which bits
of a value are (not) set can be used for optimization.
For example in an insn combine pattern like:

(define_insn_and_split ""
  [(set (match_operand:QI 0 "register_operand" "=d")
        (ior:QI (ashift:QI (match_operand:QI 1 "register_operand" "r")
                           (match_operand:QI 2 "const_0_to_7_operand" "n"))
                (match_operand:QI 3 "register_operand" "0")))]
  "optimize
   && !reload_completed
   && nonzero_bits (operands[1], VOIDmode) == 1"
...

This pattern is only correct when operands[1] is 0 or 1.

While such patterns seem to work, it's all quite wonky,
in particular since nonzero_bits() may forget about known
properties in later passes.
While it works most of the time, it's fundamentally wrong to have a 
pattern where the conditional is dependent on state that changes based 
on pass specific data, nearby context, etc.




For the use case I have in mind, it is in order when the
pattern works until split1 which would transform it into
something else (and without nonzero_bits() in the insn
condition, asserting that the existence of the pattern
certifies the bit condition).
It's still the wrong thing to do.  You'll get away with it for a 
while, but one day it'll break.


We have similar problems in the RISC-V world where we would like to be 
able to match certain patterns based on known ranges of an operand.  
The most common case would be bset/bclr/binv on an SImode object on 
rv64 where the bit twiddled is variable.  In particular we need to 
know the bit position is not bit 31.


There's no way to really describe that in an insn's condition because 
range information like that isn't available in RTL and something like 
nonzero bits is pass specific.


As a result we're limited in our ability to use the bset/bclr/binv 
instructions.


Jeff


One way to support this is a new target hook that would run somewhere
in recog_for_combine().  The hook would allow the backend to replace
the pattern as synthesized by combine with an equivalent pattern.

Much simpler: Add a split pass immediately after combine.  Use 
define_insn_and_split to handle rewriting.  No hooks needed.


Jeff



Re: COBOL: Call to builtin_decl_explicit (BUILT_IN_EXIT), is optimized away.

2025-04-05 Thread Richard Biener via Gcc
On Fri, Apr 4, 2025 at 12:17 AM Robert Dubner  wrote:
>
> The COBOL compiler has this routine:
>
> void
> gg_exit(tree exit_code)
>   {
>   tree the_call =
>       build_call_expr_loc(location_from_lineno(),
>                           builtin_decl_explicit (BUILT_IN_EXIT),
>                           1,
>                           exit_code);
>   gg_append_statement(the_call);
>   }
>
> I have found that when GCOBOL is used with -O2, -O3, or -Os, the call to
> gg_exit() is optimized away, and the intended exit value is lost, and I
> end up with zero.
>
> By changing the routine to
>
> void
> gg_exit(tree exit_code)
>   {
>   tree args[1] = {exit_code};
>   tree function = gg_get_function_address(INT, "exit");
>   tree the_call = build_call_array_loc (location_from_lineno(),
>                                         VOID,
>                                         function,
>                                         1,
>                                         args);
>   gg_append_statement(the_call);
>   }
>
> the call is not optimized away, and the generated executable behaves as
> expected.
>
> How do I prevent the call to gg_exit() from being optimized away?

I don't see anything wrong here, so the issue must be elsewhere.
Do you have a COBOL testcase that shows the exit() being optimized?

>
> Thanks!
>


[no subject]

2025-04-05 Thread Ndlelanhle Makhathini (223038204) via Gcc





GSoC Draft Proposal Submission: Fortran 2018/202x

2025-04-05 Thread Yuao Ma via Gcc
Hi GCC developers,

I'm sharing the draft proposal for my GSoC project titled "Fortran 2018/202x".
It has already been posted on the Fortran mailing list, where I received
valuable feedback from gfortran developers.

As mentioned on the GCC GSoC page, proposals should also be shared on the GCC
mailing list - so I’m submitting it here as well. I’d greatly appreciate any
feedback or suggestions to help refine the proposal further before the
submission deadline.

URL: https://drive.google.com/file/d/1cwj5gUxywgaqkcJWJM4Ps4IkbmAeWSAT

Thank you for your time and consideration!

Best regards,
Yuao


Re: Does gcc have different inlining heuristics on different platforms?

2025-04-05 Thread Richard Biener via Gcc
On Mon, Mar 31, 2025 at 3:14 PM Julian Waters  wrote:
>
> Thanks for the quick reply, I'll ask the people responsible for
> working on the Linux parts to try to compile and link the codebase with
> -fno-use-linker-plugin to see what happens. It's a bit disheartening
> to hear that LTO support on Windows is behind Linux though. I'd help
> get that up to speed if I could, but I don't even know where to start
> or look :(

You can check what -fuse-linker-plugin says and what gcc/auto-host.h
contains for HAVE_LTO_PLUGIN.  I don't know whether the BFD linker (or mold)
supports linker plugins on Windows.  I do know that libiberty simple-object
does not support PE, that is, at _least_ (DWARF) debuginfo will be subpar.

Richard.

>
> best regards,
> Julian
>
> On Mon, Mar 31, 2025 at 8:09 PM Richard Biener wrote:
> >
> > On Mon, Mar 31, 2025 at 1:20 PM Julian Waters via Gcc wrote:
> > >
> > > Hi all,
> > >
> > > I've been trying to chase down an issue that's been driving me insane
> > > for a while now. It has to do with the flatten attribute being
> > > combined with LTO. I've heard that flatten and LTO are a match made in
> > > hell (Someone else's words, not mine), but from what I observe,
> > > several methods marked as flatten on Linux compile to an acceptable
> > > size with an OK amount of inlining, but on Windows however... The exact
> > > same methods marked as flatten have their callees inlined so
> > > aggressively that they reach sizes of 5MB per method! Something seems
> > > to be different between how inlining works on the 2 platforms, what
> > > are the differences (If any) between Linux and Windows when it comes
> > > to inlining, particularly involving the flatten attribute? Is there a
> > > list of differences that is easily accessible somewhere, or
> > > alternatively is there somewhere in the gcc source where the
> > > heuristics are defined that I can decipher?
> > >
> > > Here's one such example of the differences between Linux and Windows
> > > (Both were compiled with the same optimization settings, -O3 and
> > > -flto=auto):
> > >
> > > Linux:
> > > 010b12d0 6289 t
> > > G1ParScanThreadState::trim_queue_to_threshold(unsigned int)
> > >
> > > Windows:
> > > 000296f9b0c0 00642d40 T
> > > G1ParScanThreadState::trim_queue_to_threshold(unsigned int) [clone
> > > .constprop.0]
> > > 000295125480 00630080 T
> > > G1ParScanThreadState::trim_queue_to_threshold(unsigned int)
> > >
> > >
> > > Thanks in advance for the help, and for humouring my question
> >
> > The main difference is that LTO on Linux can use the linker plugin
> > to derive information about how TUs are combined while on Windows
> > we're using the "collect2 path" which is quite unmaintained and which
> > gives imprecise information.  This can already result in quite different
> > inlining.  You can "simulate" that on Linux with -fno-use-linker-plugin
> > (only for experimenting, don't use this unless necessary).
> >
> > Richard.
> >
> > >
> > > best regards,
> > > Julian


gcc-14-20250405 is now available

2025-04-05 Thread GCC Administrator via Gcc
Snapshot gcc-14-20250405 is now available on
  https://gcc.gnu.org/pub/gcc/snapshots/14-20250405/
and on various mirrors, see https://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 14 git branch
with the following options: git://gcc.gnu.org/git/gcc.git branch 
releases/gcc-14 revision 4feaf39363bc3ff57bb37aa65e08308a70d7792f

You'll find:

 gcc-14-20250405.tar.xz   Complete GCC

  SHA256=9a84b0947d8fb18197eef3fce8e255e30a61f7f382cebb961b1705c1d99214a3
  SHA1=b6b938392de7e6bb33a38df5ada283fac5ef32b0

Diffs from 14-20250329 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-14
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.