GCC 9 branch is now closed

2022-05-27 Thread Richard Biener via Gcc


After the GCC 9.5 release the GCC 9 branch is now closed and the
hooks should reject any further pushes to it.

Thanks,
Richard.


GCC 9.5 Released

2022-05-27 Thread Richard Biener via Gcc
The GNU Compiler Collection version 9.5 has been released.

GCC 9.5 is a bug-fix release from the GCC 9 branch
containing important fixes for regressions and serious bugs in
GCC 9.4 with more than 171 bugs fixed since the previous release.

This is also the last release from the GCC 9 branch; GCC continues
to be maintained on the GCC 10, GCC 11 and GCC 12 branches and the
development trunk.

This release is available from the FTP servers listed here:

  https://sourceware.org/pub/gcc/releases/gcc-9.5.0/
  https://gcc.gnu.org/mirrors.html

Please do not contact me directly regarding questions or comments
about this release.  Instead, use the resources available from
http://gcc.gnu.org.

As always, a vast number of people contributed to this GCC release
-- far too many to thank them individually!


specs question

2022-05-27 Thread Iain Sandoe
Hi.

My ‘downstream’ have a situation in which they make use of a directory outside 
of the configured GCC installation - and symlink from there to libraries in the 
actual install tree.

e.g.

/foo/bar/lib:
  libgfortran.dylib -> /gcc/install/path/lib/libgfortran.dylib

Now I want to find a way for them to add an embedded runpath that references 
/foo/bar/lib.

I could add a configure option, that does exactly this job - but then I’d have 
to back port that to every GCC version they are still supporting (not, perhaps, 
the end of the world but much better avoided).

So I was looking at using --with-specs= to add a link-time spec for this:

--with-specs='%{!nodefaultrpaths:%{!r:%:version-compare(>= 10.5 mmacosx_version_min= -Wl,-rpath,/foo/bar/lib)}}'

Which works fine, except for PCH jobs, which it breaks because the presence of
an option claimed by the linker causes a link job to be created, even though
one is not required (similar issues have been seen before).

There is this:
 %{,S:X}  substitutes X, if processing a file which will use spec S.

so I could then do:

--with-specs='%{,???:%{!nodefaultrpaths:%{!r:%:version-compare(>= 10.5 mmacosx_version_min= -Wl,-rpath,/foo/bar/lib)}}}'

but, unfortunately, I cannot seem to figure out what ??? should be [I tried
‘l’ (link_spec) and ‘link_command’ (*link_command)]

…JFTR also tried
  %{!.h:
   %{!,c-header:

----
Any insight would be welcome; usually I muddle through with specs, but this one
has me stumped.

thanks
Iain



Re: Documentation format question

2022-05-27 Thread Andrew MacLeod via Gcc

On 5/27/22 02:38, Richard Biener wrote:
> On Wed, May 25, 2022 at 10:36 PM Andrew MacLeod via Gcc wrote:
>> I am going to get to some documentation for ranger and its components
>> later this cycle.
>>
>> I used to stick these sorts of things on the wiki page, but I find that
>> gets out of date really quickly.  I could add more comments to the top of
>> each file, but that doesn't seem very practical for larger architectural
>> descriptions, nor for APIs/use cases/best practices.  I could use
>> google docs and turn it into a PDF or some other format, but that isn't
>> very flexible.
>>
>> Do we/anyone have any forward looking plans for GCC documentation that I
>> should consider using?  It would be nice to be able to tie some of it
>> into source files/classes in some way, but I am unsure of a decent
>> direction.  It has to be easy to use, or I won't use it :-)  And I
>> presume many others wouldn't either.  I'm not too keen on manually
>> marking up text either.
>
> The appropriate place for this is the internals manual and thus the
> current format in use is texinfo in gcc/doc/

And there is no move to convert it to anything more modern?  Is there
at least a reasonable tool that can generate texinfo?
Otherwise the higher level stuff is likely to end up in a wiki page
where I can just do it visually.


Andrew




Loop splitting based on constant prefix of an array

2022-05-27 Thread Laleh Beni via Gcc
The GCC compiler is able to recognize when the prefix of an array holds
constant/static data and to apply optimizations to that constant part of
the array; however, it seems that it is not leveraging this information
in all cases.

To understand how the compiler optimizes partially constant arrays, and
especially how the loop splitting pass could influence constant-related
optimizations such as constant folding, I am using the following example:



Considering an array where the prefix of that array is compile-time
constant data, and the rest of the array is runtime data, should the
compiler be able to optimize the calculation for the first part of the
array?

Let's look at the below example:



You can see the code and its assembly here: https://godbolt.org/z/xjxbz431b



#include <cstddef>

inline int sum(const int array[], size_t len) {
  int res = 0;
  for (size_t i = 0; i < len; i++) {
    res += array[i];
  }
  return res;
}

int main(int argc, char** argv)
{
  int arr1[6] = {200,2,3, argc, argc+1, argc+2};
  return sum(arr1, 6);
}





In our sum function we are computing the sum of the array elements, where
the first half of the array is static compile-time constants and the second
half is dynamic data.

When we compile this with the "x86-64 GCC 12.1" compiler with the "-O3
-std=c++2a" flags, we get the following assembly code:



main:
        mov     rax, QWORD PTR .LC0[rip]
        mov     DWORD PTR [rsp-28], edi
        mov     DWORD PTR [rsp-32], 3
        movq    xmm1, QWORD PTR [rsp-32]
        mov     QWORD PTR [rsp-40], rax
        movq    xmm0, QWORD PTR [rsp-40]
        lea     eax, [rdi+1]
        add     edi, 2
        mov     DWORD PTR [rsp-24], eax
        paddd   xmm0, xmm1
        mov     DWORD PTR [rsp-20], edi
        movq    xmm1, QWORD PTR [rsp-24]
        paddd   xmm0, xmm1
        movd    eax, xmm0
        pshufd  xmm2, xmm0, 0xe5
        movd    edx, xmm2
        add     eax, edx
        ret
.LC0:
        .long   200
        .long   2





However, if we add an “if” condition in the loop for calculating the result
of the sum, the if condition seems to enable the loop splitting pass:



You can see the code and its assembly here:  https://godbolt.org/z/ejecbjMKG



#include <cstddef>

inline int sum(const int array[], size_t len) {
  int res = 0;
  for (size_t i = 0; i < len; i++) {
    if (i < 1)
      res += array[i];
    else
      res += array[i];
  }
  return res;
}

int main(int argc, char** argv)
{
  int arr1[6] = {200,2,3, argc, argc+1, argc+2};
  return sum(arr1, 6);
}





we get the following assembly code:



main:
        lea     eax, [rdi+208+rdi*2]
        ret





As you can see, the “if” condition performs the same calculation in both the
“if” and “else” branches when summing the array; however, it seems to
trigger the “loop splitting pass”, which results in further optimizations
such as constant folding of the whole computation and in much smaller and
faster assembly code.
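
As a sanity check on the folded result, the arithmetic behind
`lea eax, [rdi+208+rdi*2]` can be written out as a small C sketch. The names
sum_loop, sum_folded and sum_via_array are made up for illustration, and x
stands in for argc; this only checks the arithmetic and says nothing about
which pass performs the folding.

```c
#include <stddef.h>

/* Same loop as in the example, without the redundant "if". */
int sum_loop(const int array[], size_t len)
{
    int res = 0;
    for (size_t i = 0; i < len; i++)
        res += array[i];
    return res;
}

/* Closed form matching the lea instruction:
   200 + 2 + 3 + x + (x + 1) + (x + 2) = 3*x + 208 = x + 208 + x*2. */
int sum_folded(int x)
{
    return x + 208 + x * 2;
}

/* Builds the array the way main() does, with x in place of argc. */
int sum_via_array(int x)
{
    int arr1[6] = {200, 2, 3, x, x + 1, x + 2};
    return sum_loop(arr1, 6);
}
```

For any x the two routes agree, which is what the single lea instruction
relies on.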

My question is: why is the compiler not able to take advantage of the
constants in the prefix of the array in the first place?

Also, adding an unnecessary “if” condition that just repeats the same code
for "if" and "else" doesn’t seem to be the best way to hint the compiler to
take advantage of this optimization; so is there another way to make the
compiler aware of this? (I used the -fsplit-loops flag and it didn't have
any effect for this example.)
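
One experiment that may be worth trying instead of the redundant “if” is
GCC's loop-specific `#pragma GCC unroll`, sketched below in C. It is not
verified here that this leads to the same constant folding for this example,
so treat it as something to test on godbolt rather than as an answer; the
names sum_unrolled and sum_example are hypothetical, and x stands in for argc.

```c
#include <stddef.h>

/* Same summation, with a request to unroll the loop up to 8 times.
   The pragma must sit directly before the loop it applies to. */
static inline int sum_unrolled(const int array[], size_t len)
{
    int res = 0;
#pragma GCC unroll 8
    for (size_t i = 0; i < len; i++)
        res += array[i];
    return res;
}

/* Mirrors main() from the example, taking x instead of argc. */
int sum_example(int x)
{
    int arr1[6] = {200, 2, 3, x, x + 1, x + 2};
    return sum_unrolled(arr1, 6);
}
```

The pragma only changes how the loop is compiled, not what it computes, so
the result is the same as for the original sum function.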



As a next step, consider an array that has some constant values in the
prefix but whose length is not a compile-time constant, as in the following
example:

Code link is here: https://godbolt.org/z/3qGqshzn9



#include <cstddef>

inline int sum(const int array[], size_t len) {
  int res = 0;
  for (size_t i = 0; i < len; i++) {
    if (i < 1)
      res += array[i];
    else
      res += array[i];
  }
  return res;
}

int main(int argc, char** argv)
{
  size_t len = argc+3;
  int arr3[len] = {600,10,1};
  for (unsigned int i = 3; i < len; i++) arr3[i] = argc+i;
  return sum(arr3, 2);
}



In this case the GCC compiler is not able to apply constant folding on the
first part of the array!

In general, is there any way for the GCC compiler to understand this and
apply constant folding optimizations here?


gcc-11-20220527 is now available

2022-05-27 Thread GCC Administrator via Gcc
Snapshot gcc-11-20220527 is now available on
  https://gcc.gnu.org/pub/gcc/snapshots/11-20220527/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 11 git branch
with the following options: git://gcc.gnu.org/git/gcc.git branch 
releases/gcc-11 revision 186fcf8b7a7c17a8a17466bc9149b3ca4ca9dd3e

You'll find:

 gcc-11-20220527.tar.xz   Complete GCC

  SHA256=dd2162db1a11ad3761391cf8f2d0d5157f0feeab14dd8c36fbb59d41b93b55f4
  SHA1=9ea959604b616fe99188cb3fd06b3da4a4217508

Diffs from 11-20220520 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-11
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


Passing c blocks to pre processor

2022-05-27 Thread Yair Lenga via Gcc
Hi,

I am trying to define macros that will accept code blocks. These will be used
to create high-level structures (like iterations, etc.). Ideally, I would like
to do something like this (this is a simplified example; the actual code is
much more application specific):

Foreach(var, low, high, block)

Which should translate to:
for (int var=low ; var < high ; var ++) block

The challenge is that the gcc preprocessor does not accept blocks. It assumes
a comma terminates a block. For example:
Foreach(foo, 1, 30, { int z=5, y=3, …}) will invoke the macro with 5 arguments:
(1) foo, (2) 1, (3) 30, (4) { int z=5, (5) y=3, …

Is there a way to tell CPP that an argument that starts with ‘{‘ should extend
until the matching ‘}’? Similar to the way ‘(‘ in macro arguments will extend
till the matching ‘)’.
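
For what it is worth, a commonly used workaround is to make the block the
last, variadic argument, so that any commas inside it are collected by
__VA_ARGS__ and pasted back together. The sketch below assumes C99 variadic
macros are acceptable; FOREACH and sum_range are hypothetical names, not
anything CPP provides.

```c
/* Hypothetical FOREACH: everything after the third argument, commas
   included, is captured by "..." and re-emitted as the loop body. */
#define FOREACH(var, low, high, ...) \
    for (int var = (low); var < (high); ++var) __VA_ARGS__

/* Adds i + z + y for every i in [low, high). */
int sum_range(int low, int high)
{
    int total = 0;
    FOREACH(i, low, high, {
        int z = 5, y = 3;   /* the comma here no longer splits the block */
        total += i + z + y;
    })
    return total;
}
```

This does not make CPP understand braces; it only sidesteps the splitting, so
the block still has to come last in the argument list.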

Thanks, yair.



[MRISC32] Not getting scaled index addressing in loops

2022-05-27 Thread m

Hello!

I maintain a fork of GCC which adds support for my custom CPU ISA, 
MRISC32 (the machine description can be found here: 
https://github.com/mrisc32/gcc-mrisc32/tree/mbitsnbites/mrisc32/gcc/config/mrisc32 
).


I recently discovered that scaled index addressing (i.e. MEM[base + 
index * scale]) does not work inside loops, but I have not been able to 
figure out why.


I believe that I have all the plumbing in the MD that's required 
(MAX_REGS_PER_ADDRESS, REGNO_OK_FOR_BASE_P, REGNO_OK_FOR_INDEX_P, etc), 
and I have verified that scaled index addressing is used in trivial 
cases like this:


char carray[100];
short sarray[100];
int iarray[100];
void single_element(int idx, int value) {
  carray[idx] = value; // OK
  sarray[idx] = value; // OK
  iarray[idx] = value; // OK
}

...which produces the expected machine code similar to this:

  stb r2, [r3, r1] // OK
  sth r2, [r3, r1*2] // OK
  stw r2, [r3, r1*4] // OK

However, when the array assignment happens inside a loop, only the char 
version uses index addressing. The other sizes (short and int) will be 
transformed into code where the addresses are stored in registers that 
are incremented by +2 and +4 respectively.


void loop(void) {
  for (int idx = 0; idx < 100; ++idx) {
    carray[idx] = idx; // OK
    sarray[idx] = idx; // BAD
    iarray[idx] = idx; // BAD
  }
}

...which produces:

.L4:
  sth r1, [r3] // BAD
  stw r1, [r2] // BAD
  stb r1, [r5, r1] // OK
  add r1, r1, #1
  sne r4, r1, #100
  add r3, r3, #2 // (BAD)
  add r2, r2, #4 // (BAD)
  bs r4, .L4

I would expect scaled index addressing to be used in loops too, just as 
is done for AArch64 for instance. I have dug around in the machine 
description, but I can't really figure out what's wrong.
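
To make the difference concrete at the source level, the two forms can be
sketched in C as follows. This is only an illustration read off the assembly
above, not a claim about which pass does the rewriting; the function names
loop_indexed and loop_pointers are made up.

```c
short sarray[100];
int iarray[100];

/* Expected form: one index register, scaled by the element size
   in the addressing mode ([base, idx*2] and [base, idx*4]). */
void loop_indexed(void)
{
    for (int idx = 0; idx < 100; ++idx) {
        sarray[idx] = (short)idx;
        iarray[idx] = idx;
    }
}

/* Generated form: one pointer per array, each bumped by its element
   size (+2 and +4 bytes) on every iteration. */
void loop_pointers(void)
{
    short *sp = sarray;
    int *ip = iarray;
    for (int idx = 0; idx < 100; ++idx) {
        *sp++ = (short)idx;
        *ip++ = idx;
    }
}
```

Both functions store the same values; the second simply needs two extra
registers and two extra adds per iteration.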


For reference, here is the same code in Compiler Explorer, including the 
code generated for AArch64 for comparison: https://godbolt.org/z/drzfjsxf7


Passing -da (dump RTL all) to gcc, I can see that the decision to not 
use index addressing has been made already in *.253r.expand.


Does anyone have any hints about what could be wrong and where I should 
start looking?


Regards,

  Marcus