GCC 9 branch is now closed
After the GCC 9.5 release the GCC 9 branch is now closed and the hooks should reject any further pushes to it. Thanks, Richard.
GCC 9.5 Released
The GNU Compiler Collection version 9.5 has been released.

GCC 9.5 is a bug-fix release from the GCC 9 branch containing important fixes for regressions and serious bugs in GCC 9.4, with more than 171 bugs fixed since the previous release.

This is also the last release from the GCC 9 branch. GCC continues to be maintained on the GCC 10, GCC 11 and GCC 12 branches and the development trunk.

This release is available from the FTP servers listed here:

  https://sourceware.org/pub/gcc/releases/gcc-9.5.0/
  https://gcc.gnu.org/mirrors.html

Please do not contact me directly regarding questions or comments about this release. Instead, use the resources available from http://gcc.gnu.org.

As always, a vast number of people contributed to this GCC release -- far too many to thank them individually!
specs question
Hi.

My ‘downstream’ have a situation in which they make use of a directory outside of the configured GCC installation, and symlink from there to libraries in the actual install tree, e.g.

  /foo/bar/lib:
    libgfortran.dylib -> /gcc/install/path/lib/libgfortran.dylib

Now I want to find a way for them to add an embedded runpath that references /foo/bar/lib.

I could add a configure option that does exactly this job, but then I’d have to backport that to every GCC version they are still supporting (not, perhaps, the end of the world, but much better avoided).

So I was looking at using --with-specs= to add a link-time spec for this:

  --with-specs='%{!nodefaultrpaths:%{!r:%:version-compare(>= 10.5 mmacosx_version_min= -Wl,-rpath,/foo/bar/lib)}}'

This works fine, except for PCH jobs, which it breaks: the presence of an option claimed by the linker causes a link job to be created even though one is not required (similar issues have been seen before).

There is this:

  %{,S:X}  substitutes X, if processing a file which will use spec S.

so I could then do:

  --with-specs='%{,???:%{!nodefaultrpaths:%{!r:%:version-compare(>= 10.5 mmacosx_version_min= -Wl,-rpath,/foo/bar/lib)}}}'

but, unfortunately, I cannot seem to figure out what ??? should be [I tried ‘l’ (link_spec) and ‘link_command’ (*link_command)].

JFTR, I also tried %{!.h: and %{!,c-header:

Any insight would be welcome. Usually I muddle through with specs, but this one has me stumped.

thanks
Iain
Re: Documentation format question
On 5/27/22 02:38, Richard Biener wrote:
> On Wed, May 25, 2022 at 10:36 PM Andrew MacLeod via Gcc wrote:
>> I am going to get to some documentation for ranger and its components later this cycle.
>>
>> I used to stick these sorts of things on the wiki page, but I find that gets out of date really quickly. I could add more comments to the top of each file, but that doesn't seem very practical for larger architectural descriptions, nor for APIs/use cases/best practices. I could use Google Docs and turn it into a PDF or some other format, but that isn't very flexible.
>>
>> Do we/anyone have any forward-looking plans for GCC documentation that I should consider using? It would be nice to be able to tie some of it into source files/classes in some way, but I am unsure of a decent direction. It has to be easy to use, or I won't use it :-) And I presume many others wouldn't either. I'm not too keen on manually marking up text either.
>
> The appropriate place for this is the internals manual, and thus the current format in use is texinfo in gcc/doc/

And there is no move to convert it to anything more modern? Is there at least a reasonable tool to generate texinfo from? Otherwise the higher-level stuff is likely to end up in a wiki page where I can just do it visually.

Andrew
Loop splitting based on constant prefix of an array
GCC is able to recognize when the prefix of an array holds constant/static data and to apply optimizations to that constant part of the array. However, it does not seem to leverage this information in all cases.

To understand how the compiler optimizes partially constant arrays, and especially how the loop splitting pass can influence constant-related optimizations such as constant folding, I am using the following example.

Consider an array whose prefix is compile-time constant data while the rest is runtime data. Should the compiler be able to optimize the calculation for the first part of the array?

You can see the code and its assembly here: https://godbolt.org/z/xjxbz431b

    #include <cstddef>  // assumed; header name missing from the archived post

    inline int sum(const int array[], size_t len) {
      int res = 0;
      for (size_t i = 0; i < len; i++) {
        res += array[i];
      }
      return res;
    }

    int main(int argc, char** argv) {
      int arr1[6] = {200, 2, 3, argc, argc+1, argc+2};
      return sum(arr1, 6);
    }

The sum function computes the sum of the array elements, where the first half of the array is compile-time constants and the second half is dynamic data.
When we compile this with the "x86-64 GCC 12.1" compiler and the "-O3 -std=c++2a" flags, we get the following assembly code:

    main:
            mov     rax, QWORD PTR .LC0[rip]
            mov     DWORD PTR [rsp-28], edi
            mov     DWORD PTR [rsp-32], 3
            movq    xmm1, QWORD PTR [rsp-32]
            mov     QWORD PTR [rsp-40], rax
            movq    xmm0, QWORD PTR [rsp-40]
            lea     eax, [rdi+1]
            add     edi, 2
            mov     DWORD PTR [rsp-24], eax
            paddd   xmm0, xmm1
            mov     DWORD PTR [rsp-20], edi
            movq    xmm1, QWORD PTR [rsp-24]
            paddd   xmm0, xmm1
            movd    eax, xmm0
            pshufd  xmm2, xmm0, 0xe5
            movd    edx, xmm2
            add     eax, edx
            ret
    .LC0:
            .long   200
            .long   2

However, if we add an "if" condition to the loop that calculates the sum, the condition seems to enable the loop splitting pass.

You can see the code and its assembly here: https://godbolt.org/z/ejecbjMKG

    #include <cstddef>  // assumed; header name missing from the archived post

    inline int sum(const int array[], size_t len) {
      int res = 0;
      for (size_t i = 0; i < len; i++) {
        if (i < 1)
          res += array[i];
        else
          res += array[i];
      }
      return res;
    }

    int main(int argc, char** argv) {
      int arr1[6] = {200, 2, 3, argc, argc+1, argc+2};
      return sum(arr1, 6);
    }

and we get the following assembly code:

    main:
            lea     eax, [rdi+208+rdi*2]
            ret

That is, the whole sum has been folded to 208 + 3*argc (200 + 2 + 3 + argc + (argc+1) + (argc+2)).

As you can see, the "if" and "else" branches perform the same calculation. Nevertheless, the condition seems to trigger the loop splitting pass, which enables further optimizations such as constant folding of the whole computation and results in much smaller and faster assembly code.

My question is: why is the compiler not able to take advantage of the constants in the prefix of the array in the first place? Adding an unnecessary "if" condition that repeats the same code in both branches does not seem to be the best way to hint the compiler towards this optimization, so is there another way to make the compiler aware of it? (I used the -fsplit-loops flag and it had no effect on this example.)
As a next step, consider an array that has some constant values in the prefix but whose length is not a compile-time constant, as in the following example.

Code link is here: https://godbolt.org/z/3qGqshzn9

    #include <cstddef>  // assumed; header name missing from the archived post

    inline int sum(const int array[], size_t len) {
      int res = 0;
      for (size_t i = 0; i < len; i++) {
        if (i < 1)
          res += array[i];
        else
          res += array[i];
      }
      return res;
    }

    int main(int argc, char** argv) {
      size_t len = argc+3;
      int arr3[len] = {600, 10, 1};
      for (unsigned int i = 3; i < len; i++)
        arr3[i] = argc+i;
      return sum(arr3, 2);
    }

In this case GCC is not able to apply constant folding to the first part of the array. In general, is there any way for GCC to understand this and apply constant folding here?
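For readers following the thread, one way to state the split in the source itself, instead of relying on a redundant "if", is to sum the constant prefix and the runtime suffix in two separate loops. The following is only a minimal sketch of that idea in C: the helper names sum_range and sum_split and the prefix_len parameter are hypothetical and do not come from the original post, and nothing here guarantees that GCC will constant-fold the prefix in any particular version or at any particular optimization level.

```c
#include <assert.h>
#include <stddef.h>

/* Sum the half-open range array[begin..end).
   Hypothetical helper, not part of the original post. */
static inline int sum_range(const int array[], size_t begin, size_t end)
{
    int res = 0;
    for (size_t i = begin; i < end; i++)
        res += array[i];
    return res;
}

/* Sum the whole array as two loops: one over the first prefix_len
   elements (the part the caller knows to be constant) and one over
   the remaining elements. The result equals a single loop over
   array[0..len). */
static inline int sum_split(const int array[], size_t len, size_t prefix_len)
{
    assert(prefix_len <= len);
    return sum_range(array, 0, prefix_len)
         + sum_range(array, prefix_len, len);
}
```

With the array from the first example, a call such as sum_split(arr1, 6, 3) would put the three literal elements in their own loop. Whether that loop is then folded is something to check in the generated assembly rather than assume.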
gcc-11-20220527 is now available
Snapshot gcc-11-20220527 is now available on
  https://gcc.gnu.org/pub/gcc/snapshots/11-20220527/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 11 git branch with the following options: git://gcc.gnu.org/git/gcc.git branch releases/gcc-11 revision 186fcf8b7a7c17a8a17466bc9149b3ca4ca9dd3e

You'll find:

  gcc-11-20220527.tar.xz               Complete GCC

  SHA256=dd2162db1a11ad3761391cf8f2d0d5157f0feeab14dd8c36fbb59d41b93b55f4
  SHA1=9ea959604b616fe99188cb3fd06b3da4a4217508

Diffs from 11-20220520 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-11 link is updated and a message is sent to the gcc list. Please do not use a snapshot before it has been announced that way.
Passing c blocks to pre processor
Hi,

I am trying to define macros that will accept code blocks. These will be used to create high-level structures (like iterations, etc.).

Ideally, I would like to do something like this (a simplified example; the actual code is much more application specific):

    Foreach(var, low, high, block)

which should translate to:

    for (int var = low; var < high; var++) block

The challenge is that the GCC preprocessor does not accept blocks. It assumes that a comma terminates an argument. For example:

    Foreach(foo, 1, 30, { int z=5, y=3, …})

will invoke the macro with at least 5 arguments: (1) foo, (2) 1, (3) 30, (4) { int z=5, (5) y=3, and so on.

Is there a way to tell CPP that an argument that starts with ‘{’ should extend until the matching ‘}’, similar to the way a ‘(’ in a macro argument extends to the matching ‘)’?

Thanks,
yair.
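The C preprocessor only treats parentheses as grouping when it splits macro arguments, so braces do not protect commas. Two common workarounds are sketched below, under the assumption that C99 variadic macros are available. The macro names FOREACH and FOREACH_BLOCK and the two demo functions are hypothetical and chosen for illustration only. The first form collects everything after the third argument into __VA_ARGS__, which re-joins the pieces with their commas. The second form leaves the block outside the macro call altogether.

```c
/* Variadic form: all arguments after `high` are pasted back verbatim,
   commas included, so a braced block survives argument splitting. */
#define FOREACH(var, low, high, ...) \
    for (int var = (low); var < (high); var++) __VA_ARGS__

/* Trailing-block form: the macro expands only to the loop header and
   the block follows as ordinary code. */
#define FOREACH_BLOCK(var, low, high) \
    for (int var = (low); var < (high); var++)

/* Hypothetical demo: the block contains a comma in its declaration. */
int demo_variadic(void)
{
    int total = 0;
    FOREACH(foo, 1, 30, { int z = 5, y = 3; total += foo + z + y; })
    return total;
}

/* Same loop written with the trailing-block form. */
int demo_trailing(void)
{
    int total = 0;
    FOREACH_BLOCK(foo, 1, 30) { int z = 5, y = 3; total += foo + z + y; }
    return total;
}
```

The trailing-block form tends to be friendlier to editors and debuggers because the block is never passed through the preprocessor as an argument. The variadic form keeps the call syntax from the question.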
[MRISC32] Not getting scaled index addressing in loops
Hello!

I maintain a fork of GCC which adds support for my custom CPU ISA, MRISC32 (the machine description can be found here: https://github.com/mrisc32/gcc-mrisc32/tree/mbitsnbites/mrisc32/gcc/config/mrisc32 ).

I recently discovered that scaled index addressing (i.e. MEM[base + index * scale]) does not work inside loops, but I have not been able to figure out why.

I believe that I have all the plumbing in the MD that's required (MAX_REGS_PER_ADDRESS, REGNO_OK_FOR_BASE_P, REGNO_OK_FOR_INDEX_P, etc), and I have verified that scaled index addressing is used in trivial cases like this:

    char carray[100];
    short sarray[100];
    int iarray[100];

    void single_element(int idx, int value) {
      carray[idx] = value;  // OK
      sarray[idx] = value;  // OK
      iarray[idx] = value;  // OK
    }

...which produces the expected machine code similar to this:

    stb r2, [r3, r1]    // OK
    sth r2, [r3, r1*2]  // OK
    stw r2, [r3, r1*4]  // OK

However, when the array assignment happens inside a loop, only the char version uses index addressing. The other sizes (short and int) are transformed into code where the addresses are stored in registers that are incremented by +2 and +4 respectively.

    void loop(void) {
      for (int idx = 0; idx < 100; ++idx) {
        carray[idx] = idx;  // OK
        sarray[idx] = idx;  // BAD
        iarray[idx] = idx;  // BAD
      }
    }

...which produces:

    .L4:
      sth r1, [r3]      // BAD
      stw r1, [r2]      // BAD
      stb r1, [r5, r1]  // OK
      add r1, r1, #1
      sne r4, r1, #100
      add r3, r3, #2    // (BAD)
      add r2, r2, #4    // (BAD)
      bs  r4, .L4

I would expect scaled index addressing to be used in loops too, just as is done for AArch64 for instance. I have dug around in the machine description, but I can't really figure out what's wrong.

For reference, here is the same code in Compiler Explorer, including the code generated for AArch64 for comparison: https://godbolt.org/z/drzfjsxf7

Passing -da (dump RTL all) to gcc, I can see that the decision not to use index addressing has already been made in *.253r.expand.
Does anyone have any hints about what could be wrong and where I should start looking? Regards, Marcus