[Bug target/92729] [avr] Convert the backend to MODE_CC so it can be kept in future releases
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92729 --- Comment #12 from Georg-Johann Lay --- Simulator: avrtest core simulator hosted on SourceForge as part of WinAVR. Libc: avr-libc trunk hosted on nongnu.org. There are several patches not yet integrated: recent xtiny devices, fixes in libm to adjust to the recent double64 additions, and extensions for the build environment to handle the new avr-gcc configure options for double multilib layout. The patches have been pending for some time; you'll have to resolve conflicts. Binutils is vanilla from sourceware.org.
[Bug target/92729] [avr] Convert the backend to MODE_CC so it can be kept in future releases
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92729 --- Comment #13 from Georg-Johann Lay --- FYI, avrtest is here: https://sourceforge.net/p/winavr/code/HEAD/tree/trunk/avrtest/
[Bug target/92729] [avr] Convert the backend to MODE_CC so it can be kept in future releases
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92729 --- Comment #15 from Georg-Johann Lay --- I built the tools by hand so I knew what I had... Dunno about gcc/buildbot policies concerning avr. As avr is a tertiary target, that BE's quality is of no consideration when releasing the compiler. Again, I added/ran tests by myself when working on the BE. However, test coverage is low, and there are no performance tests. I know of no performance test suite that would work reasonably for AVR, or one that has been designed for AVR/avr-gcc. And be warned that the avr BE has many kludges, work-arounds and hacks. Some are historical, but most of them work around shortcomings and flaws in the middle-end (nobody will fix middle-end issues that hamper a tertiary target).
[Bug target/108287] AVR build: gcc/config/avr/t-avr tries to edit the source tree
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108287 --- Comment #4 from Georg-Johann Lay --- Well, updating or creating some auto-generated files is intentional. What's not supported according to the GCC documentation is configuring in the source tree: https://gcc.gnu.org/install/configure.html
> First, we **highly** recommend that GCC be built into a separate directory
> from the sources which does not reside within the source tree.
> This is how we generally build GCC; building where srcdir == objdir
> should still work, but doesn’t get extensive testing; building where
> objdir is a subdirectory of srcdir is unsupported.
The reason why it does not work for you might be:
1) Maybe you changed avr-mcus.def to support more devices. This change will trigger more changes, for example to auto-generated documentation (texi) bits. This means you are basically a maintainer, which in turn means you might have more jobs to do, or tools to use, than a simple user who just builds GCC from source.
2) When you get the sources from some repo like git, the checked-out sources might have timestamps that don't reflect their true state. This triggers make to re-build auto-generated files, even though no prerequisite was changed and the targets need not be rebuilt. To fix this, run ./contrib/gcc_update --touch from the top-level source dir. This script will touch some source files and fix their timestamps. You obviously need write permission for that.
[Bug target/108287] AVR build: gcc/config/avr/t-avr tries to edit the source tree
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108287 --- Comment #5 from Georg-Johann Lay --- ...ok, yes, building outside srcdir won't fix this one. But points 1) and 2) still apply.
[Bug target/106307] error when I do a test on a pointer on Arduino 1.8.19
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106307 Georg-Johann Lay changed: What|Removed |Added Last reconfirmed||2023-01-21 Ever confirmed|0 |1 Status|UNCONFIRMED |WAITING
[Bug target/99435] avr: incorrect I/O address ranges for some cores
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99435 Georg-Johann Lay changed: What|Removed |Added Status|UNCONFIRMED |WAITING Ever confirmed|0 |1 Last reconfirmed||2023-01-25 --- Comment #2 from Georg-Johann Lay --- Still waiting for a reply.
[Bug target/100962] Poor optimization of AVR code when using structs in __flash
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100962 Georg-Johann Lay changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |WORKSFORME --- Comment #5 from Georg-Johann Lay --- The code is optimized fine with -Os. With -Og, you can expect less optimized code. For the provided code and -Og, you can improve code quality by means of -mstrict-X (where I am not sure whether it would be appropriate to have -mstrict-X as the default).
[Bug target/97276] A whole if-block is ignored by avr-gcc 9.3.0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97276 Georg-Johann Lay changed: What|Removed |Added Target|atxmega32a4 |avr --- Comment #2 from Georg-Johann Lay --- Can you provide the pre-compiled source pwm.i? Just add -save-temps to the compile options.
[Bug target/97276] A whole if-block is ignored by avr-gcc 9.3.0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97276 --- Comment #4 from Georg-Johann Lay --- Created attachment 54518 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54518&action=edit pwn-i.c pre-compiled test case
Ok, I found it and attached a cleaned-up version. IIUC, the relevant options you are using to compile are: -O1 -mmcu=atxmega32a4 -g -ggdb -std=gnu99
With these options (and with -fverbose-asm to navigate the asm more easily) I could not reproduce the problem with avr-gcc v8.5. The respective part of .s reads (I dropped -g for legibility, but with -g it's the same):

; pwm-i.c:287: if (last_brightness < 181 && j >= 181)
    ldi r30,lo8(-76) ; , ; 320 [c=4 l=1] movqi_insn/1
    cp r30,r11 ; , last_brightness ; 193 [c=4 l=1] cmpqi3/1
    brsh .+2 ; ; 194 [c=16 l=2] branch
    rjmp .L16;
; pwm-i.c:287: if (last_brightness < 181 && j >= 181)
    cpi r22,lo8(-75) ; iftmp.3_5, ; 196 [c=4 l=1] cmpqi3/2
    brsh .+2 ; ; 197 [c=16 l=2] branch
    rjmp .L16;
; pwm-i.c:289: slot->top = 0xfe00;
    st X+,r8 ; tmp226 ; 200 [c=4 l=3] *movhi/3
    st X,r9 ; tmp226
    sbiw r26,1
; pwm-i.c:290: slot->mask = ~mask;
    movw r30,r24 ; tmp200, mask ; 321 [c=4 l=1] *movhi/0
    com r30 ; tmp200 ; 201 [c=8 l=2] one_cmplhi2
    com r31 ; tmp200
; pwm-i.c:290: slot->mask = ~mask;
    adiw r26,2 ; slot_172->mask ; 202 [c=4 l=4] *movhi/3
    st X+,r30; tmp200
    st X,r31 ; tmp200
    sbiw r26,2+1 ; slot_172->mask
; pwm-i.c:291: ++slot;
    adiw r26,4 ; slot,; 203 [c=4 l=1] addhi3_clobber/0
[Bug target/97276] A whole if-block is ignored by avr-gcc 9.3.0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97276 --- Comment #5 from Georg-Johann Lay --- ... also tried v9.2 via https://godbolt.org/z/9r3vMj1e3 and just like with v8.5, the respective block is around asm line 350.
[Bug rtl-optimization/90706] [10/11/12/13 Regression] Useless code generated for stack / register operations on AVR
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90706 Georg-Johann Lay changed: What|Removed |Added Known to work||8.5.0 --- Comment #19 from Georg-Johann Lay --- (In reply to CVS Commits from comment #18)
> https://gcc.gnu.org/g:2639f9d2313664e6b4ed2f8131fefa60aeeb0518
>
> commit r13-6424-g2639f9d2313664e6b4ed2f8131fefa60aeeb0518
> Author: Vladimir N. Makarov
> Date: Thu Mar 2 16:29:05 2023 -0500
>
> IRA: Use minimal cost for hard register movement
Thank you; the code looks clean now. (For my test case from comment #16 I needed -fno-split-wide-types, which is a different story.) Is there any chance your fix will be back-ported?
[Bug target/104988] Zero register (R1) clobbered by __udivmodsi4 for AVR
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104988 Georg-Johann Lay changed: What|Removed |Added Resolution|--- |INVALID Status|UNCONFIRMED |RESOLVED --- Comment #2 from Georg-Johann Lay --- As you already found out this PR is invalid, thus closing.
[Bug target/99184] [avr] wrong double to 16-Bit and 32-Bit integers in libgcc/libf7
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99184 --- Comment #1 from Georg-Johann Lay --- As a work-around, one can cast to an intermediate 64-bit integer:
// For [u]int64_t and uint32_t, do
#include <stdint.h>
double x = 2.9;
int x_int = (int) (int64_t) x;
uint32_t x_u32 = (uint32_t) (uint64_t) x;
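The work-around can be wrapped in small helpers. This is only a sketch: the names double_to_int_via_i64 and double_to_u32_via_u64 are hypothetical and not part of libgcc or libf7, and the value is assumed to fit the final type (out-of-range conversions remain undefined, as in plain C).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: go through int64_t first, so that the
   double -> 16-bit/32-bit conversion is never used directly.  */
static inline int double_to_int_via_i64 (double x)
{
    return (int) (int64_t) x;
}

/* Hypothetical helper: same idea for unsigned 32-bit results.  */
static inline uint32_t double_to_u32_via_u64 (double x)
{
    return (uint32_t) (uint64_t) x;
}

/* Small self-check: conversions truncate towards zero.  */
static inline void check_work_around (void)
{
    assert (double_to_int_via_i64 (2.9) == 2);
    assert (double_to_u32_via_u64 (2.9) == 2u);
}
```

A caller would simply write `int n = double_to_int_via_i64 (x);` where it previously wrote `int n = (int) x;`.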
[Bug target/107201] New: [avr] -nodevicelib not working for devices -mmcu=avr...
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107201 Bug ID: 107201 Summary: [avr] -nodevicelib not working for devices -mmcu=avr... Product: gcc Version: 13.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: gjl at gcc dot gnu.org Target Milestone: --- The -nodevicelib option can be used so that the executable is not linked against -l when a device is specified as -mmcu=. This is useful if such a library is not available. This is achieved by the following spec in the device-specs file specs-: *avrlibc_devicelib: %{!nodevicelib:-lavr64dd64} However, in a spec function, the driver in ./gcc/config/avr/driver-avr.c[c]::avr_devicespecs_file() removes that option because it thinks that -mmcu=avr* is a device *family* like avr25 or avrxmega2 etc.: #if defined (WITH_AVRLIBC) " %{mmcu=avr*:" X_NODEVLIB "} %{!mmcu=*:" X_NODEVLIB "}", #else where X_NODEVLIB resolves to "%
[Bug target/107201] [avr] -nodevicelib not working for devices -mmcu=avr...
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107201 --- Comment #1 from Georg-Johann Lay --- Created attachment 53691 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53691&action=edit pr107201.diff: Proposed patch. This proposed patch (effectively) sets macro X_NODEVLIB to "" in all of ./config/avr/driver-avr.cc. -nodevicelib is a known driver option from avr.opt, so there should be no need to explicitly remove it by hand by means of %
[Bug target/100962] Poor optimization of AVR code when using structs in __flash
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100962 --- Comment #4 from Georg-Johann Lay --- Did you try option -mstrict-X? Also, please try to make the problem report self-contained.
[Bug libstdc++/101867] avr libc build error for libstdc++
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101867 --- Comment #16 from Georg-Johann Lay --- --with-avrlibc is the default, so setting it has no effect. Cf. the install info.
[Bug target/103975] DWARF .debug_frame incorrect for ISRs on AVR; pushing SREG creates off-by-one error
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103975 --- Comment #5 from Georg-Johann Lay --- If someone is going to fix this, the following changes might also play a role: * v8+ may emit optimized ISR prologues / epilogues using PR81268: gcc will just emit pseudo-instruction __gcc_isr which will be resolved by gas. Debug info might be incorrect or missing; gas would have to add respective debug info. * v12+ PR92729 changed condition code from implicit cc0 to explicit REG_CC and introduced a new hard register "cc" with hard register number REG_CC = 36. The highest hard regno before that transition was 35.
[Bug target/99435] avr: incorrect I/O address ranges for some cores
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99435 --- Comment #1 from Georg-Johann Lay --- I am really confused. To the best of my knowledge, IN and OUT can address a range of 64 bytes. For example, the opcode of OUT is 1011 1AAr rrrr AAAA where "r" bits encode the register number (2^5 = 32 of them) and "A" bits encode absolute target addresses (2^6 = 64 of them). So there isn't even enough space in the instruction encoding to provide an address range as claimed by this PR. Similarly for, say, SBI with opcode encoding 1001 1010 AAAA Abbb where "A" bits encode the absolute target address (2^5 = 32 of them) and "b" bits encode the target bit number (2^3 = 8 of them). Are you sure you didn't just stumble upon a typo in the data sheet? All AVRs are using these encodings. The only difference is between Xmega and non-Xmega which use different, implicit SFR_OFFSETs (which don't affect the encoding or the number of addresses that can be encoded).
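To make the bit budget concrete, here is a minimal sketch of encoders for the two instructions. It assumes the 16-bit layouts 1011 1AAr rrrr AAAA for OUT and 1001 1010 AAAA Abbb for SBI, as given in the AVR instruction set manual; the function names encode_out and encode_sbi are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* OUT A,Rr: 1011 1AAr rrrr AAAA.  6 address bits, 5 register bits.  */
static inline uint16_t encode_out (unsigned addr, unsigned reg)
{
    assert (addr < 64 && reg < 32);
    return (uint16_t) (0xB800u
                       | ((addr & 0x30u) << 5)  /* A5..A4 -> bits 10..9 */
                       | ((reg & 0x1Fu) << 4)   /* r4..r0 -> bits 8..4  */
                       | (addr & 0x0Fu));       /* A3..A0 -> bits 3..0  */
}

/* SBI A,b: 1001 1010 AAAA Abbb.  5 address bits, 3 bit-number bits.  */
static inline uint16_t encode_sbi (unsigned addr, unsigned bit)
{
    assert (addr < 32 && bit < 8);
    return (uint16_t) (0x9A00u | (addr << 3) | bit);
}
```

With these layouts, `encode_out (0x3F, 0)` yields 0xBE0F (out 0x3f, r0) and `encode_sbi (5, 5)` yields 0x9A2D (sbi 0x05, 5). An I/O address of 64 or more for OUT, or of 32 or more for SBI, has no bits left to go into.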
[Bug rtl-optimization/90706] [10/11/12/13 Regression] Useless code generated for stack / register operations on AVR
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90706 --- Comment #13 from Georg-Johann Lay --- Created attachment 53812 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53812&action=edit Test case with 32-bit integer.
This problem is still present in current master (future v13) and also occurs with 32-bit integers.
> avr-gcc -S -Os mul.c -fdump-rtl-ira
With v8, mul.s has 15 instructions. With newer versions, mul.s has 26 additional instructions:
* 12 silly, useless stores into / loads from frame.
* 12 instructions to set up the frame.
* More instructions due to sub-optimal register alloc.
* Uses 6 bytes stack frame where v8 needs no frame at all.
In the IRA dump, there is:
Pass 0 for finding pseudo/allocno costs
a0 (r53,l0) best NO_REGS, allocno NO_REGS
a2 (r49,l0) best GENERAL_REGS, allocno GENERAL_REGS
a1 (r48,l0) best NO_REGS, allocno NO_REGS
...
Pass 1 for finding pseudo/allocno costs
r53: preferred NO_REGS, alternative NO_REGS, allocno NO_REGS
r49: preferred GENERAL_REGS, alternative NO_REGS, allocno GENERAL_REGS
r48: preferred NO_REGS, alternative NO_REGS, allocno NO_REGS
...
Spill a0(r53,l0)
Spill a1(r48,l0)
Allocno a2r49 of GENERAL_REGS(30) ...
So there are 2 register spills for no reason, and they lead to that code bloat.
[Bug web/107610] Broken 'onlinedocs' after "Porting the Docs to Sphinx"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107610 --- Comment #4 from Georg-Johann Lay --- Also affected are other bits of the web page that are auto-generated, like https://gcc.gnu.org/install/configure.html And with the new URLs, "deep" links like https://gcc.gnu.org/install/configuration.html#avr ceased to work, too, even though the old ./gcc/doc/install.texi generated (working) anchors like: @html @end html @item --with-avrlibc So the porting-to-sphinx dropped them, which is really sad.
[Bug target/107842] New: [avr] Set --param=min-pagesize=0 in the backend
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107842 Bug ID: 107842 Summary: [avr] Set --param=min-pagesize=0 in the backend Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: gjl at gcc dot gnu.org Target Milestone: ---
The AVR backend should set --param=min-pagesize=0 in v12+, or otherwise we will see warnings for each and every SFR access like:

typedef __UINT8_TYPE__ uint8_t;

#define SREG (*(volatile uint8_t*) (0x3F + __AVR_SFR_OFFSET__ ))

void bar (void)
{
    SREG = 0;
}

> avr-gcc -c foo-i.c -mmcu=atmega8 -Os -Wall
foo-i.c: In function 'bar':
foo-i.c:7:6: warning: array subscript 0 is outside array bounds of 'volatile uint8_t[0]' {aka 'volatile unsigned char[]'} [-Warray-bounds]
    7 |     SREG = 0;
      |     ~^~~~
[Bug target/107842] [avr] Set --param=min-pagesize=0 in the backend
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107842 Georg-Johann Lay changed: What|Removed |Added Resolution|--- |DUPLICATE Status|UNCONFIRMED |RESOLVED --- Comment #1 from Georg-Johann Lay --- Dupe, but I don't know whether only AVR is annoyed by this. *** This bug has been marked as a duplicate of bug 105523 ***
[Bug target/105523] Wrong warning array subscript [0] is outside array bounds
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105523 Georg-Johann Lay changed: What|Removed |Added CC||gjl at gcc dot gnu.org --- Comment #7 from Georg-Johann Lay --- *** Bug 107842 has been marked as a duplicate of this bug. ***
[Bug target/106307] error when I do a test on a pointer on Arduino 1.8.19
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106307 --- Comment #1 from Georg-Johann Lay --- We'd need at least a test case so we can reproduce the issue. Thanks.
[Bug libstdc++/104875] libstdc++-v3/src/c++11/codecvt.cc:312:24: warning: left shift count >= width of type
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104875 --- Comment #3 from Georg-Johann Lay --- Is this fixed now?
[Bug rtl-optimization/90706] [10/11/12/13 Regression] Useless code generated for stack / register operations on AVR
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90706 Georg-Johann Lay changed: What|Removed |Added CC||gjl at gcc dot gnu.org --- Comment #16 from Georg-Johann Lay --- Created attachment 54113 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54113&action=edit More elaborate C test case. This is a more complicated test case, compile with > avr-gcc -c pi-i.c -mmcu=atmega8 -Os -mcall-prologues -fno-tree-loop-optimize > -fno-move-loop-invariants && avr-size pi-i.o Code sizes are: 664 with avr-gcc v8.5 992 with avr-gcc v11.3 834 with avr-gcc master with the change from comment #13 So there is a clear improvement with patch #13, but size is still +25% compared to v8. What also has an effect is -fno-split-wide-types. The test case mostly operates on float; unfortunately I don't have a similar test-case for 32-bit integers at hand.
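The percentages follow directly from the three code sizes quoted above. A small sketch of the arithmetic, where the helper name size_increase_percent is made up for illustration and the result is rounded down:

```c
#include <assert.h>

/* Relative size increase of SIZE over BASE in percent, rounded down.  */
static inline unsigned size_increase_percent (unsigned base, unsigned size)
{
    assert (base > 0 && size >= base);
    return (size - base) * 100u / base;
}
```

With the v8.5 size of 664 as the base, 834 gives (834 - 664) * 100 / 664 = 25 and 992 gives (992 - 664) * 100 / 664 = 49, so the patched master is about +25% and v11.3 about +49% relative to v8.5.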
[Bug target/113824] New: AVR: ATA5795 in wrong multilib set
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113824 Bug ID: 113824 Summary: AVR: ATA5795 in wrong multilib set Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: gjl at gcc dot gnu.org Target Milestone: --- This device is currently filed in avr5, whereas according to https://github.com/avrdudes/avr-libc/issues/874#issuecomment-1933051758 it should be in avr4.
[Bug target/113824] AVR: ATA5795 in wrong multilib set
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113824 Georg-Johann Lay changed: What|Removed |Added Resolution|--- |FIXED Status|UNCONFIRMED |RESOLVED --- Comment #4 from Georg-Johann Lay --- Fixed in v12.4 and v13.3+
[Bug target/113824] AVR: ATA5795 in wrong multilib set
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113824 Georg-Johann Lay changed: What|Removed |Added Target Milestone|--- |13.3
[Bug rtl-optimization/101188] [11/12/13 Regression] [postreload] Uses content of a clobbered register
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101188 Georg-Johann Lay changed: What|Removed |Added Resolution|FIXED |--- Status|RESOLVED|REOPENED Summary|[postreload] Uses content |[11/12/13 Regression] |of a clobbered register |[postreload] Uses content ||of a clobbered register --- Comment #19 from Georg-Johann Lay --- Reopened for back-porting.
[Bug target/105523] Wrong warning array subscript [0] is outside array bounds
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105523 Georg-Johann Lay changed: What|Removed |Added Target Milestone|--- |13.3 --- Comment #37 from Georg-Johann Lay --- Back-ported to v13.3
[Bug other/113927] New: [avr-tiny] Sets up a stack-frame even for trivial code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113927 Bug ID: 113927 Summary: [avr-tiny] Sets up a stack-frame even for trivial code Product: gcc Version: 13.2.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: other Assignee: unassigned at gcc dot gnu.org Reporter: gjl at gcc dot gnu.org Target Milestone: ---
Code like

char func (char c)
{
    return c;
}

compiles as expected to

func:
/* prologue: function */
/* frame size = 0 */
/* stack size = 0 */
.L__stack_usage = 0
/* epilogue start */
    ret

with avr-gcc -S -Os -mmcu=attiny26 -da, but for attiny40 (Reduced Tiny with 16 GPRs only) the result is:

func:
    push r28
    push r29
    push __tmp_reg__
    in r28,__SP_L__
    in r29,__SP_H__
/* prologue: function */
/* frame size = 1 */
/* stack size = 3 */
.L__stack_usage = 3
/* epilogue start */
    pop __tmp_reg__
    pop r29
    pop r28
    ret

In .asmcons, i.e. just prior to register allocation, the code reads:

(insn 13 4 2 2 (set (reg:QI 46) (reg:QI 24 r24 [ c ])) "main.c":2:1 86 {movqi_insn_split} (expr_list:REG_DEAD (reg:QI 24 r24 [ c ]) (nil)))
(insn 2 13 3 2 (set (reg/v:QI 44 [ c ]) (reg:QI 46)) "main.c":2:1 86 {movqi_insn_split} (expr_list:REG_DEAD (reg:QI 46) (nil)))
(note 3 2 10 2 NOTE_INSN_FUNCTION_BEG)
(insn 10 3 11 2 (set (reg/i:QI 24 r24) (reg/v:QI 44 [ c ])) "main.c":4:1 86 {movqi_insn_split} (expr_list:REG_DEAD (reg/v:QI 44 [ c ]) (nil)))
(insn 11 10 0 2 (use (reg/i:QI 24 r24)) "main.c":4:1 -1 (nil))

so everything is fine and this PR is not a dup of PR110093. According to Vladimir Makarov, PR110093 is because DFA cannot handle subregs, but the RTL code above does not have subregs.

What does happen is that IRA computes very high register costs, for example in .ira:

Pass 0 for finding pseudo/allocno costs
a1 (r46,l0) best NO_REGS, allocno NO_REGS
a0 (r44,l0) best NO_REGS, allocno NO_REGS
a0(r44,l0) costs: POINTER_X_REGS:65535000 POINTER_Y_REGS:65535000 POINTER_Z_REGS:65535000 BASE_POINTER_REGS:65535000 POINTER_REGS:65535000 SIMPLE_LD_REGS:65535000 GENERAL_REGS:65535000 MEM:3000

whereas the .ira for attiny26 (ordinary core with 32 GPRs) reads:

Pass 0 for finding pseudo/allocno costs
a0 (r46,l0) best GENERAL_REGS, allocno GENERAL_REGS
a0(r46,l0) costs: POINTER_X_REGS:4000 POINTER_Y_REGS:4000 POINTER_Z_REGS:4000 BASE_POINTER_REGS:4000 POINTER_REGS:4000 ADDW_REGS:4000 SIMPLE_LD_REGS:4000 LD_REGS:4000 NO_LD_REGS:4000 GENERAL_REGS:4000 MEM:4000

../../source/gcc-master/configure --target=avr --disable-nls --with-dwarf2 --with-gnu-as --with-gnu-ld --disable-shared --enable-languages=c,c++
[Bug target/113927] [avr-tiny] Sets up a stack-frame even for trivial code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113927 Georg-Johann Lay changed: What|Removed |Added Resolution|--- |FIXED Keywords|missed-optimization | Target Milestone|--- |13.3 Component|other |target Status|UNCONFIRMED |RESOLVED --- Comment #3 from Georg-Johann Lay --- Fixed in v13.3+
[Bug target/113934] Switch avr to LRA
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113934 --- Comment #1 from Georg-Johann Lay --- What's the LRA way to do LEGITIMIZE_RELOAD_ADDRESS?
[Bug other/113974] New: Attribute common ignored
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113974 Bug ID: 113974 Summary: Attribute common ignored Product: gcc Version: 13.2.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: other Assignee: unassigned at gcc dot gnu.org Reporter: gjl at gcc dot gnu.org Target Milestone: ---

__attribute__((common,used)) static int cc;

When this code is compiled with -S -fdata-sections, then cc is not put into .lcomm (and is not .local .comm either):

    .section .bss.cc,"aw",@nobits
    .align 4
    .type cc, @object
    .size cc, 4
cc:
    .zero 4
    .ident "GCC: (GNU) 13.2.1 20231022"

With -fno-data-sections, though, it works as expected:

    .local cc
    .comm cc,4,4
[Bug middle-end/113974] Attribute common ignored
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113974 --- Comment #3 from Georg-Johann Lay --- Then the documentation should make clear that with -fno-data-sections the object goes in COMM, but with -fdata-sections it does not and the attribute "common" is ignored. Better still, the compiler would behave as documented irrespective of -f[no]-data-sections. This is an issue of the compiler, not of the assembler. Presumably clang just copied gcc behaviour back then?
[Bug target/97276] A whole if-block is ignored by avr-gcc 9.3.0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97276 Georg-Johann Lay changed: What|Removed |Added Last reconfirmed||2024-02-20 Ever confirmed|0 |1 Status|UNCONFIRMED |WAITING
[Bug target/114100] New: [avr] Inefficient indirect addressing on Reduced Tiny
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114100 Bug ID: 114100 Summary: [avr] Inefficient indirect addressing on Reduced Tiny Product: gcc Version: 13.2.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: gjl at gcc dot gnu.org Target Milestone: ---
The Reduced Tiny core does not support indirect addressing with offset, which basically means that every indirect memory access with a size of more than one byte is effectively POST_INC or PRE_DEC. The lack of that addressing mode is currently handled by pretending to support it, and then letting the insn printers add and subtract the offsets again as needed on the fly. For example, the following C code

int vars[10];

void inc_var2 (void)
{
    ++vars[2];
}

is compiled to:

    ldi r30,lo8(vars) ; 14 [c=4 l=2] *movhi/4
    ldi r31,hi8(vars)
    subi r30,lo8(-(4)); 15 [c=8 l=6] *movhi/2
    sbci r31,hi8(-(4))
    ld r20,Z+
    ld r21,Z
    subi r30,lo8((4+1))
    sbci r31,hi8((4+1))
    subi r20,-1 ; 16 [c=4 l=2] *addhi3_clobber/1
    sbci r21,-1
    subi r30,lo8(-(4+1)); 17 [c=4 l=4] *movhi/3
    sbci r31,hi8(-(4+1))
    st Z,r21
    st -Z,r20

where the code could be:

    ldi r30,lo8(vars+4); 28 [c=4 l=2] *movhi/4
    ldi r31,hi8(vars+4)
    ld r20,Z+ ; 17 [c=8 l=2] *movhi/2
    ld r21,Z+
    subi r20,-1; 19 [c=4 l=2] *addhi3_clobber/1
    sbci r21,-1
    st -Z,r21 ; 30 [c=4 l=2] *movhi/3
    st -Z,r20
[Bug target/114100] [avr] Inefficient indirect addressing on Reduced Tiny
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114100 Georg-Johann Lay changed: What|Removed |Added Target Milestone|--- |14.0 Priority|P3 |P4 Keywords||missed-optimization Target||avr
[Bug middle-end/114111] New: [avr] Expensive code instead of conditional branch.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114111 Bug ID: 114111 Summary: [avr] Expensive code instead of conditional branch. Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: gjl at gcc dot gnu.org Target Milestone: --- Created attachment 57541 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57541&action=edit addcc.c: C test case
Compile the code with avr-gcc -S -Os -dp:

int add_ge0 (int x, char c) { return x + (c >= 0); }
int add_eq0 (int x, char c) { return x + (c == 0); }
int add_le0 (int x, char c) { return x + (c <= 0); }
int add_ge1 (int x, char c) { return x + (c >= 1); }
int add_ltm3 (int x, char c) { return x + (c < -3); }
int add_bit6 (int x, char c) { return x + !!(c & (1 << 6)); }
int add_nbit6 (int x, char c) { return x + !(c & (1 << 6)); }

All these could be performed by a test and an addition to x inside an if-block. But what the compiler does is to extend the 8-bit value c to 16 bits, then complement it, then shift the MSB to the LSB:

add_ge0:
    mov __tmp_reg__,r22 ; 23 [c=12 l=3] *extendqihi2/0
    lsl r0
    sbc r23,r23
    com r22 ; 24 [c=8 l=2] *one_cmplhi2
    com r23
    bst r23,7; 31 [c=16 l=4] *lshrhi3_const/3
    clr r22
    clr r23
    bld r22,0
    add r24,r22 ; 26 [c=8 l=2] *addhi3/0
    adc r25,r23
    ret ; 29 [c=0 l=1] return

Even when it does a conditional to set the addend, it should rather have the addition in the if-block (and moving x to R18 adds even more bloat):

add_eq0:
    mov r18,r24 ; 44 [c=4 l=1] movqi_insn/0
    mov r19,r25 ; 45 [c=4 l=1] movqi_insn/0
    ldi r24,lo8(1) ; 46 [c=4 l=2] *movhi/4
    ldi r25,0
    cp r22, __zero_reg__ ; 47 [c=4 l=1] cmpqi3/0
    breq .L3 ; 48 [c=4 l=1] branch
    ldi r24,0; 43 [c=4 l=2] *movhi/1
    ldi r25,0
.L3:
    add r24,r18 ; 42 [c=8 l=2] *addhi3/0
    adc r25,r19
    ret ; 51 [c=0 l=1] return
...
    .ident "GCC: (GNU) 14.0.1 20240212 (experimental)"

With avr-gcc 3.4.6 from around 2006, the generated code is as follows:

add_ge0:
    sbrs r22,7 ; 38 *sbrx_branch[length = 2]
    adiw r24,1 ; 15 *addhi3/2 [length = 1]
.L2:
    ret ; 37 return [length = 1]
add_eq0:
    tst r22 ; 13 tstqi [length = 1]
    brne .L4 ; 14 branch [length = 1]
    adiw r24,1 ; 15 *addhi3/2 [length = 1]
.L4:
    ret ; 35 return [length = 1]

etc. So at some point in time GCC lost all that smartness. The cause appears to be around emit_store_flag and friends; as far as I can see it doesn't even try to work out costs.
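For reference, the "test plus addition in an if-block" formulation can be spelled out at source level. This is only a sketch of the equivalence: the name add_ge0_branch is made up, signed char is used so that the example does not depend on the signedness of plain char, and no claim is made about which code a given compiler version emits for either form.

```c
#include <assert.h>

/* Form used in the test case: add the value of the comparison.  */
static inline int add_ge0 (int x, signed char c)
{
    return x + (c >= 0);
}

/* Equivalent form with the addition inside the if-block.  */
static inline int add_ge0_branch (int x, signed char c)
{
    if (c >= 0)
        x += 1;
    return x;
}

/* Both forms agree for every possible value of c.  */
static inline void check_add_ge0 (void)
{
    for (int c = -128; c <= 127; c++)
        assert (add_ge0 (1000, (signed char) c)
                == add_ge0_branch (1000, (signed char) c));
}
```

The second form corresponds to the shape of the avr-gcc 3.4.6 output quoted above: skip or branch around a single increment of x.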
[Bug target/114132] New: [avr] Code sets up a frame pointer without need
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114132 Bug ID: 114132 Summary: [avr] Code sets up a frame pointer without need Product: gcc Version: 13.2.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: gjl at gcc dot gnu.org Target Milestone: ---
Compiling

void funcab_c (long x, char c) { }

with

$ avr-gcc -S -Os -mmcu=attiny40

sets up a frame pointer without need. Arguments x and c occupy all of the argument registers R25..R20, so that no arg registers are left. Then there is this implementation of TARGET_FRAME_POINTER_REQUIRED in avr.cc:

static bool
avr_frame_pointer_required_p (void)
{
  return (cfun->calls_alloca
          || cfun->calls_setjmp
          || cfun->has_nonlocal_label
          || crtl->args.info.nregs == 0
          || get_frame_size () > 0);
}

The problem is that crtl->args.info.nregs == 0 does not discriminate between the case where the arg pointer is needed and the case where it is not needed but all arg regs are used up, like in the example.
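The ambiguity can be illustrated with a toy model. Everything below is hypothetical: struct toy_args_info and its has_stack_args field are stand-ins invented for this sketch, not GCC's real data structures, and the second predicate is just one conceivable refinement, not necessarily what was committed.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the argument bookkeeping (hypothetical).  */
struct toy_args_info
{
    int nregs;            /* Number of argument registers still unused.  */
    bool has_stack_args;  /* Hypothetical: some argument lives on the stack.  */
};

/* Mirrors the "nregs == 0" condition quoted above.  */
static inline bool fp_required_by_nregs (const struct toy_args_info *info)
{
    return info->nregs == 0;
}

/* Hypothetical refinement: only stack-passed arguments count.  */
static inline bool fp_required_by_stack_args (const struct toy_args_info *info)
{
    return info->has_stack_args;
}

/* The nregs test cannot tell "exactly used up" from "overflowed".  */
static inline void check_toy_model (void)
{
    struct toy_args_info used_up = { 0, false };
    struct toy_args_info overflow = { 0, true };
    assert (fp_required_by_nregs (&used_up));
    assert (fp_required_by_nregs (&overflow));
    assert (!fp_required_by_stack_args (&used_up));
    assert (fp_required_by_stack_args (&overflow));
}
```

In this model, funcab_c corresponds to the used_up case: all argument registers are consumed but nothing is passed on the stack, so only the nregs-based test asks for a frame pointer.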
[Bug target/114132] [avr] Code sets up a frame pointer without need
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114132 Georg-Johann Lay changed: What|Removed |Added Target Milestone|--- |14.0 Priority|P3 |P4 Target||avr
[Bug target/114132] [avr] Code sets up a frame pointer without need
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114132 Georg-Johann Lay changed: What|Removed |Added Resolution|--- |FIXED Status|UNCONFIRMED |RESOLVED --- Comment #2 from Georg-Johann Lay --- Fixed in v14.
[Bug target/114100] [avr] Inefficient indirect addressing on Reduced Tiny
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114100 Georg-Johann Lay changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #3 from Georg-Johann Lay --- Improved in v14
[Bug other/114191] New: Flags "Warning" and "Target" don't mix well in target.opt files
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114191 Bug ID: 114191 Summary: Flags "Warning" and "Target" don't mix well in target.opt files Product: gcc Version: 13.2.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: other Assignee: unassigned at gcc dot gnu.org Reporter: gjl at gcc dot gnu.org Target Milestone: ---
In an .opt file, a backend can define target-specific diagnostic options, for example gcc/config/avr/avr.opt has:

Wmisspelled-isr
Warning C C++ Var(avr_warn_misspelled_isr) Init(1)
Warn if the ISR is misspelled, ...

This is a "Target" option, however, so it should be listed with --help=target, which it currently is not. But specifying the "Target" flag in avr.opt makes the option no longer recognized:

$ avr-gcc main.c -c -Wall -Wmisspelled-isr
cc1: error: unrecognized command-line option '-Wmisspelled-isr'

I can reproduce this for target avr, but it likely affects all other targets as well. I set the component to "other" because, as it appears, there is no bugzilla component for such internal problems.
[Bug rtl-optimization/114208] New: DSE deletes a store that is not dead
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114208 Bug ID: 114208 Summary: DSE deletes a store that is not dead Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: gjl at gcc dot gnu.org Target Milestone: --- Created attachment 57594 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57594&action=edit Reduced C test case
Compile with

$ avr-gcc -mmcu=attiny40 bug-dse.c -S -Os -dp -mfuse-add=3 -fdse

the following C test case:

struct S { char a, b; };

__attribute__((__noinline__,__noclone__))
void test (const struct S *s)
{
    if (s->a != 3 || s->b != 4)
        __builtin_abort();
}

int main (void)
{
    struct S s = { 3, 4 };
    test (&s);
    return 0;
}

Then with DSE off (-fno-dse), main has a store of 3 into s.a:

main:
    ...
    ldi r20,lo8(3) ; 22 [c=4 l=1] movqi_insn/1
    ld __tmp_reg__,Y+; 24 [c=4 l=1] *addhi3/3
    st Y+,r20; 48 [c=4 l=1] movqi_insn/2
    ldi r20,lo8(4) ; 27 [c=4 l=1] movqi_insn/1
    st Y,r20 ; 30 [c=4 l=1] movqi_insn/2
    ...

but with DSE on, pass .dse2 removes the first store (insn 48, and in the wake also insn 22) that sets s.a to 3:

main:
    ...
    ldi r20,lo8(4) ; 27 [c=4 l=1] movqi_insn/1
    subi r28,-2 ; 29 [c=4 l=2] *addhi3/3
    sbci r29,-1
    st Y,r20 ; 30 [c=4 l=1] movqi_insn/2
    ...

Configured with: ../../source/gcc-master/configure --target=avr --disable-nls --with-dwarf2 --with-gnu-as --with-gnu-ld --disable-shared --enable-languages=c,c++
Thread model: single
Supported LTO compression algorithms: zlib
gcc version 14.0.1 20240302 (experimental) (GCC)
[Bug rtl-optimization/114208] RTL DSE deletes a store that is not dead
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114208 --- Comment #2 from Georg-Johann Lay --- (In reply to Andrew Pinski from comment #1) > I wonder if this is related to r14-6674-g4759383245ac97 . Not unlikely. PR112525 tries to eliminate dead stores for arguments that are passed. It seems like that change misses some required conditions like frame-pointer / arg-pointer adjustments.
[Bug rtl-optimization/114208] RTL DSE deletes a store that is not dead
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114208 --- Comment #3 from Georg-Johann Lay --- (In reply to Andrew Pinski from comment #1) > I wonder if this is related to r14-6674-g4759383245ac97 . Seems unrelated: When I reverse-apply r14-6674 then the issue does not go away.
[Bug other/114191] Flags "Warning" and "Target" don't mix well in target.opt files
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114191 --- Comment #2 from Georg-Johann Lay --- (In reply to Richard Biener from comment #1) > Wmisspelled-isr > Target C C++ Var(avr_warn_misspelled_isr) Init(1) > Warn if the ISR is misspelled, ... > > should eventually work? With that, the warnings appear as they should, but the option is not recognized: $ avr-gcc signal.c -S -Wmisspelled-isr error: unrecognized command-line option '-Wmisspelled-isr' $ avr-gcc signal.c -S -Wno-misspelled-isr error: unrecognized command-line option '-Wno-misspelled-isr' $ avr-gcc signal.c -S -Werror=misspelled-isr error: '-Werror=misspelled-isr': '-Wmisspelled-isr' is not an option that controls warnings
[Bug other/114191] Flags "Warning" and "Target" don't mix well in target.opt files
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114191 --- Comment #3 from Georg-Johann Lay --- (In reply to Richard Biener from comment #1) > How did you specify 'Target'? Like: Wmisspelled-isr Target Warning C C++ Var(avr_warn_misspelled_isr) Init(1) Warn if the ISR is misspelled, ...
[Bug rtl-optimization/114208] RTL DSE deletes a store that is not dead
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114208 --- Comment #5 from Georg-Johann Lay --- (In reply to Richard Biener from comment #4) > Did it ever work? No. I allowed -mfuse-add=3 to reproduce this PR because there seems to be a problem with DSE, and for the case that someone is going to fix it before it bites an important target. The mfuse-add optimization tries to avoid the broken parts of DSE and works around them; only -mfuse-add=0...2 are documented. It was added in Feb 2024 as PR114100. > I suppose 'st Y+,r20 is' post-inc so maybe DSE mishandles this somehow. That post-inc is only generated after .dse2: .split2 splits some move insns: These cores don't have reg+offset addressing, so the backend must pretend to support it. Then .split2 generates pointer-adjust + mem-access + undo-pointer-adjust. The address adjustments are plain additions of the address register (frame pointer in this case) and have corresponding REG_CFA_ADJUST_CFA notes. Then .dse2 removes some non-dead stores. The 'st Y+,r20' you mentioned is only generated by .avr-fuse-add which runs after .dse2. I'd guess that GCC is not ready for targets with such tight addressing modes? (without reg+offset addressing; stack-pointer cannot be used either, the only SP accesses are PUSH and POP). ad "needs-bisection": -mfuse-add is a new target optimization added as PR114100 in Feb 2024, so bisecting won't work because -mfuse-add is not recognized prior to that date.
[Bug rtl-optimization/114243] New: -fsplit-wide-types bloats code by more than 50%
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114243 Bug ID: 114243 Summary: -fsplit-wide-types bloats code by more than 50% Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: gjl at gcc dot gnu.org Target Milestone: --- Created attachment 57616 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57616&action=edit pi-sigma.c: C99 test case Compile the attached test case with: $ avr-gcc pi-sigma.c -c -Os -mmcu=atmega8 -fstack-usage && avr-size pi-sigma.o The code sizes for the respective compiler versions are: avr-gcc-v8: 624 avr-gcc-v14: 1008 which is an increase of code size of more than 60%! The stack usage also increases by a lot. According to pi-sigma.su: avr-gcc-v8: --- pi-sigma.c:80:7:sigma 30 static pi-sigma.c:86:7:pi_n 14 static avr-gcc-v14: pi-sigma.c:80:7:sigma 86 static pi-sigma.c:86:7:pi_n 36 static That is, for the 1st function the stack usage almost triples! With -fno-split-wide-types the performance of v14 code is similar to v8. Target: avr Configured with: ../../source/gcc-master/configure --target=avr --disable-nls --with-dwarf2 --with-gnu-as --with-gnu-ld --disable-shared --enable-languages=c,c++ Thread model: single Supported LTO compression algorithms: zlib gcc version 14.0.1 20240303 (experimental) (GCC)
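The percentages quoted in the report can be re-derived from the raw numbers. A minimal sketch in host-side C; the helper name percent_increase is made up for illustration and is not part of the test case:

```c
/* Relative increase from old_val to new_val, in percent.
   Hypothetical helper, only used to re-check the figures from the
   report (code size 624 -> 1008, stack usage of sigma 30 -> 86).  */
static double percent_increase (double old_val, double new_val)
{
  return (new_val - old_val) / old_val * 100.0;
}
```

With the reported values, percent_increase (624, 1008) is about 61.5, and 86 / 30 is a factor of roughly 2.9, which is consistent with "more than 60%" and "almost triples".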
[Bug rtl-optimization/90706] [10/11/12/13 Regression] Useless code generated for stack / register operations on AVR
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90706 --- Comment #24 from Georg-Johann Lay --- (In reply to Georg-Johann Lay from comment #23) > As it appears, this bug is not fixed completely. For the -mmcu=avrtiny > architecture, there is still bloat for even the smallest test cases like: Different story, f'up to PR113927.
[Bug rtl-optimization/114243] [avr] -fsplit-wide-types bloats code by more than 50%
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114243 --- Comment #1 from Georg-Johann Lay --- May be related to PR110093. As Vladimir noted in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110093#c5 the problem is that data flow analysis cannot cope with the subregs generated from lower-subregs, and register alloc chokes at it.
[Bug target/81473] [avr] build fails due to INT8_MIN and friends.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81473 --- Comment #4 from Georg-Johann Lay --- This was fixed long ago.
[Bug tree-optimization/114252] New: Introducing bswapsi reduces code performance
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114252 Bug ID: 114252 Summary: Introducing bswapsi reduces code performance Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: gjl at gcc dot gnu.org Target Milestone: --- Created attachment 57628 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57628&action=edit GNU-C test case typedef __UINT8_TYPE__ uint8_t; typedef __UINT32_TYPE__ uint32_t; typedef uint8_t __attribute__((vector_size(4))) v4u8_t; uint32_t func1 (const uint8_t *buf) { v4u8_t v4 = { buf[1], buf[0], buf[3], buf[2] }; return (uint32_t) v4; } Compile the code with $ avr-gcc code.c -S -Os -dp with v13 the result is: func1: mov r30,r24 ; 37 [c=4 l=1] movqi_insn/0 mov r31,r25 ; 38 [c=4 l=1] movqi_insn/0 ldd r22,Z+1 ; 39 [c=4 l=1] movqi_insn/3 ld r23,Z ; 40 [c=4 l=1] movqi_insn/3 ldd r24,Z+3 ; 41 [c=4 l=1] movqi_insn/3 ldd r25,Z+2 ; 42 [c=4 l=1] movqi_insn/3 /* epilogue start */ ret ; 45 [c=0 l=1] return which is good code: insn 37, 38 move the address to pointer register Z, and then follow 4 loads, one for each byte. When compiled with v14 however: func1: mov r30,r24 ; 23 [c=4 l=2] *movhi/0 mov r31,r25 ld r22,Z ; 24 [c=16 l=4] *movsi/2 ldd r23,Z+1 ldd r24,Z+2 ldd r25,Z+3 rcall __bswapsi2 ; 25 [c=16 l=1] *bswapsi2.libgcc mov r31,r23 ; 32 [c=4 l=1] movqi_insn/0 mov r23,r25 ; 33 [c=4 l=1] movqi_insn/0 mov r25,r31 ; 34 [c=4 l=1] movqi_insn/0 mov r31,r22 ; 35 [c=4 l=1] movqi_insn/0 mov r22,r24 ; 36 [c=4 l=1] movqi_insn/0 mov r24,r31 ; 37 [c=4 l=1] movqi_insn/0 /* epilogue start */ ret ; 40 [c=0 l=1] return Target: avr Configured with: ../../source/gcc-master/configure --target=avr --disable-nls --with-dwarf2 --with-gnu-as --with-gnu-ld --disable-shared --enable-languages=c,c++ Thread model: single Supported LTO compression algorithms: zlib gcc version 14.0.1 20240303 (experimental) (GCC)
[Bug target/114252] Introducing bswapsi reduces code performance
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114252 --- Comment #3 from Georg-Johann Lay --- (In reply to Richard Biener from comment #1) > but somehow we end up doing a libcall? It's not a libcall in the GCC sense, for the compiler it's just an ordinary insn. The backend then prints this as a transparent call to libgcc. The purpose is that many functions have a small, known footprint as they are implemented in assembly. An ordinary call would clobber all callee-used regs, so using a transparent call gives better code than a real call. Notice this is the insn: (define_insn "*bswapsi2.libgcc" [(set (reg:SI 22) (bswap:SI (reg:SI 22))) (clobber (reg:CC REG_CC))] "reload_completed" "%~call __bswapsi2" [(set_attr "type" "xcall")]) However, for the purpose of this PR, no bswap is needed in the 1st place; just have a look at the v13 code. It just loads the bytes as they belong into the target value; while v14 loads all 32 bits in one chunk and then starts fiddling and moving around the constituent bytes.
[Bug target/114252] Introducing bswapsi reduces code performance
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114252 --- Comment #5 from Georg-Johann Lay --- (In reply to Richard Biener from comment #4) > So bswap on a value is just register shuffling, right? The point is that there is no need for bswap in the first place, just have a look at the code that v13 generates. It's 4 QI loads and that's it, no shuffling required at all. But v14 dropped that, and the bswapsi (presumably due to previous flawed tree optimizations) is introduced by some tree pass. There's nothing the backend can do about it. So would you explain why you think it's a "target" issue? Maybe the PR title I used is confusing and does not hit the point?
[Bug target/114252] Introducing bswapsi reduces code performance
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114252 --- Comment #8 from Georg-Johann Lay --- (In reply to Richard Biener from comment #7) > Note I do understand what you are saying, just the middle-end in detecting > and using __builtin_bswap32 does what it does everywhere else - it checks > whether the target implements the operation. > > The middle-end doesn't try to actually compare costs (it has no idea of the > bswapsi costs), But even when the bswapsi insn costs nothing, the v14 code has these additional 6 movqi insns 32...37 compared to v13 code. In order to have the same performance as v13 code, a bswapsi would have to cost negative 6 insns. And an optimizer that assumes negative costs is not reasonable, in particular because the recognition of bswap opportunities serves optimization -- or is supposed to serve it as far as I understand. > and it most definitely doesn't see how AVR is special in > having only QImode registers and thus the created SImode load (which the > target supports!) will end up as four registers. Even when the bswap insn would cost nothing the code is worse. > The only thing that maybe would make sense with AVR exposing bswapsi is > users calling __builtin_bswap but since it always expands as a libcall > even that makes no sense. It makes perfect sense when C/C++ code uses __builtin_bswap32: * With current bswapsi insn, the code does a call that performs SI:22 = bswap(SI:22) with NO additional register pressure. * Without bswap insn, the code does a real ABI call that performs SI:22 = bswap(SI:22) PLUS IT CLOBBERS r18, r19, r20, r21, r26, r27, r30 and r31; which are the most powerful GPRs. > So my preferred fix would be to remove bswapsi from avr.md? Is there a way that the backend can fold a call to an insn that performs better than a call? Like in TARGET_FOLD_BUILTIN? As far as I know, the backend can only fold target builtins, but not common builtins?
Tree fold cannot fold to an insn obviously, but it could fold to inline asm, no? Or can the target change an optabs entry so it expands to an insn that's more profitable than a respective call? (like avr.md's bswap insn with transparent call is more profitable than a real call). The avr backend does this for many other things, too: divmod, SI and PSI multiplications, parity, popcount, clz, ffs. > Does it benefit from recognizing bswap done with shifts on an int? I don't fully understand that question. You mean to write code that shifts bytes around like in uint32_t res = 0; res |= ((uint32_t) buf[0]) << 24; res |= ((uint32_t) buf[1]) << 16; res |= (uint32_t) buf[2] << 8; res |= buf[3]; return res; is better than a bswapsi call?
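For reference, the shift sequence from the question can be completed into a self-contained function. The following is a host-side sketch, not AVR-specific code; the names load_be32, load_le32 and load_be32_bswap are made up for illustration. It shows that assembling the bytes most-significant first with shifts gives the same value as a byte swap of the little-endian load, which is presumably the kind of pattern the bswap recognition is after:

```c
#include <stdint.h>

/* Assemble four bytes most-significant first, using shifts only. */
static uint32_t load_be32 (const uint8_t *buf)
{
  uint32_t res = 0;
  res |= ((uint32_t) buf[0]) << 24;
  res |= ((uint32_t) buf[1]) << 16;
  res |= ((uint32_t) buf[2]) << 8;
  res |= buf[3];
  return res;
}

/* Assemble the same bytes least-significant first (AVR's byte order). */
static uint32_t load_le32 (const uint8_t *buf)
{
  return (uint32_t) buf[0]
         | ((uint32_t) buf[1] << 8)
         | ((uint32_t) buf[2] << 16)
         | ((uint32_t) buf[3] << 24);
}

/* Same value as load_be32, expressed with GCC's __builtin_bswap32. */
static uint32_t load_be32_bswap (const uint8_t *buf)
{
  return __builtin_bswap32 (load_le32 (buf));
}
```

Whether the shift form or the builtin form yields better code on AVR is exactly what this PR is about; the sketch only establishes that both compute the same value.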
[Bug target/114252] Introducing bswapsi reduces code performance
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114252 --- Comment #9 from Georg-Johann Lay --- ...and I don't see why a register allocator would or should fix flaws from tree optimizers.
[Bug target/114252] Introducing bswapsi reduces code performance
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114252 --- Comment #12 from Georg-Johann Lay --- (In reply to Richard Biener from comment #10) > I think the target controls the "libcall" ABI that's used for calls to > libgcc, Do you have a pointer on how to do it, or an example? IIRC I looked into it quite a while ago, and it didn't allow to specify/adjust call_used_regs[] etc. > I think the target should implement an inline bswap, possibly via a > define_insn_and_split or define_split so the byte ops are only exposed > at a desired point; important points being lower_subreg (split-wide-types) > and register allocation - possibly lower_subreg should itself know > how to handle bswap (though the degenerate AVR case is quite special). That would result in SUBREGs all over the place. As Vladimir pointed out in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110093#c5 DFA doesn't handle subregs properly, and register alloc then uses extra reloads, bloating the code (not only in PR110093 but also in PR114243). Unlikely any pass will untangle the mess of four (set (subreg:QI (SI)) (subreg:QI (SI))). > Yeah. Or comparing to open-coding the bswap without going through the call.
> I don't have a AVR libgcc around, but libgcc2.s has > > #ifdef L_bswapsi2 > SItype > __bswapsi2 (SItype u) > { > return ((((u) & 0xff000000u) >> 24) > | (((u) & 0x00ff0000u) >> 8) > | (((u) & 0x0000ff00u) << 8) > | (((u) & 0x000000ffu) << 24)); > } > #endif The libgcc side is not a problem at all, libgcc/config/avr/lib1funcs.S has: ;; swap two registers with different register number .macro bswap a, b eor \a, \b eor \b, \a eor \a, \b .endm #if defined (L_bswapsi2) ;; swap bytes ;; r25:r22 = bswap32 (r25:r22) DEFUN __bswapsi2 bswap r22, r25 bswap r23, r24 ret ENDF __bswapsi2 #endif /* defined (L_bswapsi2) */ #if defined (L_bswapdi2) ;; swap bytes ;; r25:r18 = bswap64 (r25:r18) DEFUN __bswapdi2 bswap r18, r25 bswap r19, r24 bswap r20, r23 bswap r21, r22 ret ENDF __bswapdi2 #endif /* defined (L_bswapdi2) */ There's currently no handcrafted bswap16 though.
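To make the assembly above easier to follow, here is a small C model of what __bswapsi2 does to the four registers. This is only an illustration for a host machine: the array r[0..3] stands in for r22..r25, and the names xor_swap and model_bswapsi2 are made up, they do not exist in libgcc.

```c
#include <stdint.h>

/* Exchange two bytes with three XORs, like the bswap macro does with
   two registers.  As with the macro, a and b must be different
   objects; otherwise the value would end up as zero.  */
static void xor_swap (uint8_t *a, uint8_t *b)
{
  *a ^= *b;
  *b ^= *a;
  *a ^= *b;
}

/* Model of __bswapsi2: r[0] is r22 (low byte), r[3] is r25 (high byte). */
static void model_bswapsi2 (uint8_t r[4])
{
  xor_swap (&r[0], &r[3]);  /* bswap r22, r25 */
  xor_swap (&r[1], &r[2]);  /* bswap r23, r24 */
}
```

The XOR trick needs no scratch register, which fits the transparent-call scheme where only the operand registers are touched.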
[Bug target/114252] Introducing bswapsi reduces code performance
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114252 --- Comment #14 from Georg-Johann Lay --- The code in the example is not a perfect bswap, it needs additional shuffling of bytes. The tree passes must know that bswap is not a perfect fit. There must be *some* criterion that depends on the permutation, and when a bswap is closer to the bswapped-permutation than a non-bswapped permutation is to the original one.
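The "additional shuffling" can be spelled out for the func1 test case of this PR. The sketch below is host-side C with made-up function names; it assumes the little-endian byte order that AVR uses and avoids the vector extension so that it runs anywhere. The byte permutation { buf[1], buf[0], buf[3], buf[2] } equals a full 32-bit byte swap followed by a swap of the two 16-bit halves, which corresponds to the six extra movqi insns in the v14 output:

```c
#include <stdint.h>

/* Portable byte swap of a 32-bit value. */
static uint32_t bswap32 (uint32_t x)
{
  return (x >> 24) | ((x >> 8) & 0x0000ff00u)
         | ((x << 8) & 0x00ff0000u) | (x << 24);
}

/* Value of func1 on a little-endian target:
   byte 0 = buf[1], byte 1 = buf[0], byte 2 = buf[3], byte 3 = buf[2]. */
static uint32_t func1_direct (const uint8_t *buf)
{
  return (uint32_t) buf[1] | ((uint32_t) buf[0] << 8)
         | ((uint32_t) buf[3] << 16) | ((uint32_t) buf[2] << 24);
}

/* The v14 shape: one 32-bit little-endian load, a full byte swap,
   then a swap of the two 16-bit halves to repair the result.  */
static uint32_t func1_via_bswap (const uint8_t *buf)
{
  uint32_t x = (uint32_t) buf[0] | ((uint32_t) buf[1] << 8)
               | ((uint32_t) buf[2] << 16) | ((uint32_t) buf[3] << 24);
  x = bswap32 (x);
  return (x << 16) | (x >> 16);
}
```

Both functions return the same value for any input, but the direct form needs only four byte loads, whereas the bswap form pays for the swap plus the repair step.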
[Bug target/110220] [13/14 Regression] ICE in patch_jump_insn, at cfgrtl.cc:1295 - avr/xmega
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110220 Georg-Johann Lay changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED Component|rtl-optimization|target --- Comment #10 from Georg-Johann Lay --- Fixed in v13.3+.
[Bug target/105523] Wrong warning array subscript [0] is outside array bounds
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105523 --- Comment #34 from Georg-Johann Lay --- @Senthil: Can this PR be closed? Or will it be backported?
[Bug target/96055] avr: atmega324pb not supported
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96055 Georg-Johann Lay changed: What|Removed |Added Known to work||12.1.0 Resolution|--- |FIXED Severity|normal |enhancement Priority|P3 |P5 Status|UNCONFIRMED |RESOLVED --- Comment #3 from Georg-Johann Lay --- Closed as fixed in v12+. ATmega324PB is present in the sources (gcc/config/avr/avr-mcus.def, Author=Matwey V. Kornilov) since v12.1 at least. If you want to use it with older versions of the compiler (and newer than v5.1), please follow the explanation in the avr-gcc wiki at https://gcc.gnu.org/wiki/avr-gcc#avr-gcc_v5_and_newer
[Bug target/53935] [avr][c++] missing warning for non-const data in progmem
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53935 Georg-Johann Lay changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED Known to work||8.1.0 Keywords||addr-space --- Comment #2 from Georg-Johann Lay --- Closed as fixed in v7+.
[Bug other/109910] GCC prologue/epilogue saves/restores callee-saved registers that are never changed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109910 Georg-Johann Lay changed: What|Removed |Added Last reconfirmed||2023-08-04 Status|UNCONFIRMED |NEW Ever confirmed|0 |1
[Bug target/105523] Wrong warning array subscript [0] is outside array bounds
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105523 Georg-Johann Lay changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #35 from Georg-Johann Lay --- Fixed in v14.
[Bug tree-optimization/56456] [meta-bug] bogus/missing -Warray-bounds
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56456 Bug 56456 depends on bug 105523, which changed state. Bug 105523 Summary: Wrong warning array subscript [0] is outside array bounds https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105523 What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED
[Bug rtl-optimization/101188] [postreload] Uses content of a clobbered register
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101188 Georg-Johann Lay changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #18 from Georg-Johann Lay --- Fixed in v14.
[Bug rtl-optimization/110093] [12/13/14 Regression][avr] Move frenzy leading to code bloat
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110093 --- Comment #2 from Georg-Johann Lay --- Meanwhile (2023-08-22) the generated code from above got worse once again and even pops a frame: long add (long aa, long bb, long cc) { if (cc < 0) return aa - cc; return aa + bb; } > avr-gcc -Os -S -dp add: push r4 ; 83 [c=4 l=1] pushqi1/0 push r5 ; 84 [c=4 l=1] pushqi1/0 push r6 ; 85 [c=4 l=1] pushqi1/0 push r7 ; 86 [c=4 l=1] pushqi1/0 push r8 ; 87 [c=4 l=1] pushqi1/0 push r9 ; 88 [c=4 l=1] pushqi1/0 push r10 ; 89 [c=4 l=1] pushqi1/0 push r11 ; 90 [c=4 l=1] pushqi1/0 push r14 ; 91 [c=4 l=1] pushqi1/0 push r15 ; 92 [c=4 l=1] pushqi1/0 push r16 ; 93 [c=4 l=1] pushqi1/0 push r17 ; 94 [c=4 l=1] pushqi1/0 push r28 ; 95 [c=4 l=1] pushqi1/0 push r29 ; 96 [c=4 l=1] pushqi1/0 ; SP -= 4 ; 100 [c=4 l=2] *addhi3_sp rcall . rcall . in r28,__SP_L__ ; 126 [c=4 l=2] *movhi/7 in r29,__SP_H__ /* prologue: function */ /* frame size = 4 */ /* stack size = 18 */ .L__stack_usage = 18 mov r8,r22 ; 69 [c=4 l=1] movqi_insn/0 mov r9,r23 ; 70 [c=4 l=1] movqi_insn/0 mov r10,r24 ; 71 [c=4 l=1] movqi_insn/0 mov r11,r25 ; 72 [c=4 l=1] movqi_insn/0 std Y+1,r18 ; 73 [c=4 l=1] movqi_insn/2 std Y+2,r19 ; 74 [c=4 l=1] movqi_insn/2 std Y+3,r20 ; 75 [c=4 l=1] movqi_insn/2 std Y+4,r21 ; 76 [c=4 l=1] movqi_insn/2 mov r4,r14 ; 77 [c=4 l=1] movqi_insn/0 mov r5,r15 ; 78 [c=4 l=1] movqi_insn/0 mov r6,r16 ; 79 [c=4 l=1] movqi_insn/0 mov r7,r17 ; 80 [c=4 l=1] movqi_insn/0 sbrs r7,7; 123 [c=4 l=2] *sbrx_branchhi rjmp .L2 mov r25,r11 ; 67 [c=4 l=4] *movsi/0 mov r24,r10 mov r23,r9 mov r22,r8 sub r22,r4 ; 68 [c=16 l=4] *subsi3/0 sbc r23,r5 sbc r24,r6 sbc r25,r7 .L1: /* epilogue start */ ; SP += 4 ; 106 [c=4 l=4] *addhi3_sp pop __tmp_reg__ pop __tmp_reg__ pop __tmp_reg__ pop __tmp_reg__ pop r29 ; 107 [c=4 l=1] popqi pop r28 ; 108 [c=4 l=1] popqi pop r17 ; 109 [c=4 l=1] popqi pop r16 ; 110 [c=4 l=1] popqi pop r15 ; 111 [c=4 l=1] popqi pop r14 ; 112 [c=4 l=1] popqi pop r11 ; 113 [c=4 l=1] popqi pop r10 ; 114 [c=4 l=1] popqi 
pop r9 ; 115 [c=4 l=1] popqi pop r8 ; 116 [c=4 l=1] popqi pop r7 ; 117 [c=4 l=1] popqi pop r6 ; 118 [c=4 l=1] popqi pop r5 ; 119 [c=4 l=1] popqi pop r4 ; 120 [c=4 l=1] popqi ret ; 121 [c=0 l=1] return_from_epilogue .L2: ldd r22,Y+1 ; 65 [c=16 l=4] *movsi/2 ldd r23,Y+2 ldd r24,Y+3 ldd r25,Y+4 add r22,r8 ; 66 [c=16 l=4] *addsi3/0 adc r23,r9 adc r24,r10 adc r25,r11 rjmp .L1 ; 124 [c=4 l=1] jump .ident "GCC: (GNU) 14.0.0 20230822 (experimental)"
[Bug rtl-optimization/110093] [12/13/14 Regression][avr] Move frenzy leading to code bloat
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110093 Georg-Johann Lay changed: What|Removed |Added Ever confirmed|0 |1 Last reconfirmed||2023-08-22 Status|UNCONFIRMED |NEW
[Bug rtl-optimization/110093] [12/13/14 Regression][avr] Move frenzy leading to code bloat
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110093 --- Comment #4 from Georg-Johann Lay --- (In reply to Vladimir Makarov from comment #3) > I propose to avoid the above RTL code by switching off subreg3 > pass (or -fsplit-wide-types) for AVR by default as it was for gcc-8. Thanks for looking into this. With v8, I don't see a difference with -f[no-]split-wide-types, everything works fine. Since v10 r280033 the default is -fsplit-wide-types-early, but that option has no effect on testcase + master, only -fno-split-wide-types seems to "fix" the problem, regardless of -f[no-]split-wide-types-early. From my experience, -fno-split-wide-types has no clear edge over -fsplit-wide-types, which very much depends on the code. This is the reason why -fsplit-wide-types is still the default. So are you saying that the bug is actually in lower-subreg.cc ?
[Bug libstdc++/111639] HAVE_ACOSF etc. are wrong on avr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111639 --- Comment #2 from Georg-Johann Lay --- (In reply to Jonathan Wakely from comment #0) > The <math.h> in avr-libc does things like this: > > extern double acos(double __x) __ATTR_CONST__; > #define acosf acos /**< The alias for acos(). */ This is no longer the case with current AVR-LibC, which uses proper prototypes and symbols for acos, acosf and acosl etc. Here is math.h from the AVR-LibC v2.1 release (Jan 2022) : https://github.com/avrdudes/avr-libc/blob/c466ef11ebf6cf774b7148dbd78c250789989ce0/include/math.h (which has only acos and acosf, where the alias is implemented using assembly name __asm("")). The next release will also include long double prototypes, and they are proper prototypes (without __asm("") names). math.h from current HEAD: https://github.com/avrdudes/avr-libc/blob/main/include/math.h
[Bug libstdc++/111639] HAVE_ACOSF etc. are wrong on avr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111639 --- Comment #4 from Georg-Johann Lay --- (In reply to Jonathan Wakely from comment #3) > Which versions of avr-libc are supported with gcc? The versions are only very loosely coupled. Anything from AVR-LibC v1.8 on (or maybe even older) should be fine with avr-gcc v5+.
[Bug libstdc++/111639] HAVE_ACOSF etc. are wrong on avr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111639 --- Comment #6 from Georg-Johann Lay --- May I ask, are you working on getting libstdc++ to work for avr?
[Bug c++/43745] [avr] g++ puts VTABLES in SRAM
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43745 Georg-Johann Lay changed: What|Removed |Added Last reconfirmed|2012-01-07 00:00:00 |2024-8-2 Status|RESOLVED|NEW Resolution|WONTFIX |--- Version|4.7.0 |15.0
[Bug target/116295] [avr] unrecognizable insn when loading from address-space __flash
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116295 Georg-Johann Lay changed: What|Removed |Added Keywords||addr-space, ||ice-on-valid-code Assignee|unassigned at gcc dot gnu.org |gjl at gcc dot gnu.org Status|UNCONFIRMED |ASSIGNED Last reconfirmed||2024-08-08 Ever confirmed|0 |1 Target||avr Target Milestone|--- |15.0
[Bug target/116295] New: [avr] unrecognizable insn when loading from address-space __flash
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116295 Bug ID: 116295 Summary: [avr] unrecognizable insn when loading from address-space __flash Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: gjl at gcc dot gnu.org Target Milestone: --- Created attachment 58877 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58877&action=edit ice-flash.c: GNU-C99 test case long val; const __flash long* load4_flash (const __flash long *p) { val += *p++; val += *p++; return p; } triggers an ICE when compiled with $ avr-gcc ice-flash.c -S -Os It occurs in some situations when a value from __flash is loaded: * The device has no LPMx instruction. * More than 2 bytes are loaded. * Pass mfuse-add finds an optimization opportunity. The bug can be worked around with -mno-fuse-add. ice-flash.c: In function 'load4_flash': ice-flash.c:8:1: error: unrecognizable insn: 8 | } | ^ (insn 52 36 9 2 (parallel [ (set (reg:SI 22 r22) (mem:SI (post_inc:HI (reg:HI 30 r30)) [1 S4 A8 AS1])) (clobber (reg:CC 36 cc)) ]) "ice-flash.c":5:9 -1 (expr_list:REG_UNUSED (reg:CC 36 cc) (expr_list:REG_INC (reg:HI 30 r30) (nil during RTL pass: cprop_hardreg ice-flash.c:8:1: internal compiler error: in extract_insn, at recog.cc:2848
[Bug target/116295] [avr] unrecognizable insn when loading from address-space __flash
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116295 Georg-Johann Lay changed: What|Removed |Added Target Milestone|15.0|14.3 Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #3 from Georg-Johann Lay --- Fixed in v14.3+
[Bug target/113934] Switch avr to LRA
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113934 --- Comment #4 from Georg-Johann Lay --- Would someone please explain what has to be done? It's likely more than just #define TARGET_LRA_P hook_bool_void_true
[Bug target/113934] Switch avr to LRA
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113934 --- Comment #6 from Georg-Johann Lay --- ...to be more specific: TARGET_CANNOT_SUBSTITUTE_MEM_EQUIV_P explains the function of the hook from the perspective of someone who is implementing a register allocator, but there is no explanation whether it is a good idea (or even required) to implement it for some specific target. What form can "subst" take? When its purpose is to avoid spills, then why not always true? (Nobody wants spills when they can be avoided). TARGET_LEGITIMIZE_ADDRESS_DISPLACEMENT: How would I describe addressing capabilities for different named address-spaces? What kind of target code can I use to investigate the effect of the hook? Or can it be inferred simply from the device's register layout? TARGET_SPILL_CLASS: Can't we just return GENERAL_REGS as a spill class? TARGET_COMPUTE_PRESSURE_CLASSES: Requests that we should compute pressure classes. Now I know everything about it ...kidding. Again it's from the perspective of someone who is writing a register allocator, but of no use for someone who has to provide an implementation. TARGET_ADDITIONAL_ALLOCNO_CLASS_P: Similar issue. TARGET_REGISTER_PRIORITY: When some registers are preferred over others and hence we give them a higher priority, might that lead to more MOVs or spills? Finally: Who will fix fallout like ICEs (spill fails), performance issues, etc? Just reporting them here as PR will likely not help much, because AVR is a tertiary target and hence any PR has priority P4 or less. For example, Newlib dropped AVR support because nobody did fix all the spill fail ICEs when building Newlib for AVR. lra just performs 2 rounds, and when it doesn't find an allocation it just bails out with spill fail ICE.
[Bug target/113934] Switch avr to LRA
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113934 --- Comment #7 from Georg-Johann Lay --- ...more questions: What's the connexion between TARGET_REGISTER_PRIORITY and ADJUST_REG_ALLOC_ORDER / reg_alloc_order[]? What about reload_completed? Do its semantics stay the same? What about reg_renumber[]? And reload_in_progress becomes lra_in_progress or what?
[Bug target/113934] Switch avr to LRA
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113934 --- Comment #8 from Georg-Johann Lay --- ...more questions: TARGET_IRA_CHANGE_PSEUDO_ALLOCNO_CLASS: Same issue: This hook can change a reload class. The purpose is clear for regalloc guys, but when and why and how would I do it for a specific backend? The hook has two "reg_class_t" parameters as inputs, and no parameter does even have a name. "default hook always returns given class" ... Which one? There are two indistinguishable ones.
[Bug rtl-optimization/116321] New: [lra][avr] internal compiler error: in avr_out_lpm_no_lpmx, at config/avr/avr.cc:4572
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116321 Bug ID: 116321 Summary: [lra][avr] internal compiler error: in avr_out_lpm_no_lpmx, at config/avr/avr.cc:4572 Product: gcc Version: 15.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: gjl at gcc dot gnu.org Target Milestone: --- typedef __UINT64_TYPE__ uint64_t; uint64_t fun64 (const __flash uint64_t *p) { return *p; } runs into an ICE: $ avr-gcc lra-bug.c -S -Os -da -mlra during RTL pass: shorten dump file: lra-bug.c.354r.shorten lra-bug.c: In function 'fun64': lra-bug.c:6:1: internal compiler error: in avr_out_lpm_no_lpmx, at config/avr/avr.cc:4572 6 | } | ^ The respective line in avr.cc reads: gcc_assert (REG_Z == REGNO (addr)); because the only addressing modes for AS1 __flash are REG and POST_INC of REG_Z (reg:HI 30). However, the insn fed into the function as produced by LRA is as found in lra-bug.c.317r.reload: (insn 48 47 49 2 (set (reg:QI 25 r25 [+7 ]) (mem:QI (reg/f:HI 28 r28 [60]) [1 *p_2(D)+7 S1 A8 AS1])) "lra-bug.c":6:1 86 {movqi_insn_split} (nil)) This insn clearly violates avr.cc's REGNO_MODE_CODE_OK_FOR_BASE_P which only allows REG_Z (regno 30) as register for non-generic address-spaces like AS1. And avr.cc's MODE_CODE_BASE_REG_CLASS has: if (!ADDR_SPACE_GENERIC_P (as)) { return POINTER_Z_REGS; } but reg:HI 28 in insn 48 is not an element of POINTER_Z_REGS. Target: avr Configured with: ../../source/gcc-master/configure --target=avr --disable-nls --with-dwarf2 --with-gnu-as --with-gnu-ld --disable-shared --with-long-double=64 --enable-languages=c,c++
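The constraint that insn 48 violates can be modelled in a few lines. The following is a simplified, standalone sketch and not the code from avr.cc: the enum names and the function base_reg_ok are hypothetical, pseudo registers and strictness are ignored, and the only facts taken from the report are that non-generic address spaces like AS1 accept just REG_Z (regno 30) as base register. That X, Y and Z (r26, r28, r30) are the pointer registers for the generic address space is general AVR knowledge.

```c
#include <stdbool.h>

/* Hypothetical stand-ins, not the definitions from avr.cc. */
enum addr_space { AS_GENERIC = 0, AS_FLASH = 1 };
enum { REG_X = 26, REG_Y = 28, REG_Z = 30 };

/* Return true if hard register REGNO may serve as the base register
   of a memory access in address space AS.  */
static bool base_reg_ok (int regno, enum addr_space as)
{
  if (as != AS_GENERIC)
    return regno == REG_Z;   /* Flash reads go through Z only. */
  return regno == REG_X || regno == REG_Y || regno == REG_Z;
}
```

In this model, base_reg_ok (28, AS_FLASH) is false, which is the situation of insn 48: r28 (the Y register) used as base for an AS1 access.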
[Bug rtl-optimization/116321] [lra][avr] internal compiler error: in avr_out_lpm_no_lpmx, at config/avr/avr.cc:4572
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116321 Georg-Johann Lay changed: What|Removed |Added Keywords||ice-on-valid-code, ra Status|UNCONFIRMED |NEW Last reconfirmed||2024-08-10 Blocks||113934, 113932, 56183 Target||avr Ever confirmed|0 |1 Referenced Bugs: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56183 [Bug 56183] [meta-bug][avr] Problems with register allocation https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113932 [Bug 113932] [meta-bug] Targets which should be ported to LRA https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113934 [Bug 113934] Switch avr to LRA
[Bug target/113934] Switch avr to LRA
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113934 --- Comment #10 from Georg-Johann Lay --- (In reply to Segher Boessenkool from comment #9) > (In reply to Georg-Johann Lay from comment #4) > > Would someone please explain what has to be done? > > > > It's likely more than just > > > > #define TARGET_LRA_P hook_bool_void_true > > That is what you start with, though. Or more likely, you have a -mlra > flag to enable/disable it during development. You can do that *right now*, > and that enables other people to help you out with this, etc. :-) Done: https://gcc.gnu.org/r15-2865 > Possibly some things will not work. Ya, it's easier to break than I thought. LRA already breaks for one of the random programs I had lying around: PR116321
[Bug other/116322] New: regenerate-opt-urls.py usage
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116322 Bug ID: 116322 Summary: regenerate-opt-urls.py usage Product: gcc Version: 15.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: other Assignee: unassigned at gcc dot gnu.org Reporter: gjl at gcc dot gnu.org Target Milestone: --- $ ./regenerate-opt-urls.py -h usage: regenerate-opt-urls.py [-h] [--unit-test] base_html_dir src_gcc_dir [...] Usage (from build/gcc subdirectory): ../../src/gcc/regenerate-opt-urls.py HTML/gcc-14.0.0/ ../../src Running the script terminates with an error: $ ../../../source/gcc-master/gcc/regenerate-opt-urls.py HTML/gcc-15.0.0/ ../../../source/gcc-master/ [...] FileNotFoundError: [Errno 2] No such file or directory: 'HTML/gcc-15.0.0/gdc/Option-Index.html' The problem is obviously that GCC hasn't been configured for D, which is clear because the target does not support D. regenerate-opt-urls.py should document how to re-generate only specific option files, namely the one that is associated to a changed .opt file (somewhere in gcc/config/$target).
[Bug rtl-optimization/116324] New: [lra] error: inconsistent operand constraints in an 'asm'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116324

Bug ID: 116324
Summary: [lra] error: inconsistent operand constraints in an 'asm'
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: gjl at gcc dot gnu.org
Target Milestone: ---

Created attachment 58896
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58896&action=edit
lra-bug2.c: GNU-C99 test case

This error occurs when we try to build avr libgcc with -mlra:

$ avr-gcc lra-bug2.c -S -mlra -Os

or

$ avr-gcc lra-bug2.c -S -mlra

In function '__f7_clr',
    inlined from '__f7_madd_msub' at lra-bug2.c:77:3:
lra-bug2.c:42:3: error: inconsistent operand constraints in an 'asm'
   42 |   __asm ("%~call %x[f]"
      |   ^

The input constraint is like "z" (cc) where cc is a void* that perfectly fits into the 16-bit register Z (reg:HI 30), which has register constraint "z".

Target: avr
Configured with: ../../source/gcc-master/configure --target=avr --disable-nls --with-dwarf2 --with-gnu-as --with-gnu-ld --disable-shared --with-long-double=64 --enable-languages=c,c++
[Bug rtl-optimization/116324] [lra] error: inconsistent operand constraints in an 'asm'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116324

Georg-Johann Lay changed:

What|Removed|Added
Ever confirmed|0|1
Last reconfirmed||2024-08-10
Blocks||113934, 113932, 56183
Status|UNCONFIRMED|NEW
Target||avr
Keywords||ra, rejects-valid

Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56183 [Bug 56183] [meta-bug][avr] Problems with register allocation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113932 [Bug 113932] [meta-bug] Targets which should be ported to LRA
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113934 [Bug 113934] Switch avr to LRA
[Bug target/113934] Switch avr to LRA
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113934 --- Comment #11 from Georg-Johann Lay --- LRA even breaks building libgcc: PR116324
[Bug rtl-optimization/116325] New: [lra] error: unable to generate reloads for:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116325

Bug ID: 116325
Summary: [lra] error: unable to generate reloads for:
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: gjl at gcc dot gnu.org
Target Milestone: ---

Created attachment 58897
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58897&action=edit
pr60040-2.c: GNU-C99 test case from gcc.target/avr

$ avr-gcc pr60040-2.c -mlra -S -Os

/pr60040-2.c:112:1: error: unable to generate reloads for:
  112 | }
      | ^
(call_insn 44 43 45 3 (parallel [
    (set (reg:HI 24 r24)
    (call (mem:HI (reg/f:HI 79 [ ops_25(D)->blank ]) [0 *_26 S2 A8])
    (const_int 0 [0])))
    (use (const_int 0 [0]))
    ]) "gcc.target/avr/pr60040-2.c":66:10 774 {call_value_insn}
    (expr_list:REG_DEAD (reg/f:HI 79 [ ops_25(D)->blank ])
    (expr_list:REG_DEAD (reg:SI 20 r20)
    (expr_list:REG_DEAD (reg:SI 16 r16)
    (expr_list:REG_DEAD (reg:SI 12 r12)
    (expr_list:REG_DEAD (reg:SI 8 r8)
    (expr_list:REG_UNUSED (reg:HI 24 r24)
    (expr_list:REG_CALL_DECL (nil)
    (nil
    (expr_list:HI (use (reg:HI 24 r24))
    (expr_list:SI (use (reg:SI 20 r20))
    (expr_list:SI (use (reg:SI 16 r16))
    (expr_list:SI (use (reg:SI 12 r12))
    (expr_list:SI (use (reg:SI 8 r8))
    (nil)))
during RTL pass: reload
gcc.target/avr/pr60040-2.c:112:1: internal compiler error: in curr_insn_transform, at lra-constraints.cc:4283

The insn is an indirect call, which should use the Z register (reg:HI 30) for the target address.

Target: avr
Configured with: ../../source/gcc-master/configure --target=avr --disable-nls --with-dwarf2 --with-gnu-as --with-gnu-ld --disable-shared --with-long-double=64 --enable-languages=c,c++
[Bug rtl-optimization/116325] [lra] error: unable to generate reloads for:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116325

Georg-Johann Lay changed:

What|Removed|Added
Ever confirmed|0|1
Target||avr
Status|UNCONFIRMED|NEW
Keywords||ice-on-valid-code, ra
Blocks||56183, 113932, 113934
Last reconfirmed||2024-08-10

Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56183 [Bug 56183] [meta-bug][avr] Problems with register allocation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113932 [Bug 113932] [meta-bug] Targets which should be ported to LRA
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113934 [Bug 113934] Switch avr to LRA
[Bug rtl-optimization/116326] New: [lra] internal compiler error: in get_reload_reg, at lra-constraints.cc:755
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116326

Bug ID: 116326
Summary: [lra] internal compiler error: in get_reload_reg, at lra-constraints.cc:755
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: gjl at gcc dot gnu.org
Target Milestone: ---

Created attachment 58898
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58898&action=edit
GNU-C99 test case

$ avr-gcc addr-space-1-0-i.c -S -mlraduring RTL pass: reload
addr-space-1-0-i.c: In function 'main':
addr-space-1-0-i.c:85:1: internal compiler error: in get_reload_reg, at lra-constraints.cc:755
   85 | }
      | ^

Target: avr
Configured with: ../../source/gcc-master/configure --target=avr --disable-nls --with-dwarf2 --with-gnu-as --with-gnu-ld --disable-shared --with-long-double=64 --enable-languages=c,c++
[Bug rtl-optimization/116326] [lra] internal compiler error: in get_reload_reg, at lra-constraints.cc:755
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116326

Georg-Johann Lay changed:

What|Removed|Added
Last reconfirmed||2024-08-10
Keywords||ice-on-valid-code, ra
Blocks||56183, 113932, 113934
Target||avr
Status|UNCONFIRMED|NEW
Ever confirmed|0|1

--- Comment #1 from Georg-Johann Lay ---
The opening should read:

$ avr-gcc addr-space-1-0-i.c -S -mlra
during RTL pass: reload
addr-space-1-0-i.c: In function 'main':
...

Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56183 [Bug 56183] [meta-bug][avr] Problems with register allocation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113932 [Bug 113932] [meta-bug] Targets which should be ported to LRA
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113934 [Bug 113934] Switch avr to LRA
[Bug target/116236] [LRA] [M68K] ICE insn does not satisfy its constraints
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116236

Georg-Johann Lay changed:

What|Removed|Added
CC||gjl at gcc dot gnu.org

--- Comment #7 from Georg-Johann Lay ---
(In reply to Richard Biener from comment #2)
> Docs say
>
> Legitimate addresses are defined in two variants: a strict variant and a
> non-strict one. The @var{strict} parameter chooses which variant is
> desired by the caller.
>
> The strict variant is used in the reload pass. It must be defined so
> that any pseudo-register that has not been allocated a hard register is
> considered a memory reference.

I don't quite understand this sentence. Does that mean that legitimate_address_p has to accept MEM as (part of) a valid address, even when only a hard reg is allowed as address?

Moreover, legitimate_address_p seems outdated / incomplete, and TARGET_ADDR_SPACE_LEGITIMATE_ADDRESS_P seems to be the right hook to use.
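One common reading of the quoted documentation can be sketched as a toy model. This is not GCC code: `reg_ok_for_base`, the `reg_renumber` dictionary and the value chosen for `FIRST_PSEUDO_REGISTER` are illustrative assumptions, and only the register numbers 26, 28 and 30 (X, Y, Z on AVR) reflect the real target. Under this reading, the strict variant does not have to accept a MEM inside an address. It only has to stop treating an unallocated pseudo as a possible base register, because such a pseudo will end up in a stack slot, that is, in memory.

```python
# Toy model of strict vs. non-strict base-register checking (not GCC code).
FIRST_PSEUDO_REGISTER = 36   # assumed boundary: smaller numbers are hard regs
POINTER_REGS = {26, 28, 30}  # X, Y, Z: the AVR pointer registers


def reg_ok_for_base(regno, strict, reg_renumber=None):
    """Return True if register `regno` may serve as an address base.

    reg_renumber: hypothetical mapping pseudo -> allocated hard register;
    a pseudo that is absent from the mapping counts as unallocated.
    """
    reg_renumber = reg_renumber or {}
    if regno < FIRST_PSEUDO_REGISTER:
        # Hard register: must be one of the pointer registers.
        return regno in POINTER_REGS
    if not strict:
        # Non-strict: a pseudo may still be assigned a pointer register later.
        return True
    # Strict: only a pseudo that already lives in a pointer register is fine.
    # An unallocated pseudo is treated as memory, so it is rejected.
    return reg_renumber.get(regno, -1) in POINTER_REGS
```

In this model, pseudo 79 is accepted with `strict=False`, rejected with `strict=True` while it is unallocated, and accepted again with `strict=True` once `reg_renumber` maps it to 30 (Z).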
[Bug other/116322] regenerate-opt-urls.py usage
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116322

--- Comment #2 from Georg-Johann Lay ---
And it might be easier to use if we had a $builddir/gcc/regenerate-opt-urls.py built by configure:

1) $builddir/gcc/regenerate-opt-urls.py would know where $srcdir is.
2) $builddir/gcc/regenerate-opt-urls.py would know what HTML/$version to use and could issue an error to run "make html" when it does not exist.
3) A shebang like #!/usr/bin/env python3 may not work on some build machines even when they have Python3 installed. configure can find a required Python version or higher:

AM_PATH_PYTHON([], [AC_MSG_NOTICE([using $PYTHON to run regenerate-opt-urls.py])])
AC_CONFIG_FILES([regenerate-opt-urls.py], [chmod +x regenerate-opt-urls.py])

Though GCC is using some older version of autotools, and I don't know how well and reliably AM_PATH_PYTHON works there.
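Point 2) of this proposal could look roughly like the following sketch. It is a hypothetical helper, not part of GCC: the name `resolve_html_dir` and its parameters are assumptions, and in a configure-generated script the build directory and version would presumably be substituted by configure rather than passed in. The `HTML/gcc-<version>` layout is the one shown in the original report.

```python
from pathlib import Path


def resolve_html_dir(build_gcc_dir, version):
    """Return <build_gcc_dir>/HTML/gcc-<version>, or fail with a hint.

    build_gcc_dir: the gcc subdirectory of the build tree (assumed input).
    version: version string such as "15.0.0" (assumed input).
    """
    html_dir = Path(build_gcc_dir) / "HTML" / f"gcc-{version}"
    if not html_dir.is_dir():
        # Tell the user how to create the missing documentation tree.
        raise FileNotFoundError(
            f"{html_dir} does not exist; run 'make html' in {build_gcc_dir} first"
        )
    return html_dir
```

With such a helper the script would no longer need `base_html_dir` on the command line, and a missing documentation build would produce an actionable message instead of a bare traceback.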
[Bug rtl-optimization/116321] [lra][avr] internal compiler error: in avr_out_lpm_no_lpmx, at config/avr/avr.cc:4572
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116321

--- Comment #1 from Georg-Johann Lay ---
What I do not understand: when I also set -mlog=legitimate_address_p, I only get logs that have strict=0 and not a single one with strict=1, like:

avr_addr_space_legitimate_address_p[fun64:split5(357)]: ret=true, mode=QI strict=0 reload_completed=1 reload_in_progress=0 (reg_renumber):
(reg/f:HI 28 r28 [60])

This is for pass .split5, which runs way after reload, and strict=0 doesn't make much sense to me.