[Bug target/92729] [avr] Convert the backend to MODE_CC so it can be kept in future releases

2020-11-14 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92729

--- Comment #12 from Georg-Johann Lay  ---
Simulator: avrtest core simulator hosted on SourceForge as part of WinAVR.

Libc: avr-libc trunk hosted on nongnu.org. There are several patches not yet
integrated: recent xtiny devices, fixes in libm to adjust to the recent
double64 additions, and extensions for the build environment to handle the new
avr-gcc configure options for double multilib layout. Patches have been pending
for some time, so you'll have to resolve conflicts.

Binutils is vanilla from sourceware.org.

[Bug target/92729] [avr] Convert the backend to MODE_CC so it can be kept in future releases

2020-11-15 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92729

--- Comment #13 from Georg-Johann Lay  ---
FYI, avrtest is here:
https://sourceforge.net/p/winavr/code/HEAD/tree/trunk/avrtest/

[Bug target/92729] [avr] Convert the backend to MODE_CC so it can be kept in future releases

2020-11-15 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92729

--- Comment #15 from Georg-Johann Lay  ---
I built the tools by hand so I knew what I had...

Dunno about gcc/buildbot policies concerning avr. As avr is a 3ary target, the
BE's quality is of no consideration when releasing the compiler. Again, I
added/ran tests myself when working on the BE. However, test coverage is
low, and there are no performance tests: I know of no performance test suite
that would work reasonably for AVR, or one that has been designed for
AVR/avr-gcc.

And be warned that the avr BE has many kludges, work-arounds and hacks. Some
are historical, but most of them work around shortcomings and flaws in the
middle-ends (nobody will fix middle-end issues that hamper a 3ary target).

[Bug target/108287] AVR build: gcc/config/avr/t-avr tries to edit the source tree

2023-01-21 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108287

--- Comment #4 from Georg-Johann Lay  ---
Well, updating or creating some auto-generated files is intentional.

What's not supported according to the GCC documentation is configure'ing in the
source tree:

https://gcc.gnu.org/install/configure.html

> First, we **highly** recommend that GCC be built into a separate directory
> from the sources which does not reside within the source tree.
> This is how we generally build GCC; building where srcdir == objdir
> should still work, but doesn’t get extensive testing; building where
> objdir is a subdirectory of srcdir is unsupported.

The reason why it does not work for you might be:

1) Maybe you changed avr-mcus.def to support more devices. This change will
trigger more changes, for example to auto-generated documentation (texi) bits. 
This means you are basically a maintainer, which in turn means you might have
more jobs to do, or tools to use than a simple user who just builds GCC from
source.

2) When you get the sources from some repo like git, the checked-out sources
might have timestamps that don't reflect their true state. This triggers make
to re-build auto-generated files, even though no prerequisite was changed and
the targets need not be rebuilt.

To fix this, run
  ./contrib/gcc_update --touch
from the top-level source dir.  This script will touch some source files and
fix their timestamps.  You obviously need write permission for that.

[Bug target/108287] AVR build: gcc/config/avr/t-avr tries to edit the source tree

2023-01-21 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108287

--- Comment #5 from Georg-Johann Lay  ---
...ok, yes, building outside srcdir won't fix this one.  But points 1) and 2)
still apply.

[Bug target/106307] error when I do a test on a pointer on Arduino 1.8.19

2023-01-21 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106307

Georg-Johann Lay  changed:

   What|Removed |Added

   Last reconfirmed||2023-01-21
 Ever confirmed|0   |1
 Status|UNCONFIRMED |WAITING

[Bug target/99435] avr: incorrect I/O address ranges for some cores

2023-01-25 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99435

Georg-Johann Lay  changed:

   What|Removed |Added

 Status|UNCONFIRMED |WAITING
 Ever confirmed|0   |1
   Last reconfirmed||2023-01-25

--- Comment #2 from Georg-Johann Lay  ---
Still waiting for a reply.

[Bug target/100962] Poor optimization of AVR code when using structs in __flash

2023-01-25 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100962

Georg-Johann Lay  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |WORKSFORME

--- Comment #5 from Georg-Johann Lay  ---
The code is optimized fine with -Os.

With -Og, you can expect less optimized code.  For the provided code and -Og,
you can improve code quality by means of -mstrict-X (where I am not sure
whether it would be appropriate to have -mstrict-X as the default).

[Bug target/97276] A whole if-block is ignored by avr-gcc 9.3.0

2023-02-23 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97276

Georg-Johann Lay  changed:

   What|Removed |Added

 Target|atxmega32a4 |avr

--- Comment #2 from Georg-Johann Lay  ---
Can you provide the pre-compiled source pwm.i?  Just add -save-temps to the
compile options.

[Bug target/97276] A whole if-block is ignored by avr-gcc 9.3.0

2023-02-23 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97276

--- Comment #4 from Georg-Johann Lay  ---
Created attachment 54518
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54518&action=edit
pwm-i.c pre-compiled test case

Ok, I found it and attached a cleaned-up version.

IIUC, the relevant options you are using to compile are: -O1
-mmcu=atxmega32a4 -g -ggdb -std=gnu99

With these options (and with -fverbose-asm to make the asm easier to navigate)
I could not reproduce the problem with avr-gcc v8.5.  The respective part of .s
reads (I dropped -g for legibility, but with -g it's the same):

 ;  pwm-i.c:287: if (last_brightness < 181 && j >= 181)
ldi r30,lo8(-76) ; , ;  320 [c=4 l=1]  movqi_insn/1
cp r30,r11   ; , last_brightness ;  193 [c=4 l=1]  cmpqi3/1
brsh .+2 ;   ;  194 [c=16 l=2]  branch
rjmp .L16; 
 ;  pwm-i.c:287: if (last_brightness < 181 && j >= 181)
cpi r22,lo8(-75) ;  iftmp.3_5,   ;  196 [c=4 l=1]  cmpqi3/2
brsh .+2 ;   ;  197 [c=16 l=2]  branch
rjmp .L16; 
 ;  pwm-i.c:289: slot->top = 0xfe00;
st X+,r8 ;  tmp226   ;  200 [c=4 l=3]  *movhi/3
st X,r9  ;  tmp226
sbiw r26,1
 ;  pwm-i.c:290: slot->mask = ~mask;
movw r30,r24 ;  tmp200, mask ;  321 [c=4 l=1]  *movhi/0
com r30  ;  tmp200   ;  201 [c=8 l=2]  one_cmplhi2
com r31  ;  tmp200
 ;  pwm-i.c:290: slot->mask = ~mask;
adiw r26,2   ;  slot_172->mask   ;  202 [c=4 l=4]  *movhi/3
st X+,r30;  tmp200
st X,r31 ;  tmp200
sbiw r26,2+1 ;  slot_172->mask
 ;  pwm-i.c:291: ++slot;
adiw r26,4   ;  slot,;  203 [c=4 l=1]  addhi3_clobber/0

[Bug target/97276] A whole if-block is ignored by avr-gcc 9.3.0

2023-02-23 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97276

--- Comment #5 from Georg-Johann Lay  ---
... also tried v9.2 via

https://godbolt.org/z/9r3vMj1e3

and just like with v8.5, the respective block is around asm line 350.

[Bug rtl-optimization/90706] [10/11/12/13 Regression] Useless code generated for stack / register operations on AVR

2023-03-04 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90706

Georg-Johann Lay  changed:

   What|Removed |Added

  Known to work||8.5.0

--- Comment #19 from Georg-Johann Lay  ---
(In reply to CVS Commits from comment #18)
> https://gcc.gnu.org/g:2639f9d2313664e6b4ed2f8131fefa60aeeb0518
> 
> commit r13-6424-g2639f9d2313664e6b4ed2f8131fefa60aeeb0518
> Author: Vladimir N. Makarov 
> Date:   Thu Mar 2 16:29:05 2023 -0500
> 
> IRA: Use minimal cost for hard register movement

Thank you; the code looks clean now. (For my test case from comment #16 I
needed -fno-split-wide-types which is a different story).

Is there any chance your fix will be back-ported?

[Bug target/104988] Zero register (R1) clobbered by __udivmodsi4 for AVR

2022-04-03 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104988

Georg-Johann Lay  changed:

   What|Removed |Added

 Resolution|--- |INVALID
 Status|UNCONFIRMED |RESOLVED

--- Comment #2 from Georg-Johann Lay  ---
As you already found out this PR is invalid, thus closing.

[Bug target/99184] [avr] wrong double to 16-Bit and 32-Bit integers in libgcc/libf7

2022-09-18 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99184

--- Comment #1 from Georg-Johann Lay  ---
As a work-around, one can cast to an intermediate 64-bit integer:

// For [u]int64_t and uint32_t, do #include <stdint.h>
double x = 2.9;
int x_int = (int) (int64_t) x;
uint32_t x_u32 = (uint32_t) (uint64_t) x;
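
The work-around above can be wrapped in small helpers so that call sites stay
readable.  This is only a sketch: the names double_to_i32, double_to_u32 and
demo_casts are made up for illustration and are not part of libgcc or libf7.

```c
#include <assert.h>
#include <stdint.h>

/* Convert via a 64-bit intermediate, as in the work-around above.
   The result is only meaningful when the truncated value fits the
   destination type.  */
static inline int32_t double_to_i32(double x)
{
    return (int32_t) (int64_t) x;
}

static inline uint32_t double_to_u32(double x)
{
    return (uint32_t) (uint64_t) x;
}

/* Conversion truncates toward zero, like a direct cast would. */
static void demo_casts(void)
{
    assert(double_to_i32(2.9) == 2);
    assert(double_to_i32(-2.9) == -2);
}
```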

[Bug target/107201] New: [avr] -nodevicelib not working for devices -mmcu=avr...

2022-10-10 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107201

Bug ID: 107201
   Summary: [avr] -nodevicelib not working for devices
-mmcu=avr...
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gjl at gcc dot gnu.org
  Target Milestone: ---

The -nodevicelib option can be used so that the executable is not linked
against -l<mcu> when a device is specified as -mmcu=<mcu>.  This is
useful if such a library is not available.

This is achieved by the following spec in the device-specs file specs-<mcu>:

*avrlibc_devicelib:
%{!nodevicelib:-lavr64dd64}

However, in a spec function, the driver in
./gcc/config/avr/driver-avr.c[c]::avr_devicespecs_file() removes that option
because it thinks that -mmcu=avr* is a device *family* like avr25 or avrxmega2
etc.:

#if defined (WITH_AVRLIBC)
 " %{mmcu=avr*:" X_NODEVLIB "} %{!mmcu=*:" X_NODEVLIB "}",
#else

where X_NODEVLIB resolves to "%

[Bug target/107201] [avr] -nodevicelib not working for devices -mmcu=avr...

2022-10-11 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107201

--- Comment #1 from Georg-Johann Lay  ---
Created attachment 53691
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53691&action=edit
pr107201.diff: Proposed patch.

This proposed patch (effectively) sets macro X_NODEVLIB to "" in all of
./config/avr/driver-avr.cc.

-nodevicelib is a known driver option from avr.opt, so there should be no need
to explicitly remove it by hand by means of %

[Bug target/100962] Poor optimization of AVR code when using structs in __flash

2021-10-26 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100962

--- Comment #4 from Georg-Johann Lay  ---
Did you try option -mstrict-X?

And try to make a problem-report self-contained.

[Bug libstdc++/101867] avr libc build error for libstdc++

2021-10-26 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101867

--- Comment #16 from Georg-Johann Lay  ---
--with-avrlibc is the default, so setting it has no effect. Cf. the install info.

[Bug target/103975] DWARF .debug_frame incorrect for ISRs on AVR; pushing SREG creates off-by-one error

2022-10-26 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103975

--- Comment #5 from Georg-Johann Lay  ---
If someone is going to fix this, the following changes might also play a role:

* v8+ may emit optimized ISR prologues / epilogues using PR81268: gcc will just
emit pseudo-instruction __gcc_isr which will be resolved by gas.  Debug info
might be incorrect or missing; gas would have to add respective debug info.

* v12+ PR92729 changed condition code from implicit cc0 to explicit REG_CC and
introduced a new hard register "cc" with hard register number REG_CC = 36. The
highest hard regno before that transition was 35.

[Bug target/99435] avr: incorrect I/O address ranges for some cores

2022-10-26 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99435

--- Comment #1 from Georg-Johann Lay  ---
I am really confused.

To the best of my knowledge, IN and OUT can address a range of 64 bytes.  For
example, the opcode of OUT is

1011 1AAr rrrr AAAA

where "r" bits encode for the register number (2^5 = 32 of them) and "A" bits
encode absolute target addresses (2^6 = 64 of them). So there isn't even enough
space in the instruction encoding to provide an address range as claimed by
this PR.

Similar for, say, SBI with opcode encoding

1001 1010 AAAA Abbb

where "A" bits encode for absolute target address (2^5 = 32 of them) and "b"
encode target bit number (2^3 = 8 of them).

Are you sure you didn't just stumble upon a typo in the data sheet?

All AVRs are using these encodings.  The only difference is between Xmega and
non-Xmega which use different, implicit SFR_OFFSETs (which don't affect the
encoding or the number of addresses that can be encoded).
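
The bit budget discussed above can be checked with a small C sketch that
assembles the two opcodes from their fields.  The function names encode_out,
encode_sbi and demo_encodings are hypothetical; the bit layouts follow the AVR
instruction set encoding for OUT and SBI.  Returning 0 for out-of-range
operands is just a convention of this sketch, used to show that there are no
spare bits to hold them.

```c
#include <assert.h>
#include <stdint.h>

/* OUT A,Rr : 1011 1AAr rrrr AAAA  (A: 6 bits, r: 5 bits) */
static uint16_t encode_out(unsigned addr, unsigned reg)
{
    if (addr > 63 || reg > 31)
        return 0;                               /* does not fit */
    return (uint16_t) (0xB800
                       | ((addr & 0x30) << 5)   /* A5..A4 -> bits 10..9 */
                       | (reg << 4)             /* r4..r0 -> bits 8..4  */
                       | (addr & 0x0F));        /* A3..A0 -> bits 3..0  */
}

/* SBI A,b : 1001 1010 AAAA Abbb  (A: 5 bits, b: 3 bits) */
static uint16_t encode_sbi(unsigned addr, unsigned bit)
{
    if (addr > 31 || bit > 7)
        return 0;                               /* does not fit */
    return (uint16_t) (0x9A00 | (addr << 3) | bit);
}

static void demo_encodings(void)
{
    assert(encode_out(0x3F, 0) == 0xBE0F);  /* out 0x3f,r0 */
    assert(encode_sbi(0x18, 7) == 0x9AC7);  /* sbi 0x18,7  */
    assert(encode_out(64, 0) == 0);         /* 7th address bit has no slot */
}
```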

[Bug rtl-optimization/90706] [10/11/12/13 Regression] Useless code generated for stack / register operations on AVR

2022-11-01 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90706

--- Comment #13 from Georg-Johann Lay  ---
Created attachment 53812
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53812&action=edit
Test case with 32-bit integer.

This problem is still present in current master (future v13) and also occurs
with 32-bit integers.

> avr-gcc -S -Os mul.c -fdump-rtl-ira

With v8, mul.s has 15 instructions.

With newer versions, mul.s has 26 additional instructions: 
* 12 silly, useless stores into / loads from frame.
* 12 instructions to setup the frame.
* More instructions due to sub-optimal register alloc.
* Uses 6 bytes stack frame where v8 needs no frame at all.

In the IRA dump, there is:

Pass 0 for finding pseudo/allocno costs
a0 (r53,l0) best NO_REGS, allocno NO_REGS
a2 (r49,l0) best GENERAL_REGS, allocno GENERAL_REGS
a1 (r48,l0) best NO_REGS, allocno NO_REGS
...
Pass 1 for finding pseudo/allocno costs
r53: preferred NO_REGS, alternative NO_REGS, allocno NO_REGS
r49: preferred GENERAL_REGS, alternative NO_REGS, allocno GENERAL_REGS
r48: preferred NO_REGS, alternative NO_REGS, allocno NO_REGS
...
  Spill a0(r53,l0)
  Spill a1(r48,l0)
  Allocno a2r49 of GENERAL_REGS(30) ...

So there are 2 register spills for no reason that lead to that code bloat.

[Bug web/107610] Broken 'onlinedocs' after "Porting the Docs to Sphinx"

2022-11-10 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107610

--- Comment #4 from Georg-Johann Lay  ---
Also affected are other bits of the web page that are auto-generated, like

https://gcc.gnu.org/install/configure.html

And with the new URLs, "deep" links like

https://gcc.gnu.org/install/configuration.html#avr

ceased to work, too, even though the old ./gcc/doc/install.texi generated
(working) anchors like:

@html

@end html
@item --with-avrlibc

So the porting-to-sphinx dropped them, which is really sad.

[Bug target/107842] New: [avr] Set --param=min-pagesize=0 in the backend

2022-11-23 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107842

Bug ID: 107842
   Summary: [avr] Set --param=min-pagesize=0 in the backend
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gjl at gcc dot gnu.org
  Target Milestone: ---

The AVR backend should set --param=min-pagesize=0 in v12+, or otherwise we will
see warnings for each and every SFR access like:

typedef __UINT8_TYPE__ uint8_t;

#define SREG (*(volatile uint8_t*) (0x3F + __AVR_SFR_OFFSET__ ))

void bar (void)
{
SREG = 0;
}

> avr-gcc -c foo-i.c -mmcu=atmega8 -Os -Wall
foo-i.c: In function 'bar':
foo-i.c:7:6: warning: array subscript 0 is outside array bounds of 'volatile
uint8_t[0]' {aka 'volatile unsigned char[]'} [-Warray-bounds]
7 | SREG = 0;
  | ~^~~~

[Bug target/107842] [avr] Set --param=min-pagesize=0 in the backend

2022-11-23 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107842

Georg-Johann Lay  changed:

   What|Removed |Added

 Resolution|--- |DUPLICATE
 Status|UNCONFIRMED |RESOLVED

--- Comment #1 from Georg-Johann Lay  ---
Dupe, but I don't know whether only AVR is annoyed by this.

*** This bug has been marked as a duplicate of bug 105523 ***

[Bug target/105523] Wrong warning array subscript [0] is outside array bounds

2022-11-23 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105523

Georg-Johann Lay  changed:

   What|Removed |Added

 CC||gjl at gcc dot gnu.org

--- Comment #7 from Georg-Johann Lay  ---
*** Bug 107842 has been marked as a duplicate of this bug. ***

[Bug target/106307] error when I do a test on a pointer on Arduino 1.8.19

2022-11-23 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106307

--- Comment #1 from Georg-Johann Lay  ---
We'd need at least a test case so we can reproduce the issue. Thanks.

[Bug libstdc++/104875] libstdc++-v3/src/c++11/codecvt.cc:312:24: warning: left shift count >= width of type

2022-11-23 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104875

--- Comment #3 from Georg-Johann Lay  ---
Is this fixed now?

[Bug rtl-optimization/90706] [10/11/12/13 Regression] Useless code generated for stack / register operations on AVR

2022-12-16 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90706

Georg-Johann Lay  changed:

   What|Removed |Added

 CC||gjl at gcc dot gnu.org

--- Comment #16 from Georg-Johann Lay  ---
Created attachment 54113
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54113&action=edit
More elaborate C test case.

This is a more complicated test case, compile with

> avr-gcc -c pi-i.c -mmcu=atmega8 -Os -mcall-prologues -fno-tree-loop-optimize 
> -fno-move-loop-invariants && avr-size pi-i.o

Code sizes are:

664 with avr-gcc v8.5
992 with avr-gcc v11.3
834 with avr-gcc master with the change from comment #13

So there is a clear improvement with patch #13, but size is still +25% compared
to v8. What also has an effect is -fno-split-wide-types.

The test case mostly operates on float; unfortunately I don't have a similar
test-case for 32-bit integers at hand.

[Bug target/113824] New: AVR: ATA5795 in wrong multilib set

2024-02-08 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113824

Bug ID: 113824
   Summary: AVR: ATA5795 in wrong multilib set
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gjl at gcc dot gnu.org
  Target Milestone: ---

This device is currently filed in avr5, whereas according to
https://github.com/avrdudes/avr-libc/issues/874#issuecomment-1933051758 it
should be in avr4.

[Bug target/113824] AVR: ATA5795 in wrong multilib set

2024-02-08 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113824

Georg-Johann Lay  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|UNCONFIRMED |RESOLVED

--- Comment #4 from Georg-Johann Lay  ---
Fixed in v12.4 and v13.3+

[Bug target/113824] AVR: ATA5795 in wrong multilib set

2024-02-08 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113824

Georg-Johann Lay  changed:

   What|Removed |Added

   Target Milestone|--- |13.3

[Bug rtl-optimization/101188] [11/12/13 Regression] [postreload] Uses content of a clobbered register

2024-02-09 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101188

Georg-Johann Lay  changed:

   What|Removed |Added

 Resolution|FIXED   |---
 Status|RESOLVED|REOPENED
Summary|[postreload] Uses content   |[11/12/13 Regression]
   |of a clobbered register |[postreload] Uses content
   ||of a clobbered register

--- Comment #19 from Georg-Johann Lay  ---
Reopened for back-porting.

[Bug target/105523] Wrong warning array subscript [0] is outside array bounds

2024-02-12 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105523

Georg-Johann Lay  changed:

   What|Removed |Added

   Target Milestone|--- |13.3

--- Comment #37 from Georg-Johann Lay  ---
Back-ported to v13.3

[Bug other/113927] New: [avr-tiny] Sets up a stack-frame even for trivial code

2024-02-15 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113927

Bug ID: 113927
   Summary: [avr-tiny] Sets up a stack-frame even for trivial code
   Product: gcc
   Version: 13.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: other
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gjl at gcc dot gnu.org
  Target Milestone: ---

Code like

char func (char c)
{
return c;
}

compiles as expected to

func:
/* prologue: function */
/* frame size = 0 */
/* stack size = 0 */
.L__stack_usage = 0
/* epilogue start */
ret

with  avr-gcc -S -Os -mmcu=attiny26 -da , but for attiny40 (Reduced Tiny with
16 GPRs only) the result is:

func:
push r28
push r29
push __tmp_reg__
in r28,__SP_L__
in r29,__SP_H__
/* prologue: function */
/* frame size = 1 */
/* stack size = 3 */
.L__stack_usage = 3
/* epilogue start */
pop __tmp_reg__
pop r29
pop r28
ret

In .asmcons, i.e. just prior to register allocation, the code reads:

(insn 13 4 2 2 (set (reg:QI 46)
(reg:QI 24 r24 [ c ])) "main.c":2:1 86 {movqi_insn_split}
 (expr_list:REG_DEAD (reg:QI 24 r24 [ c ])
(nil)))
(insn 2 13 3 2 (set (reg/v:QI 44 [ c ])
(reg:QI 46)) "main.c":2:1 86 {movqi_insn_split}
 (expr_list:REG_DEAD (reg:QI 46)
(nil)))
(note 3 2 10 2 NOTE_INSN_FUNCTION_BEG)
(insn 10 3 11 2 (set (reg/i:QI 24 r24)
(reg/v:QI 44 [ c ])) "main.c":4:1 86 {movqi_insn_split}
 (expr_list:REG_DEAD (reg/v:QI 44 [ c ])
(nil)))
(insn 11 10 0 2 (use (reg/i:QI 24 r24)) "main.c":4:1 -1
 (nil))

so everything is fine and this PR is not a dup of PR110093.  According to
Vladimir Makarov, PR110093 is because DFA cannot handle subregs, but the RTL
code above does not have subregs.  What happens instead is that IRA sees very
high register costs, for example in .ira:

Pass 0 for finding pseudo/allocno costs

a1 (r46,l0) best NO_REGS, allocno NO_REGS
a0 (r44,l0) best NO_REGS, allocno NO_REGS

  a0(r44,l0) costs: POINTER_X_REGS:65535000 POINTER_Y_REGS:65535000
POINTER_Z_REGS:65535000 BASE_POINTER_REGS:65535000 POINTER_REGS:65535000
SIMPLE_LD_REGS:65535000 GENERAL_REGS:65535000 MEM:3000

whereas the .ira for attiny26 (ordinary core with 32 GPRs):

Pass 0 for finding pseudo/allocno costs

a0 (r46,l0) best GENERAL_REGS, allocno GENERAL_REGS

  a0(r46,l0) costs: POINTER_X_REGS:4000 POINTER_Y_REGS:4000 POINTER_Z_REGS:4000
BASE_POINTER_REGS:4000 POINTER_REGS:4000 ADDW_REGS:4000 SIMPLE_LD_REGS:4000
LD_REGS:4000 NO_LD_REGS:4000 GENERAL_REGS:4000 MEM:4000

../../source/gcc-master/configure --target=avr --disable-nls --with-dwarf2
--with-gnu-as --with-gnu-ld --disable-shared --enable-languages=c,c++

[Bug target/113927] [avr-tiny] Sets up a stack-frame even for trivial code

2024-02-15 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113927

Georg-Johann Lay  changed:

   What|Removed |Added

 Resolution|--- |FIXED
   Keywords|missed-optimization |
   Target Milestone|--- |13.3
  Component|other   |target
 Status|UNCONFIRMED |RESOLVED

--- Comment #3 from Georg-Johann Lay  ---
Fixed in v13.3+

[Bug target/113934] Switch avr to LRA

2024-02-16 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113934

--- Comment #1 from Georg-Johann Lay  ---
What's the LRA way to do LEGITIMIZE_RELOAD_ADDRESS?

[Bug other/113974] New: Attribute common ignored

2024-02-17 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113974

Bug ID: 113974
   Summary: Attribute common ignored
   Product: gcc
   Version: 13.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: other
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gjl at gcc dot gnu.org
  Target Milestone: ---

__attribute__((common,used))
static int cc;

When this code is compiled with -S -fdata-sections, cc is not put into
.lcomm (and is not .local .comm either):

.section.bss.cc,"aw",@nobits
.align 4
.type   cc, @object
.size   cc, 4
cc:
.zero   4
.ident  "GCC: (GNU) 13.2.1 20231022"

with -fno-data-sections, though, it works as expected:

.local  cc
.comm   cc,4,4

[Bug middle-end/113974] Attribute common ignored

2024-02-18 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113974

--- Comment #3 from Georg-Johann Lay  ---
Then the documentation should make clear that with -fno-data-sections the
object goes in COMM, but with -fdata-sections it does not and the attribute
"common" is ignored.

Better still, the compiler would behave as documented irrespective of
-f[no]-data-sections.

This is an issue of the compiler, not of the assembler.

Presumably clang just copied gcc behaviour back then?

[Bug target/97276] A whole if-block is ignored by avr-gcc 9.3.0

2024-02-20 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97276

Georg-Johann Lay  changed:

   What|Removed |Added

   Last reconfirmed||2024-02-20
 Ever confirmed|0   |1
 Status|UNCONFIRMED |WAITING

[Bug target/114100] New: [avr] Inefficient indirect addressing on Reduced Tiny

2024-02-25 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114100

Bug ID: 114100
   Summary: [avr] Inefficient indirect addressing on Reduced Tiny
   Product: gcc
   Version: 13.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gjl at gcc dot gnu.org
  Target Milestone: ---

The Reduced Tiny core does not support indirect addressing with offset, which
basically means that every indirect memory access with a size of more than one
byte is effectively POST_INC or PRE_DEC.  The lack of that addressing mode is
currently handled by pretending to support it, and then letting the insn
printers add and subtract the offsets again as needed on the fly.

For example, the following C code

   int vars[10];

   void inc_var2 (void) {
  ++vars[2];
   }

is compiled to:

   ldi r30,lo8(vars) ;  14   [c=4 l=2]  *movhi/4
   ldi r31,hi8(vars)
   subi r30,lo8(-(4));  15   [c=8 l=6]  *movhi/2
   sbci r31,hi8(-(4))
   ld r20,Z+
   ld r21,Z
   subi r30,lo8((4+1))
   sbci r31,hi8((4+1))
   subi r20,-1 ;  16   [c=4 l=2]  *addhi3_clobber/1
   sbci r21,-1
   subi r30,lo8(-(4+1));  17   [c=4 l=4]  *movhi/3
   sbci r31,hi8(-(4+1))
   st Z,r21
   st -Z,r20

where the code could be:

   ldi r30,lo8(vars+4);  28   [c=4 l=2]  *movhi/4
   ldi r31,hi8(vars+4)
   ld r20,Z+  ;  17   [c=8 l=2]  *movhi/2
   ld r21,Z+
   subi r20,-1;  19   [c=4 l=2]  *addhi3_clobber/1
   sbci r21,-1
   st -Z,r21  ;  30   [c=4 l=2]  *movhi/3
   st -Z,r20

[Bug target/114100] [avr] Inefficient indirect addressing on Reduced Tiny

2024-02-25 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114100

Georg-Johann Lay  changed:

   What|Removed |Added

   Target Milestone|--- |14.0
   Priority|P3  |P4
   Keywords||missed-optimization
 Target||avr

[Bug middle-end/114111] New: [avr] Expensive code instead of conditional branch.

2024-02-26 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114111

Bug ID: 114111
   Summary: [avr] Expensive code instead of conditional branch.
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gjl at gcc dot gnu.org
  Target Milestone: ---

Created attachment 57541
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57541&action=edit
addcc.c: C test case

Compile the code with avr-gcc -S -Os -dp:

int add_ge0 (int x, char c) {
return x + (c >= 0);
}

int add_eq0 (int x, char c) {
return x + (c == 0);
}

int add_le0 (int x, char c) {
return x + (c <= 0);
}

int add_ge1 (int x, char c) {
return x + (c >= 1);
}

int add_ltm3 (int x, char c) {
return x + (c < -3);
}

int add_bit6 (int x, char c) {
return x + !!(c & (1 << 6));
}

int add_nbit6 (int x, char c) {
return x + !(c & (1 << 6));
}

All these could be performed by a test and the addition of x in an if-block. 
But what the compiler does is to extend the 8-bit value c to 16 bit, then
complement it, then shift the MSB to the LSB:

add_ge0:
mov __tmp_reg__,r22  ;  23  [c=12 l=3]  *extendqihi2/0
lsl r0  
sbc r23,r23
com r22  ;  24  [c=8 l=2]  *one_cmplhi2
com r23
bst r23,7;  31  [c=16 l=4]  *lshrhi3_const/3
clr r22
clr r23
bld r22,0
add r24,r22  ;  26  [c=8 l=2]  *addhi3/0
adc r25,r23
ret  ;  29  [c=0 l=1]  return

Even when it does a conditional to set the addend, it should rather have the
addition in the if-block (and moving x to R18 adds even more bloat):

add_eq0:
mov r18,r24  ;  44  [c=4 l=1]  movqi_insn/0
mov r19,r25  ;  45  [c=4 l=1]  movqi_insn/0
ldi r24,lo8(1)   ;  46  [c=4 l=2]  *movhi/4
ldi r25,0   
cp r22, __zero_reg__ ;  47  [c=4 l=1]  cmpqi3/0
breq .L3 ;  48  [c=4 l=1]  branch
ldi r24,0;  43  [c=4 l=2]  *movhi/1
ldi r25,0   
.L3:
add r24,r18  ;  42  [c=8 l=2]  *addhi3/0
adc r25,r19
ret  ;  51  [c=0 l=1]  return

...
.ident  "GCC: (GNU) 14.0.1 20240212 (experimental)"

With avr-gcc 3.4.6 from around 2006, the generated code is as follows:

add_ge0:
sbrs r22,7   ;  38  *sbrx_branch[length = 2]
adiw r24,1   ;  15  *addhi3/2   [length = 1]
.L2:
ret  ;  37  return  [length = 1]

add_eq0:
tst r22  ;  13  tstqi   [length = 1]
brne .L4 ;  14  branch  [length = 1]
adiw r24,1   ;  15  *addhi3/2   [length = 1]
.L4:
ret  ;  35  return  [length = 1]

etc.  So at some point in time GCC lost all that smartness.

Appears to be around emit_store_flag and friends; as far as I can see it doesn't
even try to work out costs.
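
For comparison, the "test plus addition in an if-block" shape mentioned in this
report looks like this when written out in C.  This only illustrates the
intended semantics; it is not a claim about what any GCC version emits for this
source.  The *_branch names and demo_branch are made up, and signed char is
used so that the sketch behaves the same on hosts where plain char is unsigned.

```c
#include <assert.h>

/* Same results as add_ge0 / add_eq0 / add_bit6 from the test case, but
   with the increment guarded by a test instead of computing the flag
   as a 16-bit addend first.  */
int add_ge0_branch(int x, signed char c)
{
    if (c >= 0)
        x += 1;
    return x;
}

int add_eq0_branch(int x, signed char c)
{
    if (c == 0)
        x += 1;
    return x;
}

int add_bit6_branch(int x, signed char c)
{
    if (c & (1 << 6))
        x += 1;
    return x;
}

static void demo_branch(void)
{
    assert(add_ge0_branch(5, 0) == 5 + (0 >= 0));
    assert(add_ge0_branch(5, -1) == 5 + (-1 >= 0));
}
```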

[Bug target/114132] New: [avr] Code sets up a frame pointer without need

2024-02-27 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114132

Bug ID: 114132
   Summary: [avr] Code sets up a frame pointer without need
   Product: gcc
   Version: 13.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gjl at gcc dot gnu.org
  Target Milestone: ---

$ avr-gcc -S -Os -mmcu=attiny40 

of 

void funcab_c (long x, char c) {
}

sets up a frame-pointer without need.

Arguments x and c occupy all of the argument registers R25..R20, so that no arg
registers are left.  Then there is this implementation of
TARGET_FRAME_POINTER_REQUIRED in avr.cc:

static bool
avr_frame_pointer_required_p (void)
{
  return (cfun->calls_alloca
  || cfun->calls_setjmp
  || cfun->has_nonlocal_label
  || crtl->args.info.nregs == 0
  || get_frame_size () > 0);
}

Problem is that crtl->args.info.nregs == 0 does not discriminate between need
for arg pointer and no need for arg pointer (but all arg regs are used up, like
in the example).
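
The ambiguity can be illustrated with a reduced, stand-alone model of the
predicate.  Everything here is hypothetical: struct fn_info and its fields only
imitate the values that avr.cc reads from cfun and crtl, and needs_arg_pointer
is an invented flag standing for "some argument is actually passed on the
stack".  It is not the name of a real GCC field, and not necessarily how the
issue gets fixed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the per-function state used by the predicate. */
struct fn_info {
    bool calls_alloca;
    bool calls_setjmp;
    bool has_nonlocal_label;
    int  nregs_left;          /* models crtl->args.info.nregs */
    bool needs_arg_pointer;   /* invented: an argument lives on the stack */
    int  frame_size;          /* models get_frame_size () */
};

/* Shape of the condition quoted above. */
static bool fp_required_current(struct fn_info f)
{
    return f.calls_alloca || f.calls_setjmp || f.has_nonlocal_label
           || f.nregs_left == 0
           || f.frame_size > 0;
}

/* Sketch: ask the question that actually matters instead. */
static bool fp_required_sketch(struct fn_info f)
{
    return f.calls_alloca || f.calls_setjmp || f.has_nonlocal_label
           || f.needs_arg_pointer
           || f.frame_size > 0;
}

/* Models funcab_c: all argument registers used, nothing on the stack. */
static struct fn_info funcab_c_like(void)
{
    struct fn_info f = { false, false, false, 0, false, 0 };
    return f;
}

static void demo_fp(void)
{
    assert(fp_required_current(funcab_c_like()));
    assert(!fp_required_sketch(funcab_c_like()));
}
```

In this model the current condition requests a frame pointer for funcab_c
although nothing is passed on the stack, while the sketched condition does not.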

[Bug target/114132] [avr] Code sets up a frame pointer without need

2024-02-27 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114132

Georg-Johann Lay  changed:

   What|Removed |Added

   Target Milestone|--- |14.0
   Priority|P3  |P4
 Target||avr

[Bug target/114132] [avr] Code sets up a frame pointer without need

2024-02-29 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114132

Georg-Johann Lay  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|UNCONFIRMED |RESOLVED

--- Comment #2 from Georg-Johann Lay  ---
Fixed in v14.

[Bug target/114100] [avr] Inefficient indirect addressing on Reduced Tiny

2024-03-01 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114100

Georg-Johann Lay  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #3 from Georg-Johann Lay  ---
Improved in v14

[Bug other/114191] New: Flags "Warning" and "Target" don't mix well in target.opt files

2024-03-01 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114191

Bug ID: 114191
   Summary: Flags "Warning" and "Target" don't mix well in
target.opt files
   Product: gcc
   Version: 13.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: other
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gjl at gcc dot gnu.org
  Target Milestone: ---

In an .opt file, a backend can define target-specific diagnostic options, for
example gcc/config/avr/avr.opt has:

Wmisspelled-isr
Warning C C++ Var(avr_warn_misspelled_isr) Init(1)
Warn if the ISR is misspelled, ...

This is a "Target" option, however, so it should be listed with --help=target,
which it currently is not.  But specifying the "Target" flag in avr.opt
makes the option unrecognized:

$ avr-gcc main.c -c -Wall -Wmisspelled-isr
cc1: error: unrecognized command-line option '-Wmisspelled-isr'

I can reproduce this for target avr, but it likely affects all other targets as
well.

Set the component to "other". As it appears, there is no bugzilla component for
such internal problems.

[Bug rtl-optimization/114208] New: DSE deletes a store that is not dead

2024-03-02 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114208

Bug ID: 114208
   Summary: DSE deletes a store that is not dead
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gjl at gcc dot gnu.org
  Target Milestone: ---

Created attachment 57594
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57594&action=edit
Reduced C test case

$ avr-gcc -mmcu=attiny40 bug-dse.c -S -Os -dp -mfuse-add=3 -fdse

the following C test case:

struct S { char a, b; };

__attribute__((__noinline__,__noclone__))
void test (const struct S *s)
{
if (s->a != 3 || s->b != 4)
__builtin_abort();
}

int main (void)
{
struct S s = { 3, 4 };
test (&s);

  return 0;
}

Then with DSE off (-fno-dse), main has a store of 3 into s.a:

main:
...
ldi r20,lo8(3)   ;  22  [c=4 l=1]  movqi_insn/1
ld __tmp_reg__,Y+;  24  [c=4 l=1]  *addhi3/3
st Y+,r20;  48  [c=4 l=1]  movqi_insn/2
ldi r20,lo8(4)   ;  27  [c=4 l=1]  movqi_insn/1
st Y,r20 ;  30  [c=4 l=1]  movqi_insn/2
...

but with DSE on, pass .dse2 removes the first store (insn 48, and in the wake
also insn 22) that sets s.a to 3:

main:
...
ldi r20,lo8(4)   ;  27  [c=4 l=1]  movqi_insn/1
subi r28,-2  ;  29  [c=4 l=2]  *addhi3/3
sbci r29,-1
st Y,r20 ;  30  [c=4 l=1]  movqi_insn/2
...

Configured with: ../../source/gcc-master/configure --target=avr --disable-nls
--with-dwarf2 --with-gnu-as --with-gnu-ld --disable-shared
--enable-languages=c,c++
Thread model: single
Supported LTO compression algorithms: zlib
gcc version 14.0.1 20240302 (experimental) (GCC)

[Bug rtl-optimization/114208] RTL DSE deletes a store that is not dead

2024-03-02 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114208

--- Comment #2 from Georg-Johann Lay  ---
(In reply to Andrew Pinski from comment #1)
> I wonder if this is related to r14-6674-g4759383245ac97 .

Not unlikely. PR112525 tries to eliminate dead stores for arguments that are
passed.  It seems like that change misses some required conditions like
frame-pointer / arg-pointer adjustments.

[Bug rtl-optimization/114208] RTL DSE deletes a store that is not dead

2024-03-02 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114208

--- Comment #3 from Georg-Johann Lay  ---
(In reply to Andrew Pinski from comment #1)
> I wonder if this is related to r14-6674-g4759383245ac97 .

Seems unrelated: When I reverse-apply r14-6674 then the issue does not go away.

[Bug other/114191] Flags "Warning" and "Target" don't mix well in target.opt files

2024-03-04 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114191

--- Comment #2 from Georg-Johann Lay  ---
(In reply to Richard Biener from comment #1)
> Wmisspelled-isr
> Target C C++ Var(avr_warn_misspelled_isr) Init(1)
> Warn if the ISR is misspelled, ...
> 
> should eventually work?

With that, the warnings appear as they should, but the option is not
recognized:

$ avr-gcc signal.c -S -Wmisspelled-isr
error: unrecognized command-line option '-Wmisspelled-isr'

$ avr-gcc signal.c -S -Wno-misspelled-isr
error: unrecognized command-line option '-Wno-misspelled-isr'

$ avr-gcc signal.c -S -Werror=misspelled-isr
error: '-Werror=misspelled-isr': '-Wmisspelled-isr' is not an option that
controls warnings

[Bug other/114191] Flags "Warning" and "Target" don't mix well in target.opt files

2024-03-04 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114191

--- Comment #3 from Georg-Johann Lay  ---
(In reply to Richard Biener from comment #1)
> How did you specify 'Target'?

Like:

Wmisspelled-isr
Target Warning C C++ Var(avr_warn_misspelled_isr) Init(1)
Warn if the ISR is misspelled, ...

[Bug rtl-optimization/114208] RTL DSE deletes a store that is not dead

2024-03-04 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114208

--- Comment #5 from Georg-Johann Lay  ---
(In reply to Richard Biener from comment #4)
> Did it ever work?
No.  I allowed -mfuse-add=3 to reproduce this PR because there seems to be a
problem with DSE, and for the case that someone is going to fix it before it
bites an important target.  The mfuse-add optimization tries to avoid the
broken parts of DSE and works around it; documented are only -mfuse-add=0...2.
It was added in Feb 2024 as PR114100.

>  I suppose 'st Y+,r20 is' post-inc so maybe DSE mishandles this somehow.
That post-inc is only generated after .dse2: .split2 splits some move insns:
These cores don't have reg+offset addressing, so the backend must pretend to
support it.  Then .split2 generates pointer-adjust + mem-access +
undo-pointer-adjust.  The address adjustments are plain additions of the
address register (frame pointer in this case) and have according
REG_CFA_ADJUST_CFA notes.  Then .dse2 removes some non-dead stores.  The 'st
Y+,r20' you mentioned is only generated by .avr-fuse-add which runs after
.dse2.

I'd guess that GCC is not ready for targets with such tight addressing modes?
(without reg+offset addressing; stack-pointer cannot be used either, the only
SP accesses are PUSH and POP).
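
To illustrate the sequence described above, here is a minimal C model.  It is
a sketch under assumptions: ptr_reg, store_disp and store_pair_fused are
made-up names, and the code only mimics what the text says about .split2 and
.avr-fuse-add; it is not how the passes are implemented.

```c
#include <stdint.h>

/* Hypothetical model of an address register such as Y.  */
typedef struct { uint8_t *addr; } ptr_reg;

/* Store at base+offset on a core without reg+offset addressing:
   adjust the pointer, access memory, undo the adjustment.  */
static void
store_disp (ptr_reg *y, int offset, uint8_t val)
{
  y->addr += offset;   /* pointer-adjust */
  *y->addr = val;      /* mem-access */
  y->addr -= offset;   /* undo-pointer-adjust */
}

/* Two consecutive stores after fusing: the +1 adjustment is folded
   into a post-increment access.  */
static void
store_pair_fused (ptr_reg *y, uint8_t first, uint8_t second)
{
  *y->addr++ = first;  /* like "st Y+,r20" */
  *y->addr = second;   /* like "st Y,r20" */
  y->addr -= 1;        /* restore the base address */
}
```

Both variants must leave the same bytes in memory and the same value in the
address register; a pass that drops the first store breaks that equivalence.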

ad "needs-bisection": -mfuse-add is a new target optimization added as PR114100
in Feb 2024, so bi-secting won't work because -mfuse-add is not recognized
prior to that date.

[Bug rtl-optimization/114243] New: -fsplit-wide-types bloats code by more than 50%

2024-03-05 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114243

Bug ID: 114243
   Summary: -fsplit-wide-types bloats code by more than 50%
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gjl at gcc dot gnu.org
  Target Milestone: ---

Created attachment 57616
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57616&action=edit
pi-sigma.c: C99 test case

Compile the attached test case with:

$ avr-gcc pi-sigma.c -c -Os -mmcu=atmega8 -fstack-usage && avr-size pi-sigma.o

Then the code sizes are for respective versions of the compiler:

avr-gcc-v8:   624
avr-gcc-v14: 1008

which is an increase of code size of more than 60% !

The stack usage also increases by a lot. According to pi-sigma.su:

avr-gcc-v8:

pi-sigma.c:80:7:sigma   30  static
pi-sigma.c:86:7:pi_n    14  static

avr-gcc-v14:

pi-sigma.c:80:7:sigma   86  static
pi-sigma.c:86:7:pi_n    36  static

That is for the 1st function the stack use almost triples!
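
For reference, the arithmetic behind these figures can be checked with a tiny
helper.  percent_increase is a name made up for this sketch.

```c
/* Relative increase from old_val to new_val, in percent.  */
static double
percent_increase (double old_val, double new_val)
{
  return 100.0 * (new_val - old_val) / old_val;
}
```

With the sizes above, percent_increase (624, 1008) is about 61.5, and the
stack usage of sigma grows by a factor of 86 / 30, which is about 2.87.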

With -fno-split-wide-types the performance of v14 code is similar to v8.

Target: avr
Configured with: ../../source/gcc-master/configure --target=avr --disable-nls
--with-dwarf2 --with-gnu-as --with-gnu-ld --disable-shared
--enable-languages=c,c++ 
Thread model: single
Supported LTO compression algorithms: zlib
gcc version 14.0.1 20240303 (experimental) (GCC)

[Bug rtl-optimization/90706] [10/11/12/13 Regression] Useless code generated for stack / register operations on AVR

2024-03-05 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90706

--- Comment #24 from Georg-Johann Lay  ---
(In reply to Georg-Johann Lay from comment #23)
> As it appears, this bug is not fixed completely.  For the -mmcu=avrtiny
> architecture, there is still bloat for even the smallest test cases like:

Different story, f'up to PR113927.

[Bug rtl-optimization/114243] [avr] -fsplit-wide-types bloats code by more than 50%

2024-03-05 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114243

--- Comment #1 from Georg-Johann Lay  ---
May be related to PR110093.  As Vladimir noted in

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110093#c5

the problem is that data flow analysis cannot cope with the subregs generated
from lower-subregs, and register alloc chokes at it.

[Bug target/81473] [avr] build fails due to INT8_MIN and friends.

2024-03-05 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81473

--- Comment #4 from Georg-Johann Lay  ---
This was fixed long ago.

[Bug tree-optimization/114252] New: Introducing bswapsi reduces code performance

2024-03-06 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114252

Bug ID: 114252
   Summary: Introducing bswapsi reduces code performance
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gjl at gcc dot gnu.org
  Target Milestone: ---

Created attachment 57628
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57628&action=edit
GNU-C test case

typedef __UINT8_TYPE__ uint8_t;
typedef __UINT32_TYPE__ uint32_t;

typedef uint8_t __attribute__((vector_size(4))) v4u8_t;

uint32_t func1 (const uint8_t *buf) {
v4u8_t v4 = { buf[1], buf[0], buf[3], buf[2] };

return (uint32_t) v4;
}

Compile the code with

$ avr-gcc code.c -S -Os -dp

with v13 the result is:


func1:
mov r30,r24  ;  37  [c=4 l=1]  movqi_insn/0
mov r31,r25  ;  38  [c=4 l=1]  movqi_insn/0
ldd r22,Z+1  ;  39  [c=4 l=1]  movqi_insn/3
ld r23,Z ;  40  [c=4 l=1]  movqi_insn/3
ldd r24,Z+3  ;  41  [c=4 l=1]  movqi_insn/3
ldd r25,Z+2  ;  42  [c=4 l=1]  movqi_insn/3
/* epilogue start */
ret  ;  45  [c=0 l=1]  return

which is good code: insn 37, 38 move the address to pointer register Z, and
then follow 4 loads, one for each byte.

When compiled with v14 however:

func1:
mov r30,r24  ;  23  [c=4 l=2]  *movhi/0
mov r31,r25
ld r22,Z ;  24  [c=16 l=4]  *movsi/2
ldd r23,Z+1
ldd r24,Z+2
ldd r25,Z+3
rcall __bswapsi2 ;  25  [c=16 l=1]  *bswapsi2.libgcc
mov r31,r23  ;  32  [c=4 l=1]  movqi_insn/0
mov r23,r25  ;  33  [c=4 l=1]  movqi_insn/0
mov r25,r31  ;  34  [c=4 l=1]  movqi_insn/0
mov r31,r22  ;  35  [c=4 l=1]  movqi_insn/0
mov r22,r24  ;  36  [c=4 l=1]  movqi_insn/0
mov r24,r31  ;  37  [c=4 l=1]  movqi_insn/0
/* epilogue start */
ret  ;  40  [c=0 l=1]  return


Target: avr
Configured with: ../../source/gcc-master/configure --target=avr --disable-nls
--with-dwarf2 --with-gnu-as --with-gnu-ld --disable-shared
--enable-languages=c,c++
Thread model: single
Supported LTO compression algorithms: zlib
gcc version 14.0.1 20240303 (experimental) (GCC)

[Bug target/114252] Introducing bswapsi reduces code performance

2024-03-06 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114252

--- Comment #3 from Georg-Johann Lay  ---
(In reply to Richard Biener from comment #1)
> but somehow we end up doing a libcall?

It's not a libcall in the GCC sense, for the compiler it's just an ordinary
insn.  The backend then prints this as a transparent call to libgcc.

Purpose is that many functions have a small, known footprint as they are
implemented in assembly. An ordinary call would clobber all callee-used regs,
so using a transparent call gives better code than a real call.  Notice this is
the insn:

(define_insn "*bswapsi2.libgcc"
  [(set (reg:SI 22)
(bswap:SI (reg:SI 22)))
   (clobber (reg:CC REG_CC))]
  "reload_completed"
  "%~call __bswapsi2"
  [(set_attr "type" "xcall")])

However, for the purpose of this PR, no bswap is needed in the 1st place; just
have a look at the v13 code. It just loads the bytes as they belong into the
target value; while v14 loads all 32 bits in one chunk and then starts fiddling
and moving around the constituent bytes.

[Bug target/114252] Introducing bswapsi reduces code performance

2024-03-06 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114252

--- Comment #5 from Georg-Johann Lay  ---
(In reply to Richard Biener from comment #4)
> So bswap on a value is just register shuffling, right?

The point is that there is no need for bswap in the first place, just have a
look at the code that v13 generates.  It's 4 QI loads and that's it, no
shuffling required at all.

But v14 dropped that, and the bswapsi (presumably due to previous flawed tree
optimizations) is introduced by some tree pass.

There's nothing the backend can do about it.  So would you explain why you
think it's a "target" issue?

Maybe the PR title I used is confusing and does not hit the point?

[Bug target/114252] Introducing bswapsi reduces code performance

2024-03-07 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114252

--- Comment #8 from Georg-Johann Lay  ---
(In reply to Richard Biener from comment #7)
> Note I do understand what you are saying, just the middle-end in detecting
> and using __builtin_bswap32 does what it does everywhere else - it checks
> whether the target implements the operation.
> 
> The middle-end doesn't try to actually compare costs (it has no idea of the
> bswapsi costs),

But even when the bswapsi insn costs nothing, the v14 code has these additional
6 movqi insns 32...37 compared to v13 code.  In order to have the same
performance like v13 code, a bswapsi would have to cost negative 6 insns.  And
an optimizer that assumes negative costs is not reasonable, in particular
because the recognition of bswap opportunities serves optimization -- or is
supposed to serve it as far as I understand.

> and it most definitely doesn't see how AVR is special in
> having only QImode registers and thus the created SImode load (which the
> target supports!) will end up as four registers.

Even when the bswap insn would cost nothing the code is worse.

> The only thing that maybe would make sense with AVR exposing bswapsi is
> users calling __builtin_bswap but since it always expands as a libcall
> even that makes no sense.

It makes perfect sense when C/C++ code uses __builtin_bswap32:

* With current bswapsi insn, the code does a call that performs SI:22 =
bswap(SI:22) with NO additional register pressure.

* Without bswap insn, the code does a real ABI call that performs SI:22 =
bswap(SI:22) PLUS IT CLOBBERS r18, r19, r20, r21, r26, r27, r30 and r31; which
are the most powerful GPRs.

> So my preferred fix would be to remove bswapsi from avr.md?

Is there a way that the backend can fold a call to an insn that performs better
than a call? Like in TARGET_FOLD_BUILTIN?  As far as I know, the backend can
only fold target builtins, but not common builtins?  Tree fold cannot fold to
an insn obviously, but it could fold to inline asm, no?

Or can the target change an optabs entry so it expands to an insn that's more
profitable than a respective call? (like avr.md's bswap insn with transparent
call is more profitable than a real call).

The avr backend does this for many other stuff, too:

divmod, SI and PSI multiplications, parity, popcount, clz, ffs, 

> Does it benefit from recognizing bswap done with shifts on an int?

I don't fully understand that question. You mean to write code that shifts
bytes around like in
uint32_t res = 0;
res |= ((uint32_t) buf[0]) << 24;
res |= ((uint32_t) buf[1]) << 16;
res |= (uint32_t) buf[2] << 8;
res |= buf[3];
return res;
is better than a bswapsi call?

[Bug target/114252] Introducing bswapsi reduces code performance

2024-03-07 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114252

--- Comment #9 from Georg-Johann Lay  ---
...and I don't see why a register allocator would or should fix flaws from tree
optimizers.

[Bug target/114252] Introducing bswapsi reduces code performance

2024-03-07 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114252

--- Comment #12 from Georg-Johann Lay  ---
(In reply to Richard Biener from comment #10)
> I think the target controls the "libcall" ABI that's used for calls to
> libgcc,

You have a pointer how to do it or an example? IIRC I looked into it quite a
while ago, and it didn't allow to specify/adjust call_used_regs[] etc.

> I think the target should implement an inline bswap, possibly via a
> define_insn_and_split or define_split so the byte ops are only exposed
> at a desired point;  important points being lower_subreg (split-wide-types)
> and register allocation - possibly lower_subreg should itself know
> how to handle bswap (though the degenerate AVR case is quite special).

That would result in SUBREGs all over the place.  As Vladimir pointed out in 

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110093#c5

DFA doesn't handle subregs properly, and register alloc then uses extra
reloads, bloating the code (not only in PR110093 but also PR114243).  Unlikely any
pass will untangle the mess of four (set (subreg:QI (SI)) (subreg:QI (SI))).



> Yeah.  Or comparing to open-coding the bswap without going through the call.
> I don't have a AVR libgcc around, but libgcc2.s has
> 
> #ifdef L_bswapsi2
> SItype
> __bswapsi2 (SItype u)
> {
>   return ((((u) & 0xff000000u) >> 24)
>   | (((u) & 0x00ff0000u) >>  8)
>   | (((u) & 0x0000ff00u) <<  8)
>   | (((u) & 0x000000ffu) << 24));
> }
> }
> #endif 

The libgcc side is not a problem at all, libgcc/config/avr/lib1funcs.S has:

;; swap two registers with different register number
.macro bswap a, b
eor \a, \b
eor \b, \a
eor \a, \b
.endm

#if defined (L_bswapsi2)
;; swap bytes
;; r25:r22 = bswap32 (r25:r22)
DEFUN __bswapsi2
bswap r22, r25
bswap r23, r24
ret
ENDF __bswapsi2
#endif /* defined (L_bswapsi2) */

#if defined (L_bswapdi2)
;; swap bytes
;; r25:r18 = bswap64 (r25:r18)
DEFUN __bswapdi2
bswap r18, r25
bswap r19, r24
bswap r20, r23
bswap r21, r22
ret
ENDF __bswapdi2
#endif /* defined (L_bswapdi2) */


There's currently no handcrafted bswap16 though.
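
For readers not fluent in AVR assembly, the macro above is the classic XOR
swap.  A rough C model follows; xor_swap and bswap32_regs are names chosen
for this sketch, and the array r[0..3] merely stands in for r22..r25.

```c
#include <stdint.h>

/* Swap two distinct bytes with three XORs, like the three EORs in the
   bswap macro.  If a and b point to the same byte, it ends up as 0,
   hence the "different register number" requirement.  */
static void
xor_swap (uint8_t *a, uint8_t *b)
{
  *a ^= *b;
  *b ^= *a;
  *a ^= *b;
}

/* Model of __bswapsi2: r[0..3] stand for r22..r25.  */
static void
bswap32_regs (uint8_t r[4])
{
  xor_swap (&r[0], &r[3]);  /* bswap r22, r25 */
  xor_swap (&r[1], &r[2]);  /* bswap r23, r24 */
}
```

No scratch register is needed, which fits a routine that is supposed to
touch nothing but its operand registers.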

[Bug target/114252] Introducing bswapsi reduces code performance

2024-03-07 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114252

--- Comment #14 from Georg-Johann Lay  ---
The code in the example is not a perfect bswap, it needs additional shuffling
of bytes.  The tree passes must know that bswap is not a perfect fit.  There
must be *some* criterion that depends on the permutation, and when a bswap is
closer to the bswapped-permutation than a non-bswapped permutation is to the
original one.
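
To make the "not a perfect bswap" point concrete, here is a host-side sketch
in plain C.  It assumes the little-endian byte order that AVR uses, and the
helper names func1_bytes, bswap32 and func1_via_bswap are made up for the
illustration.

```c
#include <stdint.h>

/* What func1 asks for, as four independent byte loads.  Value bytes,
   least significant first: buf[1], buf[0], buf[3], buf[2].  */
static uint32_t
func1_bytes (const uint8_t *buf)
{
  return (uint32_t) buf[1]
         | (uint32_t) buf[0] << 8
         | (uint32_t) buf[3] << 16
         | (uint32_t) buf[2] << 24;
}

/* Plain 32-bit byte swap.  */
static uint32_t
bswap32 (uint32_t u)
{
  return ((u & 0xff000000u) >> 24)
         | ((u & 0x00ff0000u) >> 8)
         | ((u & 0x0000ff00u) << 8)
         | ((u & 0x000000ffu) << 24);
}

/* Same value via a 32-bit little-endian load, a bswap, and a swap of
   the two 16-bit halves.  */
static uint32_t
func1_via_bswap (const uint8_t *buf)
{
  uint32_t u = (uint32_t) buf[0]
               | (uint32_t) buf[1] << 8
               | (uint32_t) buf[2] << 16
               | (uint32_t) buf[3] << 24;
  u = bswap32 (u);
  return (u >> 16) | (u << 16);
}
```

Both functions compute the same value, but the second one needs the extra
half-swap on top of the bswap, which corresponds to the six additional movqi
insns in the v14 code.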

[Bug target/110220] [13/14 Regression] ICE in patch_jump_insn, at cfgrtl.cc:1295 - avr/xmega

2023-08-01 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110220

Georg-Johann Lay  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED
  Component|rtl-optimization|target

--- Comment #10 from Georg-Johann Lay  ---
Fixed in v13.3+.

[Bug target/105523] Wrong warning array subscript [0] is outside array bounds

2023-08-01 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105523

--- Comment #34 from Georg-Johann Lay  ---
@Senthil: Can this PR be closed? Or will it be backported?

[Bug target/96055] avr: atmega324pb not supported

2023-08-01 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96055

Georg-Johann Lay  changed:

   What|Removed |Added

  Known to work||12.1.0
 Resolution|--- |FIXED
   Severity|normal  |enhancement
   Priority|P3  |P5
 Status|UNCONFIRMED |RESOLVED

--- Comment #3 from Georg-Johann Lay  ---
Closed as fixed in v12+.

ATmega324PB is present in the sources (gcc/config/avr/avr-mcus.def,
Author=Matwey V. Kornilov) since v12.1 at least.

If you want to use it with older versions of the compiler (and newer than
v5.1), please follow the explanation in the avr-gcc wiki at
https://gcc.gnu.org/wiki/avr-gcc#avr-gcc_v5_and_newer

[Bug target/53935] [avr][c++] missing warning for non-const data in progmem

2023-08-01 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53935

Georg-Johann Lay  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED
  Known to work||8.1.0
   Keywords||addr-space

--- Comment #2 from Georg-Johann Lay  ---
Closed as fixed in v7+.

[Bug other/109910] GCC prologue/epilogue saves/restores callee-saved registers that are never changed

2023-08-04 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109910

Georg-Johann Lay  changed:

   What|Removed |Added

   Last reconfirmed||2023-08-04
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1

[Bug target/105523] Wrong warning array subscript [0] is outside array bounds

2023-08-09 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105523

Georg-Johann Lay  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #35 from Georg-Johann Lay  ---
Fixed in v14.

[Bug tree-optimization/56456] [meta-bug] bogus/missing -Warray-bounds

2023-08-09 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56456
Bug 56456 depends on bug 105523, which changed state.

Bug 105523 Summary: Wrong warning array subscript [0] is outside array bounds
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105523

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

[Bug rtl-optimization/101188] [postreload] Uses content of a clobbered register

2023-08-09 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101188

Georg-Johann Lay  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #18 from Georg-Johann Lay  ---
Fixed in v14.

[Bug rtl-optimization/110093] [12/13/14 Regression][avr] Move frenzy leading to code bloat

2023-08-22 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110093

--- Comment #2 from Georg-Johann Lay  ---
Meanwhile (2023-08-22) the generated code from above got worse once again and
even pops a frame:

long add (long aa, long bb, long cc)
{
if (cc < 0)
return aa - cc;
return aa + bb;
}

> avr-gcc -Os -S -dp

add:
push r4  ;  83  [c=4 l=1]  pushqi1/0
push r5  ;  84  [c=4 l=1]  pushqi1/0
push r6  ;  85  [c=4 l=1]  pushqi1/0
push r7  ;  86  [c=4 l=1]  pushqi1/0
push r8  ;  87  [c=4 l=1]  pushqi1/0
push r9  ;  88  [c=4 l=1]  pushqi1/0
push r10 ;  89  [c=4 l=1]  pushqi1/0
push r11 ;  90  [c=4 l=1]  pushqi1/0
push r14 ;  91  [c=4 l=1]  pushqi1/0
push r15 ;  92  [c=4 l=1]  pushqi1/0
push r16 ;  93  [c=4 l=1]  pushqi1/0
push r17 ;  94  [c=4 l=1]  pushqi1/0
push r28 ;  95  [c=4 l=1]  pushqi1/0
push r29 ;  96  [c=4 l=1]  pushqi1/0
 ; SP -= 4   ;  100 [c=4 l=2]  *addhi3_sp
rcall . 
rcall . 
in r28,__SP_L__  ;  126 [c=4 l=2]  *movhi/7
in r29,__SP_H__
/* prologue: function */
/* frame size = 4 */
/* stack size = 18 */
.L__stack_usage = 18
mov r8,r22   ;  69  [c=4 l=1]  movqi_insn/0
mov r9,r23   ;  70  [c=4 l=1]  movqi_insn/0
mov r10,r24  ;  71  [c=4 l=1]  movqi_insn/0
mov r11,r25  ;  72  [c=4 l=1]  movqi_insn/0
std Y+1,r18  ;  73  [c=4 l=1]  movqi_insn/2
std Y+2,r19  ;  74  [c=4 l=1]  movqi_insn/2
std Y+3,r20  ;  75  [c=4 l=1]  movqi_insn/2
std Y+4,r21  ;  76  [c=4 l=1]  movqi_insn/2
mov r4,r14   ;  77  [c=4 l=1]  movqi_insn/0
mov r5,r15   ;  78  [c=4 l=1]  movqi_insn/0
mov r6,r16   ;  79  [c=4 l=1]  movqi_insn/0
mov r7,r17   ;  80  [c=4 l=1]  movqi_insn/0
sbrs r7,7;  123 [c=4 l=2]  *sbrx_branchhi
rjmp .L2
mov r25,r11  ;  67  [c=4 l=4]  *movsi/0
mov r24,r10
mov r23,r9
mov r22,r8
sub r22,r4   ;  68  [c=16 l=4]  *subsi3/0
sbc r23,r5
sbc r24,r6
sbc r25,r7
.L1:
/* epilogue start */
 ; SP += 4   ;  106 [c=4 l=4]  *addhi3_sp
pop __tmp_reg__
pop __tmp_reg__
pop __tmp_reg__
pop __tmp_reg__
pop r29  ;  107 [c=4 l=1]  popqi
pop r28  ;  108 [c=4 l=1]  popqi
pop r17  ;  109 [c=4 l=1]  popqi
pop r16  ;  110 [c=4 l=1]  popqi
pop r15  ;  111 [c=4 l=1]  popqi
pop r14  ;  112 [c=4 l=1]  popqi
pop r11  ;  113 [c=4 l=1]  popqi
pop r10  ;  114 [c=4 l=1]  popqi
pop r9   ;  115 [c=4 l=1]  popqi
pop r8   ;  116 [c=4 l=1]  popqi
pop r7   ;  117 [c=4 l=1]  popqi
pop r6   ;  118 [c=4 l=1]  popqi
pop r5   ;  119 [c=4 l=1]  popqi
pop r4   ;  120 [c=4 l=1]  popqi
ret  ;  121 [c=0 l=1]  return_from_epilogue
.L2:
ldd r22,Y+1  ;  65  [c=16 l=4]  *movsi/2
ldd r23,Y+2
ldd r24,Y+3
ldd r25,Y+4
add r22,r8   ;  66  [c=16 l=4]  *addsi3/0
adc r23,r9
adc r24,r10
adc r25,r11
rjmp .L1 ;  124 [c=4 l=1]  jump

.ident  "GCC: (GNU) 14.0.0 20230822 (experimental)"

[Bug rtl-optimization/110093] [12/13/14 Regression][avr] Move frenzy leading to code bloat

2023-08-22 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110093

Georg-Johann Lay  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed||2023-08-22
 Status|UNCONFIRMED |NEW

[Bug rtl-optimization/110093] [12/13/14 Regression][avr] Move frenzy leading to code bloat

2023-08-30 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110093

--- Comment #4 from Georg-Johann Lay  ---
(In reply to Vladimir Makarov from comment #3)
> I propose to avoid the above RTL code by switching off subreg3
> pass (or -fsplit-wide-types) for AVR by default as it was for gcc-8.

Thanks for looking into this.

With v8, I don't see a difference with -f[no-]split-wide-types, everything
works fine.

Since v10 r280033 the default is -fsplit-wide-types-early, but that option has
no effect on testcase + master, only -fno-split-wide-types seems to "fix" the
problem, regardless of -f[no-]split-wide-types-early.

From my experience, -fno-split-wide-types has no clear edge over
-fsplit-wide-types, which very much depends on the code.  This is the reason
why -fsplit-wide-types is still the default.

So are you saying that the bug is actually in lower-subreg.cc ?

[Bug libstdc++/111639] HAVE_ACOSF etc. are wrong on avr

2023-09-30 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111639

--- Comment #2 from Georg-Johann Lay  ---
(In reply to Jonathan Wakely from comment #0)
> The <math.h> in avr-libc does things like this:
> 
> extern double acos(double __x) __ATTR_CONST__;
> #define acosf acos /**< The alias for acos().  */

This is no longer the case with current AVR-LibC, which uses proper prototypes
and symbols for acos, acosf and acosl etc.

Here is math.h from the AVR-LibC v2.1 release (Jan 2022) :
https://github.com/avrdudes/avr-libc/blob/c466ef11ebf6cf774b7148dbd78c250789989ce0/include/math.h
(which has only acos and acosf, where the alias is implemented using assembly
name __asm("")).

The next release will also include long double prototypes, and they are proper
prototypes (without __asm("") names).

math.h from current HEAD:
https://github.com/avrdudes/avr-libc/blob/main/include/math.h

[Bug libstdc++/111639] HAVE_ACOSF etc. are wrong on avr

2023-10-01 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111639

--- Comment #4 from Georg-Johann Lay  ---
(In reply to Jonathan Wakely from comment #3)
> Which versions of avr-libc are supported with gcc?

The versions are only very loosely coupled.  Anything from AVR-LibC v1.8 on (or
maybe even older) should be fine with avr-gcc v5+.

[Bug libstdc++/111639] HAVE_ACOSF etc. are wrong on avr

2023-10-01 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111639

--- Comment #6 from Georg-Johann Lay  ---
May I ask, are you working on getting libstdc++ to work for avr?

[Bug c++/43745] [avr] g++ puts VTABLES in SRAM

2024-08-02 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43745

Georg-Johann Lay  changed:

   What|Removed |Added

   Last reconfirmed|2012-01-07 00:00:00 |2024-8-2
 Status|RESOLVED|NEW
 Resolution|WONTFIX |---
Version|4.7.0   |15.0

[Bug target/116295] [avr] unrecognizable insn when loading from address-space __flash

2024-08-08 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116295

Georg-Johann Lay  changed:

   What|Removed |Added

   Keywords||addr-space,
   ||ice-on-valid-code
   Assignee|unassigned at gcc dot gnu.org  |gjl at gcc dot gnu.org
 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2024-08-08
 Ever confirmed|0   |1
 Target||avr
   Target Milestone|--- |15.0

[Bug target/116295] New: [avr] unrecognizable insn when loading from address-space __flash

2024-08-08 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116295

Bug ID: 116295
   Summary: [avr] unrecognizable insn when loading from
address-space __flash
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gjl at gcc dot gnu.org
  Target Milestone: ---

Created attachment 58877
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58877&action=edit
ice-flash.c: GNU-C99 test case

long val;

const __flash long* load4_flash (const __flash long *p)
{
val += *p++;
val += *p++;
return p;
}

triggers an ICE when compiled with

$ avr-gcc ice-flash.c -S -Os

It occurs in some situations when a value from __flash is loaded:

* The device has no LPMx instruction.
* More than 2 bytes are loaded.
* Pass mfuse-add finds an optimization opportunity.

The bug can be worked around with -mno-fuse-add.

ice-flash.c: In function 'load4_flash':
ice-flash.c:8:1: error: unrecognizable insn:
8 | }
  | ^
(insn 52 36 9 2 (parallel [
(set (reg:SI 22 r22)
(mem:SI (post_inc:HI (reg:HI 30 r30)) [1  S4 A8 AS1]))
(clobber (reg:CC 36 cc))
]) "ice-flash.c":5:9 -1
 (expr_list:REG_UNUSED (reg:CC 36 cc)
(expr_list:REG_INC (reg:HI 30 r30)
(nil))))
during RTL pass: cprop_hardreg
ice-flash.c:8:1: internal compiler error: in extract_insn, at recog.cc:2848

[Bug target/116295] [avr] unrecognizable insn when loading from address-space __flash

2024-08-08 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116295

Georg-Johann Lay  changed:

   What|Removed |Added

   Target Milestone|15.0|14.3
 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #3 from Georg-Johann Lay  ---
Fixed in v14.3+

[Bug target/113934] Switch avr to LRA

2024-08-09 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113934

--- Comment #4 from Georg-Johann Lay  ---
Would someone please explain what has to be done?

It's likely more than just

#define TARGET_LRA_P hook_bool_void_true

[Bug target/113934] Switch avr to LRA

2024-08-09 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113934

--- Comment #6 from Georg-Johann Lay  ---
...to be more specific:

TARGET_CANNOT_SUBSTITUTE_MEM_EQUIV_P explains the function of the hook from the
perspective of someone who is implementing a register allocator, but there is
no explanation whether it is a good idea (or even required) to implement it for
some specific target.  What form can "subst" take?  When its purpose is to
avoid spills, then why not always return true? (Nobody wants spills when they
can be avoided).

TARGET_LEGITIMIZE_ADDRESS_DISPLACEMENT: How would I describe addressing
capabilities for different named address-spaces?  What kind of target code can
I use to investigate the effect of the hook? Or can it be inferred simply from the
device's register layout?

TARGET_SPILL_CLASS:  Can't we just return GENERAL_REGS as a spill class?

TARGET_COMPUTE_PRESSURE_CLASSES:  Requests that we should compute pressure
classes.  Now I know everything about it ...kidding.  Again it's from the
perspective of someone who is writing a register allocator, but of no use for
someone who has to provide an implementation.

TARGET_ADDITIONAL_ALLOCNO_CLASS_P: Similar issue.

TARGET_REGISTER_PRIORITY: When some registers are preferred over others and
hence we give them a higher priority, might that lead to more MOVs or spills?

Finally: Who will fix fallout like ICEs (spill fails), performance issues, etc?
Just reporting them here as PRs will likely not help much, because AVR is a
tertiary target and hence any PR has priority P4 or less.  For example, Newlib
dropped AVR support because nobody fixed all the spill fail ICEs when building
Newlib for AVR.  LRA just performs 2 rounds, and when it doesn't find an
allocation it just bails out with a spill fail ICE.

[Bug target/113934] Switch avr to LRA

2024-08-09 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113934

--- Comment #7 from Georg-Johann Lay  ---
...more questions:

What's the connexion between TARGET_REGISTER_PRIORITY and
ADJUST_REG_ALLOC_ORDER  / reg_alloc_order[]?

What about reload_completed?  Do the semantics stay the same? What about
reg_renumber[]?  And reload_in_progress becomes lra_in_progress or what?

[Bug target/113934] Switch avr to LRA

2024-08-09 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113934

--- Comment #8 from Georg-Johann Lay  ---
...more questions:

TARGET_IRA_CHANGE_PSEUDO_ALLOCNO_CLASS: Same issue: This hook can change a
reload class.  The purpose is clear for regalloc guys, but when and why and
how would I do it for a specific backend?  The hook has two "reg_class_t"
parameters as inputs, and neither parameter even has a name. "default hook
always returns given class" ... Which one? There are two indistinguishable
ones.

[Bug rtl-optimization/116321] New: [lra][avr] internal compiler error: in avr_out_lpm_no_lpmx, at config/avr/avr.cc:4572

2024-08-10 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116321

Bug ID: 116321
   Summary: [lra][avr] internal compiler error: in
avr_out_lpm_no_lpmx, at config/avr/avr.cc:4572
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gjl at gcc dot gnu.org
  Target Milestone: ---

typedef __UINT64_TYPE__ uint64_t;

uint64_t fun64 (const __flash uint64_t *p)
{
  return *p;
}

runs into an ICE:

$ avr-gcc lra-bug.c -S -Os -da -mlra
during RTL pass: shorten
dump file: lra-bug.c.354r.shorten
lra-bug.c: In function 'fun64':
lra-bug.c:6:1: internal compiler error: in avr_out_lpm_no_lpmx, at
config/avr/avr.cc:4572
6 | }
  | ^

The respective line in avr.cc reads:

  gcc_assert (REG_Z == REGNO (addr));

because the only addressing modes for AS1 __flash are REG and POST_INC of REG_Z
(reg:HI 30).  However, the insn fed into the function, as produced by LRA,
looks like this in lra-bug.c.317r.reload:

(insn 48 47 49 2 (set (reg:QI 25 r25 [+7 ])
(mem:QI (reg/f:HI 28 r28 [60]) [1 *p_2(D)+7 S1 A8 AS1]))
"lra-bug.c":6:1 86 {movqi_insn_split}
 (nil))

This insn clearly violates avr.cc's REGNO_MODE_CODE_OK_FOR_BASE_P which only
allows REG_Z (regno 30) as register for non-generic address-spaces like AS1.
And avr.cc's MODE_CODE_BASE_REG_CLASS has:

  if (!ADDR_SPACE_GENERIC_P (as))
{
  return POINTER_Z_REGS;
}

but reg:HI 28 in insn 48 is not an element of POINTER_Z_REGS.
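
The rule being violated can be modelled in a few lines of plain C. This is only a simplified sketch, not the code from avr.cc: base_reg_ok and the addr_space enum are hypothetical names. The register numbers follow the dump (Y is regno 28, Z is regno 30), and the sketch assumes that the generic address space accepts X, Y and Z as base registers.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for this sketch; not the definitions used by the backend. */
enum { REG_X = 26, REG_Y = 28, REG_Z = 30 };
enum addr_space { AS_GENERIC = 0, AS_FLASH = 1 };

/* Simplified model: generic addresses may use X, Y or Z as base register,
   non-generic ones such as AS1 (__flash) only Z. */
static bool base_reg_ok (int regno, enum addr_space as)
{
  assert (regno >= 0);
  if (as != AS_GENERIC)
    return regno == REG_Z;
  return regno == REG_X || regno == REG_Y || regno == REG_Z;
}
```

Under this model, base_reg_ok (28, AS_FLASH) is false, which is the situation in insn 48, while base_reg_ok (30, AS_FLASH) is true.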


Target: avr
Configured with: ../../source/gcc-master/configure --target=avr --disable-nls
--with-dwarf2 --with-gnu-as --with-gnu-ld --disable-shared
--with-long-double=64 --enable-languages=c,c++

[Bug rtl-optimization/116321] [lra][avr] internal compiler error: in avr_out_lpm_no_lpmx, at config/avr/avr.cc:4572

2024-08-10 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116321

Georg-Johann Lay  changed:

   What|Removed |Added

   Keywords||ice-on-valid-code, ra
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2024-08-10
 Blocks||113934, 113932, 56183
 Target||avr
 Ever confirmed|0   |1


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56183
[Bug 56183] [meta-bug][avr] Problems with register allocation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113932
[Bug 113932] [meta-bug] Targets which should be ported to LRA
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113934
[Bug 113934] Switch avr to LRA

[Bug target/113934] Switch avr to LRA

2024-08-10 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113934

--- Comment #10 from Georg-Johann Lay  ---
(In reply to Segher Boessenkool from comment #9)
> (In reply to Georg-Johann Lay from comment #4)
> > Would someone please explain what has to be done?
> > 
> > It's likely more than just
> > 
> > #define TARGET_LRA_P hook_bool_void_true
> 
> That is what you start with, though.  Or more likely, you have a -mlra
> flag to enable/disable it during development.  You can do that *right now*,
> and that enables other people to help you out with this, etc. :-)

Done: https://gcc.gnu.org/r15-2865

> Possibly some things will not work.

Ya, it's easier to break than I thought.  LRA already breaks for one of the
random programs I had lying around: PR116321

[Bug other/116322] New: regenerate-opt-urls.py usage

2024-08-10 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116322

Bug ID: 116322
   Summary: regenerate-opt-urls.py usage
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: other
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gjl at gcc dot gnu.org
  Target Milestone: ---

$ ./regenerate-opt-urls.py  -h
usage: regenerate-opt-urls.py [-h] [--unit-test] base_html_dir src_gcc_dir

[...]

Usage (from build/gcc subdirectory):
  ../../src/gcc/regenerate-opt-urls.py HTML/gcc-14.0.0/ ../../src

Running the script terminates with an error:

$ ../../../source/gcc-master/gcc/regenerate-opt-urls.py HTML/gcc-15.0.0/
../../../source/gcc-master/
[...]
FileNotFoundError: [Errno 2] No such file or directory:
'HTML/gcc-15.0.0/gdc/Option-Index.html'

The problem is obviously that GCC hasn't been configured for D, which is clear
because the target does not support D.

The regenerate-opt-urls.py should document how to re-generate only specific
option files, namely the one that is associated with a changed .opt file
(somewhere in gcc/config/$target).

[Bug rtl-optimization/116324] New: [lra] error: inconsistent operand constraints in an 'asm'

2024-08-10 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116324

Bug ID: 116324
   Summary: [lra] error: inconsistent operand constraints in an
'asm'
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gjl at gcc dot gnu.org
  Target Milestone: ---

Created attachment 58896
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58896&action=edit
lra-bug2.c: GNU-C99 test case

This error occurs when we try to build avr libgcc with -mlra:

$ avr-gcc lra-bug2.c -S -mlra -Os 

or

$ avr-gcc lra-bug2.c -S -mlra

In function '__f7_clr',
inlined from '__f7_madd_msub' at lra-bug2.c:77:3:
lra-bug2.c:42:3: error: inconsistent operand constraints in an 'asm'
   42 |   __asm ("%~call %x[f]"
  |   ^

The input constraint is like "z" (cc) where cc is a void* that perfectly fits
into the 16-bit register Z (reg:HI 30) which has register constraint "z".

Target: avr
Configured with: ../../source/gcc-master/configure --target=avr --disable-nls
--with-dwarf2 --with-gnu-as --with-gnu-ld --disable-shared
--with-long-double=64 --enable-languages=c,c++

[Bug rtl-optimization/116324] [lra] error: inconsistent operand constraints in an 'asm'

2024-08-10 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116324

Georg-Johann Lay  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed||2024-08-10
 Blocks||113934, 113932, 56183
 Status|UNCONFIRMED |NEW
 Target||avr
   Keywords||ra, rejects-valid


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56183
[Bug 56183] [meta-bug][avr] Problems with register allocation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113932
[Bug 113932] [meta-bug] Targets which should be ported to LRA
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113934
[Bug 113934] Switch avr to LRA

[Bug target/113934] Switch avr to LRA

2024-08-10 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113934

--- Comment #11 from Georg-Johann Lay  ---
LRA even breaks building libgcc: PR116324

[Bug rtl-optimization/116325] New: [lra] error: unable to generate reloads for:

2024-08-10 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116325

Bug ID: 116325
   Summary: [lra] error: unable to generate reloads for:
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gjl at gcc dot gnu.org
  Target Milestone: ---

Created attachment 58897
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58897&action=edit
pr60040-2.c: GNU-C99 test case from gcc.target/avr

$ avr-gcc pr60040-2.c -mlra -S -Os

/pr60040-2.c:112:1: error: unable to generate reloads for:
  112 | }
  | ^
(call_insn 44 43 45 3 (parallel [
(set (reg:HI 24 r24)
(call (mem:HI (reg/f:HI 79 [ ops_25(D)->blank ]) [0 *_26 S2
A8])
(const_int 0 [0])))
(use (const_int 0 [0]))
]) "gcc.target/avr/pr60040-2.c":66:10 774 {call_value_insn}
 (expr_list:REG_DEAD (reg/f:HI 79 [ ops_25(D)->blank ])
(expr_list:REG_DEAD (reg:SI 20 r20)
(expr_list:REG_DEAD (reg:SI 16 r16)
(expr_list:REG_DEAD (reg:SI 12 r12)
(expr_list:REG_DEAD (reg:SI 8 r8)
(expr_list:REG_UNUSED (reg:HI 24 r24)
(expr_list:REG_CALL_DECL (nil)
(nil
(expr_list:HI (use (reg:HI 24 r24))
(expr_list:SI (use (reg:SI 20 r20))
(expr_list:SI (use (reg:SI 16 r16))
(expr_list:SI (use (reg:SI 12 r12))
(expr_list:SI (use (reg:SI 8 r8))
(nil)))
during RTL pass: reload
gcc.target/avr/pr60040-2.c:112:1: internal compiler error: in
curr_insn_transform, at lra-constraints.cc:4283

The insn is an indirect call, which should use the Z register (reg:HI 30) for
the target address.
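
For readers without the attachment, the kind of source that leads to such a call insn can be sketched as below. This is a hypothetical reduction, not pr60040-2.c: the struct layout, blank_impl and call_blank are made up for illustration. The argument list (one int, four long) is chosen to mirror the one HI and four SI argument registers in the dump.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reduction: a function pointer stored in a struct. */
struct ops
{
  int (*blank) (int, long, long, long, long);
};

static int blank_impl (int a, long b, long c, long d, long e)
{
  return (int) (a + b + c + d + e);
}

/* The call target is loaded from memory, so the compiler has to emit an
   indirect call; on AVR the target address is expected in Z. */
static int call_blank (const struct ops *ops)
{
  assert (ops != NULL && ops->blank != NULL);
  return ops->blank (1, 2L, 3L, 4L, 5L);
}
```

With struct ops o = { blank_impl }, call_blank (&o) returns 15 on any host; whether this reduction triggers the same reload failure with -mlra is untested.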

Target: avr
Configured with: ../../source/gcc-master/configure --target=avr --disable-nls
--with-dwarf2 --with-gnu-as --with-gnu-ld --disable-shared
--with-long-double=64 --enable-languages=c,c++

[Bug rtl-optimization/116325] [lra] error: unable to generate reloads for:

2024-08-10 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116325

Georg-Johann Lay  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Target||avr
 Status|UNCONFIRMED |NEW
   Keywords||ice-on-valid-code, ra
 Blocks||56183, 113932, 113934
   Last reconfirmed||2024-08-10


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56183
[Bug 56183] [meta-bug][avr] Problems with register allocation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113932
[Bug 113932] [meta-bug] Targets which should be ported to LRA
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113934
[Bug 113934] Switch avr to LRA

[Bug rtl-optimization/116326] New: [lra] internal compiler error: in get_reload_reg, at lra-constraints.cc:755

2024-08-10 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116326

Bug ID: 116326
   Summary: [lra] internal compiler error: in get_reload_reg, at
lra-constraints.cc:755
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: gjl at gcc dot gnu.org
  Target Milestone: ---

Created attachment 58898
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58898&action=edit
GNU-C99 test case

$ avr-gcc addr-space-1-0-i.c -S -mlraduring RTL pass: reload
addr-space-1-0-i.c: In function 'main':
addr-space-1-0-i.c:85:1: internal compiler error: in get_reload_reg, at
lra-constraints.cc:755
   85 | }
  | ^

Target: avr
Configured with: ../../source/gcc-master/configure --target=avr --disable-nls
--with-dwarf2 --with-gnu-as --with-gnu-ld --disable-shared
--with-long-double=64 --enable-languages=c,c++

[Bug rtl-optimization/116326] [lra] internal compiler error: in get_reload_reg, at lra-constraints.cc:755

2024-08-10 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116326

Georg-Johann Lay  changed:

   What|Removed |Added

   Last reconfirmed||2024-08-10
   Keywords||ice-on-valid-code, ra
 Blocks||56183, 113932, 113934
 Target||avr
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1

--- Comment #1 from Georg-Johann Lay  ---
The opening should read:

$ avr-gcc addr-space-1-0-i.c -S -mlra
during RTL pass: reload
addr-space-1-0-i.c: In function 'main':
...


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56183
[Bug 56183] [meta-bug][avr] Problems with register allocation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113932
[Bug 113932] [meta-bug] Targets which should be ported to LRA
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113934
[Bug 113934] Switch avr to LRA

[Bug target/116236] [LRA] [M68K] ICE insn does not satisfy its constraints

2024-08-10 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116236

Georg-Johann Lay  changed:

   What|Removed |Added

 CC||gjl at gcc dot gnu.org

--- Comment #7 from Georg-Johann Lay  ---
(In reply to Richard Biener from comment #2)
> Docs say
> 
> Legitimate addresses are defined in two variants: a strict variant and a
> non-strict one.  The @var{strict} parameter chooses which variant is
> desired by the caller.
> 
> The strict variant is used in the reload pass.  It must be defined so
> that any pseudo-register that has not been allocated a hard register is
> considered a memory reference.

I don't quite understand this sentence.

Does that mean that legitimate_address_p has to accept MEM as
(part of) a valid address, even when only a hard reg is
allowed as address?

Moreover legitimate_address_p seems outdated / incomplete and
TARGET_ADDR_SPACE_LEGITIMATE_ADDRESS_P the right hook to use.
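
One reading of the quoted text, as a toy model in plain C: where a register is required, the non-strict variant accepts any pseudo, and the strict variant rejects a pseudo that has no hard register. Everything here is an assumption for illustration (FIRST_PSEUDO, BASE_REG, base_reg_legitimate_p). The model also ignores pseudos that already got a hard register via reg_renumber[].

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical numbers for this sketch: hard registers are
   0 .. FIRST_PSEUDO-1, everything above is a pseudo register. */
enum { FIRST_PSEUDO = 32, BASE_REG = 30 };

/* Non-strict: a pseudo is accepted where a base register is required.
   Strict: a pseudo without a hard register is rejected.
   Hard registers are checked the same way in both variants. */
static bool base_reg_legitimate_p (int regno, bool strict)
{
  assert (regno >= 0);
  if (regno >= FIRST_PSEUDO)
    return !strict;
  return regno == BASE_REG;
}
```

In this model pseudo 60 is a legitimate base with strict=false and not with strict=true, while hard register 30 is accepted in both cases.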

[Bug other/116322] regenerate-opt-urls.py usage

2024-08-10 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116322

--- Comment #2 from Georg-Johann Lay  ---
And it might be easier to use if we had a $builddir/gcc/regenerate-opt-urls.py
generated by configure:

1) $builddir/gcc/regenerate-opt-urls.py would know where $srcdir is.

2) $builddir/gcc/regenerate-opt-urls.py would know what HTML/$version to use
and could issue an error to run "make html" when it does not exist.

3) Shebang like #!/usr/bin/env python3 may not work on some build machines even
when they have Python3 installed.  configure can find a required Python version
or higher:

AM_PATH_PYTHON([], [AC_MSG_NOTICE([using $PYTHON to run
regenerate-opt-urls.py])])

AC_CONFIG_FILES([regenerate-opt-urls.py], [chmod +x regenerate-opt-urls.py])

Though GCC is using some older version of autotools, and I don't know how well
and reliably AM_PATH_PYTHON works there.

[Bug rtl-optimization/116321] [lra][avr] internal compiler error: in avr_out_lpm_no_lpmx, at config/avr/avr.cc:4572

2024-08-10 Thread gjl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116321

--- Comment #1 from Georg-Johann Lay  ---
What I do not understand: when I also set -mlog=legitimate_address_p, I
only get logs that have strict=0 and not a single one with strict=1, like:

avr_addr_space_legitimate_address_p[fun64:split5(357)]: ret=true, mode=QI
strict=0 reload_completed=1 reload_in_progress=0 (reg_renumber):
(reg/f:HI 28 r28 [60])

This is for pass .split5 that runs way after reload, and strict=0 doesn't make
much sense to me.
