Re: [avr-gcc-list] [bug] cbi optimization error for 8-bit AVRs

2014-11-09 Thread Joern Rennecke
On 8 November 2014 00:32, Szikra István  wrote:

> Unfortunately GCC support for AVR (among other things) is not always
> flawless. And it changes from versions to version (and not always for the
> better).

regression tests can help to catch instances where the compiler looses
functionality.

I'm trying to make gcc 5.0 do the right thing.  As a matter of fact, I
get a preprocessor failure and an ICE (Internal Compiler Error) there.

> Since I’m a long time AVR developer I have a lot of compiler versions
> installed (WinAVR 2006-2010, and Atmel Studio 6.2 with GCC 4.8.1), but I
> could test my code with only one (the latest).
>
> I run into some trouble with clearing port bits not translating from C into
> cbi in assembler. It is caused by my bits template, but I do not know why.
> It seems to me like a bug in GCC. Maybe someone here can shed some light on
> the reason, or suggest a fix.
>
> Here is the code:

May I include that as a test case in the gcc.target/avr testsuite?

___
AVR-GCC-list mailing list
AVR-GCC-list@nongnu.org
https://lists.nongnu.org/mailman/listinfo/avr-gcc-list


Re: [avr-gcc-list] [bug] cbi optimization error for 8-bit AVRs

2014-11-09 Thread Joern Rennecke
On 8 November 2014 00:32, Szikra István  wrote:
> Hi everyone!
>
> My problem in sort: I’m getting
> in  r24, 0x18
> ldi r25, 0x00
> andir24, 0xEF
> out 0x18, r24
> instead of
> cbi 0x18, 4
> .
>
> I’m trying to write efficient modern C/C++ code for multiple platforms
> including AVR 8 bit controllers.
>
> Unfortunately GCC support for AVR (among other things) is not always
> flawless. And it changes from versions to version (and not always for the
> better).
> Since I’m a long time AVR developer I have a lot of compiler versions
> installed (WinAVR 2006-2010, and Atmel Studio 6.2 with GCC 4.8.1), but I
> could test my code with only one (the latest).
>
> I run into some trouble with clearing port bits not translating from C into
> cbi in assembler. It is caused by my bits template, but I do not know why.
> It seems to me like a bug in GCC. Maybe someone here can shed some light on
> the reason, or suggest a fix.

The transformation would seem prima facia best suited for the combine pass.
When we see the store, we know it's only 8 bit wide, thus we can perform the
arithmetic in 8 bit.  However, the store is to a volatile (actually,
I/O) address.
Now, gcc is not very good at optimizing code involving volatile (the
target-independent
code would likely already have simplified the arithmetic if these
weren't in the way -
buy you'd really need a completely different test case without
volatile to keep the
computation relevant).
These memory / I/O accesses require extra ordering constraints, so
large parts of the optimizers just punt when they encounter a volatile
reference.
There is some scope to tweak this - I've attached at proof of concept
patch to optimize
your code (based on gcc 5.0).  However, this opens the possibility
that it'll break something with respect to volatile ordering now - or
even later down the line.

ISTR we have some more detailed dependency checks at least in some places. in
fact, if there weren't, we should already see the existing cbi pattern
misbehaving.
consider:
typedef unsigned char uint8_t;

main ()
{
int i = (*(volatile uint8_t *)((0x18) + 0x20)) ;
(*(volatile uint8_t *)((0x17) + 0x20)) = 0x0f;
(*(volatile uint8_t *)((0x18) + 0x20))  = i & ~4;
}
if the complier was oblivious to the intervening write it could
generate a cbi here -
but it doesn't - well, at least not at -O2 ...

Another possible approach would be to use a peephole2 pattern or a
target-specific
optimization pass to do the 16->8 bit arithmetic transformation.
However, if you work after combine (as is definitely the case with peephole2),
you need to
diff --git a/gcc/config/avr/avr.md b/gcc/config/avr/avr.md
index cc85b79..6e7e353 100644
--- a/gcc/config/avr/avr.md
+++ b/gcc/config/avr/avr.md
@@ -679,8 +679,8 @@ (define_expand "mov"
 ;; "movqi_insn"
 ;; "movqq_insn" "movuqq_insn"
 (define_insn "mov_insn"
-  [(set (match_operand:ALL1 0 "nonimmediate_operand" "=r,d,Qm   ,r 
,q,r,*r")
-(match_operand:ALL1 1 "nox_general_operand"   "r Y00,n Ynn,r 
Y00,Qm,r,q,i"))]
+  [(set (match_operand:ALL1 0 "movdst_operand" "=r,d,Qm   ,r 
,q,r,*r")
+(match_operand:ALL1 1 "nox_general_operand" "r Y00,n Ynn,r 
Y00,Qm,r,q,i"))]
   "(register_operand (operands[0], mode)
|| reg_or_0_operand (operands[1], mode)) &&
/* skip if operands are out of lds/sts memory access range(0x40..0xbf)
@@ -3086,6 +3086,20 @@ (define_peephole2 ; andi
 operands[1] = GEN_INT (INTVAL (operands[1]) & INTVAL (operands[2]));
   })
 
+; standard promotion can get us a 16 bit int where only an 8 bit value
+; is required.  combine-split extend-binop-store to binop-store.
+
+(define_split
+  [(set (match_operand:QI 0 "any_qi_mem_operand")
+   (match_operator 3 "avr_binary_operator"
+[(match_operand 1 "nonmemory_operand")
+ (match_operand 2 "nonmemory_operand")]))
+   (clobber (match_operand:QI 4 "register_operand"))]
+  ""
+  [(set (match_dup 4) (match_dup 3))
+   (set (match_dup 0) (match_dup 4))]
+  "");
+
 ;;|
 ;; ior
 
diff --git a/gcc/config/avr/predicates.md b/gcc/config/avr/predicates.md
index 4b456a5..ea8e095 100644
--- a/gcc/config/avr/predicates.md
+++ b/gcc/config/avr/predicates.md
@@ -69,6 +69,24 @@ (define_predicate "nop_general_operand"
   (and (match_operand 0 "general_operand")
(match_test "!avr_mem_flash_p (op)")))
 
+; allow a QImode memory operand, including volatile / I/O.
+(define_predicate "any_qi_mem_operand"
+  (and (match_code "mem")
+   (match_test "address_operand (XEXP (op, 0), Pmode)")
+   (match_test "mode == QImode")))
+
+;; We also allow volatile memory here, so that the combiner can work with
+;; stores for I/O accesses as combiner-splitter results.
+;; As this is for a destination location, that should generally be safe
+;; for combine, as we should he handling the last insn then, which stays
+;; in place.
+;; ??? Could th