On 10/03/14 22:37, DJ Delorie wrote:
> The use of "volatile" disables many of GCC's optimizations. I
> consider this a bug in GCC, but at the moment it needs to be "fixed"
> in the backends on a case-by-case basis.
Hi,
I've looked into the differences between the steps taken when a
variable is declared volatile and when it isn't, but I'm getting a bit
stuck.
Taking the following code as an example:
----------------------------------------------------------
typedef struct
{
    unsigned char no0 :1;
    unsigned char no1 :1;
    unsigned char no2 :1;
    unsigned char no3 :1;
    unsigned char no4 :1;
    unsigned char no5 :1;
    unsigned char no6 :1;
    unsigned char no7 :1;
} __BITS8;

union un_if0h
{
    unsigned char if0h;
    __BITS8 BIT;
};

#define IF0H     (*(volatile union un_if0h *)0xFFFE1).if0h
#define IF0H_bit (*(volatile union un_if0h *)0xFFFE1).BIT

void test(void)
{
    IF0H_bit.no5 = 1;
}
----------------------------------------------------------
I compiled it with -Os and -da, once as-is and once with IF0H_bit not
declared volatile.
The generated RTL is basically the same until the 'combine' stage:
--------------------non-volatile start--------------------
Trying 5 -> 7:
Failed to match this instruction:
(parallel [
(set (reg:QI 45 [ MEM[(union un_if0h *)65505B].BIT.no5 ])
(mem/j:QI (const_int -31 [0xffffffffffffffe1]) [0
MEM[(union un_if0h *)65505B].BIT.no5+0 S1 A8]))
(set (reg/f:HI 43)
(const_int -31 [0xffffffffffffffe1]))
])
Failed to match this instruction:
(parallel [
(set (reg:QI 45 [ MEM[(union un_if0h *)65505B].BIT.no5 ])
(mem/j:QI (const_int -31 [0xffffffffffffffe1]) [0
MEM[(union un_if0h *)65505B].BIT.no5+0 S1 A8]))
(set (reg/f:HI 43)
(const_int -31 [0xffffffffffffffe1]))
])
Trying 7 -> 8:
Successfully matched this instruction:
(set (reg:QI 46)
(ior:QI (mem/j:QI (reg/f:HI 43) [0 MEM[(union un_if0h
*)65505B].BIT.no5+0 S1 A8])
(const_int 32 [0x20])))
deferring deletion of insn with uid = 7.
modifying insn i3 8: r46:QI=[r43:HI]|0x20
deferring rescan insn with uid = 8.
---------------------non-volatile end---------------------
----------------------volatile start----------------------
Trying 5 -> 7:
Failed to match this instruction:
(parallel [
(set (reg:QI 45 [ MEM[(volatile union un_if0h *)65505B].BIT.no5 ])
(mem/v/j:QI (const_int -31 [0xffffffffffffffe1]) [0
MEM[(volatile union un_if0h *)65505B].BIT.no5+0 S1 A8]))
(set (reg/f:HI 43)
(const_int -31 [0xffffffffffffffe1]))
])
Failed to match this instruction:
(parallel [
(set (reg:QI 45 [ MEM[(volatile union un_if0h *)65505B].BIT.no5 ])
(mem/v/j:QI (const_int -31 [0xffffffffffffffe1]) [0
MEM[(volatile union un_if0h *)65505B].BIT.no5+0 S1 A8]))
(set (reg/f:HI 43)
(const_int -31 [0xffffffffffffffe1]))
])
Trying 7 -> 8:
Failed to match this instruction:
(set (reg:QI 46)
(ior:QI (mem/v/j:QI (reg/f:HI 43) [0 MEM[(volatile union un_if0h
*)65505B].BIT.no5+0 S1 A8])
(const_int 32 [0x20])))
-----------------------volatile end-----------------------
Bearing in mind that I'm new to all this and may be missing something
blindingly obvious, what would cause 7->8 to fail when the variable is
declared volatile but succeed when it isn't? Does something need adding
to rl78-virt.md to allow it to match?
It doesn't seem like this is due to missing an optimization step that
combines insns (hmm, "combine?") but rather to not recognizing that a
single, existing insn is possible and so splitting the operation up into
multiple steps.
The 'Failed to match' string comes after calling 'recog' but I'm either
too blind or too stupid to find the implementation.
The result of this (as I mentioned in my first post) is that this is
produced:
  28           _test:
  29 0000 C9 F2 E1 FF   movw  r10, #-31
  30 0004 AD F2         movw  ax, r10
  31 0006 16            movw  hl, ax
  32 0007 8B            mov   a, [hl]
  33 0008 6C 20         or    a, #32
  34 000a 9B            mov   [hl], a
  35 000b D7            ret
instead of this:
  28           _test:
  29 0000 71 5A E1      set1  0xfffe1.5
  30 0003 D7            ret
Surely the optimized code is also valid for a volatile variable? In
fact, I would have thought it *more* valid as it performs the entire
operation in a single instruction instead of splitting it into a very
definite read-modify-write sequence?
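To make the read-modify-write point concrete, here is a minimal
host-side sketch (not RL78 code). It assumes an ordinary variable,
fake_if0h, standing in for the register at 0xFFFE1, and the helper names
are made up for illustration. It only shows that the explicit three-step
sequence and the compound OR leave the same value behind; it says
nothing about which instructions GCC actually selects.

```c
#include <assert.h>
#include <stdint.h>

/* Mask for bit 5, matching the (const_int 32 [0x20]) in the RTL dumps. */
#define BIT5_MASK 0x20u

/* Hypothetical stand-in for the hardware register at 0xFFFE1, so the
   sketch can run on a host machine instead of an RL78. */
volatile uint8_t fake_if0h;

/* The explicit three-step sequence that the volatile build produces:
   one read, one OR, one write. */
void set_bit5_rmw(volatile uint8_t *reg)
{
    uint8_t tmp = *reg;          /* corresponds to: mov a, [hl] */
    tmp |= BIT5_MASK;            /* corresponds to: or  a, #32  */
    *reg = tmp;                  /* corresponds to: mov [hl], a */
}

/* The same update written as a single compound assignment. */
void set_bit5_compound(volatile uint8_t *reg)
{
    *reg |= BIT5_MASK;
}

/* Both forms should leave the same value in the register and must not
   disturb the other bits. */
void check_equivalence(void)
{
    uint8_t a, b;

    fake_if0h = 0x81;
    set_bit5_rmw(&fake_if0h);
    a = fake_if0h;

    fake_if0h = 0x81;
    set_bit5_compound(&fake_if0h);
    b = fake_if0h;

    assert(a == 0xA1);           /* 0x81 | 0x20 */
    assert(a == b);
}
```

Calling check_equivalence() from a test harness should pass silently on
any hosted compiler; the interesting difference on the real part is
purely in how many instructions, and how many bus accesses, the update
takes.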
Since operations on memory-mapped hardware registers are your
bread-and-butter on a microcontroller, 'curing' this would bring
significant gains.
Am I missing something (non-)obvious?
Regards,
Richard