On 10/03/14 22:37, DJ Delorie wrote:

The use of "volatile" disables many of GCC's optimizations.  I
consider this a bug in GCC, but at the moment it needs to be "fixed"
in the backends on a case-by-case basis.


Hi,

I've looked into the differences between the steps taken when a variable is declared volatile and when it isn't, but I'm getting a bit stuck.

Taking the following code as an example:
----------------------------------------------------------
typedef struct
{
   unsigned char no0 :1;
   unsigned char no1 :1;
   unsigned char no2 :1;
   unsigned char no3 :1;
   unsigned char no4 :1;
   unsigned char no5 :1;
   unsigned char no6 :1;
   unsigned char no7 :1;
} __BITS8;

union un_if0h
{
   unsigned char if0h;
   __BITS8 BIT;
};

#define IF0H     (*(volatile union un_if0h *)0xFFFE1).if0h
#define IF0H_bit (*(volatile union un_if0h *)0xFFFE1).BIT

void test(void)
{
   IF0H_bit.no5 = 1;
}

----------------------------------------------------------

and compiling it with -Os and -da, once as-is and once with IF0H_bit not declared volatile.
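For anyone wanting to reproduce the two builds from one source file, here is a minimal sketch. The macro names NO_VOLATILE, RL78_TARGET, REG_QUAL and IF0H_ADDR, and the host stand-in variable fake_if0h, are my own assumptions for illustration and are not part of the original test case. Compiling once with -DNO_VOLATILE and once without should give the two variants compared here.

```c
#include <assert.h>

/* Hypothetical switch: define NO_VOLATILE to drop the qualifier. */
#ifdef NO_VOLATILE
#define REG_QUAL
#else
#define REG_QUAL volatile
#endif

typedef struct
{
   unsigned char no0 :1;
   unsigned char no1 :1;
   unsigned char no2 :1;
   unsigned char no3 :1;
   unsigned char no4 :1;
   unsigned char no5 :1;
   unsigned char no6 :1;
   unsigned char no7 :1;
} __BITS8;

union un_if0h
{
   unsigned char if0h;
   __BITS8 BIT;
};

/* The register view is expected to occupy exactly one byte. */
static_assert(sizeof(union un_if0h) == 1, "un_if0h must be one byte");

#ifdef RL78_TARGET
/* Hypothetical switch: on the real part, use the fixed SFR address. */
#define IF0H_ADDR 0xFFFE1
#else
/* Off-target stand-in so the file also builds and runs on a host. */
static union un_if0h fake_if0h;
#define IF0H_ADDR (&fake_if0h)
#endif

#define IF0H_bit (*(REG_QUAL union un_if0h *)IF0H_ADDR).BIT

void test(void)
{
   IF0H_bit.no5 = 1;
}
```

Only the qualifier in the cast changes between the two builds, so any difference in the dumps should come from the volatile flag on the memory reference.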

The generated RTL is basically the same until the 'combine' stage:

--------------------non-volatile start--------------------
Trying 5 -> 7:
Failed to match this instruction:
(parallel [
        (set (reg:QI 45 [ MEM[(union un_if0h *)65505B].BIT.no5 ])
            (mem/j:QI (const_int -31 [0xffffffffffffffe1]) [0 MEM[(union un_if0h *)65505B].BIT.no5+0 S1 A8]))
        (set (reg/f:HI 43)
            (const_int -31 [0xffffffffffffffe1]))
    ])
Failed to match this instruction:
(parallel [
        (set (reg:QI 45 [ MEM[(union un_if0h *)65505B].BIT.no5 ])
            (mem/j:QI (const_int -31 [0xffffffffffffffe1]) [0 MEM[(union un_if0h *)65505B].BIT.no5+0 S1 A8]))
        (set (reg/f:HI 43)
            (const_int -31 [0xffffffffffffffe1]))
    ])

Trying 7 -> 8:
Successfully matched this instruction:
(set (reg:QI 46)
    (ior:QI (mem/j:QI (reg/f:HI 43) [0 MEM[(union un_if0h *)65505B].BIT.no5+0 S1 A8])
        (const_int 32 [0x20])))
deferring deletion of insn with uid = 7.
modifying insn i3     8: r46:QI=[r43:HI]|0x20
deferring rescan insn with uid = 8.
---------------------non-volatile end---------------------

----------------------volatile start----------------------
Trying 5 -> 7:
Failed to match this instruction:
(parallel [
        (set (reg:QI 45 [ MEM[(volatile union un_if0h *)65505B].BIT.no5 ])
            (mem/v/j:QI (const_int -31 [0xffffffffffffffe1]) [0 MEM[(volatile union un_if0h *)65505B].BIT.no5+0 S1 A8]))
        (set (reg/f:HI 43)
            (const_int -31 [0xffffffffffffffe1]))
    ])
Failed to match this instruction:
(parallel [
        (set (reg:QI 45 [ MEM[(volatile union un_if0h *)65505B].BIT.no5 ])
            (mem/v/j:QI (const_int -31 [0xffffffffffffffe1]) [0 MEM[(volatile union un_if0h *)65505B].BIT.no5+0 S1 A8]))
        (set (reg/f:HI 43)
            (const_int -31 [0xffffffffffffffe1]))
    ])

Trying 7 -> 8:
Failed to match this instruction:
(set (reg:QI 46)
    (ior:QI (mem/v/j:QI (reg/f:HI 43) [0 MEM[(volatile union un_if0h *)65505B].BIT.no5+0 S1 A8])
        (const_int 32 [0x20])))
-----------------------volatile end-----------------------

Bearing in mind that I'm new to all this and may be missing something blindingly obvious, what would cause 7 -> 8 to fail when the variable is declared volatile but succeed when it is not? Does something need adding to rl78-virt.md to allow it to match?

This doesn't look like a missing optimization step that combines insns (hmm, "combine"?). Rather, GCC seems not to recognize that a single, existing insn is possible, and so it splits the operation up into multiple steps.

The 'Failed to match' string comes after calling 'recog' but I'm either too blind or too stupid to find the implementation.

The result (as I mentioned in my first post) is that this code is produced:

  28                                    _test:
  29 0000 C9 F2 E1 FF                           movw    r10, #-31
  30 0004 AD F2                                 movw    ax, r10
  31 0006 16                                    movw    hl, ax
  32 0007 8B                                    mov     a, [hl]
  33 0008 6C 20                                 or      a, #32
  34 000a 9B                                    mov     [hl], a
  35 000b D7                                    ret

instead of this:
  28                                    _test:
  29 0000 71 5A E1                              set1    0xfffe1.5
  30 0003 D7                                    ret

Surely the optimized code is also valid for a volatile variable? In fact, I would have thought it *more* valid as it performs the entire operation in a single instruction instead of splitting it into a very definite read-modify-write sequence?
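To make the comparison concrete, here is a rough C-level sketch of the two forms. The function names set_no5_rmw and set_no5_bitfield are made up for illustration. It assumes bit-fields are allocated starting from the least significant bit, which is implementation-defined but matches the 0x20 mask in the RTL above. Both functions leave the same value in the byte; the only difference is whether the read, the OR and the write appear as separate steps.

```c
#include <assert.h>

typedef struct
{
   unsigned char no0 :1;
   unsigned char no1 :1;
   unsigned char no2 :1;
   unsigned char no3 :1;
   unsigned char no4 :1;
   unsigned char no5 :1;
   unsigned char no6 :1;
   unsigned char no7 :1;
} __BITS8;

union un_if0h
{
   unsigned char if0h;
   __BITS8 BIT;
};

static_assert(sizeof(union un_if0h) == 1, "un_if0h must be one byte");

/* Explicit read-modify-write, mirroring the longer listing. */
unsigned char set_no5_rmw(volatile unsigned char *reg)
{
   unsigned char a = *reg;   /* mov a, [hl]  */
   a |= 0x20;                /* or  a, #32   */
   *reg = a;                 /* mov [hl], a  */
   return a;
}

/* The source-level form used in test(): a single bit-field store. */
unsigned char set_no5_bitfield(volatile union un_if0h *reg)
{
   reg->BIT.no5 = 1;
   return reg->if0h;
}
```

Starting from the same byte value, both functions return the same result on a compiler with that bit-field layout, which is why I would expect set1 to be an acceptable encoding of the volatile store as well.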

Since operations on memory-mapped hardware registers are your bread-and-butter on a microcontroller, 'curing' this would bring significant gains.

Am I missing something (non-)obvious?

Regards,

Richard
