movli.l atomics on SH4A

olegendo at gcc dot gnu.org Mon, 16 Apr 2012 02:15:22 -0700

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52941


--- Comment #4 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-04-16 09:14:57 UTC ---
Created attachment 27164
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27164
WIP patch

The attached patch adds support for movco.l/movli.l insns on SH4A for
-msoft-atomic.  It also adds a new option -mhard-atomic.  For SImode all
hard-atomic patterns should be working.  I've started implementing some of the
QImode and HImode hard-atomic patterns
(atomic_{ior|xor|and|add|sub}_fetch{hi|qi}_hard so far) to see how it would
turn out.
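
For reference, a minimal sketch of user code that would exercise these
patterns.  It uses only the generic GCC __atomic builtins; the wrapper names
(or_fetch_qi etc.) are made up for illustration and are not part of the patch.
Which insn sequence is emitted for them depends on the target and on the
atomic options in effect.

```c
#include <stdint.h>

/* QImode: would go through an atomic_or_fetchqi style pattern.  */
uint8_t or_fetch_qi (uint8_t *p, uint8_t v)
{
  return __atomic_or_fetch (p, v, __ATOMIC_SEQ_CST);
}

/* HImode: would go through an atomic_sub_fetchhi style pattern.  */
uint16_t sub_fetch_hi (uint16_t *p, uint16_t v)
{
  return __atomic_sub_fetch (p, v, __ATOMIC_SEQ_CST);
}

/* SImode: the simple case, one full word per atomic sequence.  */
uint32_t add_fetch_si (uint32_t *p, uint32_t v)
{
  return __atomic_add_fetch (p, v, __ATOMIC_SEQ_CST);
}
```

All three return the value after the operation, truncated to the operand
width.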

I'm currently using a 4 byte lookup table to get the endian dependent bit
positions for the shift insns which are required to extract/insert the
subwords.  The HImode variants could also be done without the LUT, but I didn't
want to introduce a special case for that.  Ideally, the LUT would go into
the constant pool, which would allow it to be shared among multiple atomic
insns and also would eliminate the need to branch around it after the atomic
insn.  However, I don't know how to reliably get the address of a constant
in the constant pool (by using the mova insn).
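
To illustrate the subword idea in C (this is only a sketch, not the code from
the patch): the byte offset within the aligned 32-bit word indexes a 4 byte
table of shift counts, and the operation is then done on the whole word.  The
function and table names are hypothetical, and a compare-and-swap loop built
on the generic __atomic builtins stands in for the movli.l/movco.l retry loop.

```c
#include <stdint.h>

/* Hypothetical 4 byte LUTs: bit position of byte (addr & 3) inside its
   aligned 32-bit word, one table per endianness.  */
static const uint8_t shift_lut_le[4] = { 0, 8, 16, 24 };
static const uint8_t shift_lut_be[4] = { 24, 16, 8, 0 };

/* Run-time endianness probe, used here only to keep the sketch portable.  */
static int is_little_endian (void)
{
  const uint32_t one = 1;
  return *(const uint8_t *) &one == 1;
}

/* Atomic add-and-fetch on a byte, done on the containing 32-bit word.
   The CAS loop is a stand-in for a movli.l ... movco.l sequence.  */
uint8_t atomic_add_fetch_u8_via_word (uint8_t *p, uint8_t val)
{
  uintptr_t addr = (uintptr_t) p;
  uint32_t *word = (uint32_t *) (addr & ~(uintptr_t) 3);
  unsigned shift = (is_little_endian () ? shift_lut_le : shift_lut_be)[addr & 3];
  uint32_t mask = (uint32_t) 0xFF << shift;
  uint32_t old = __atomic_load_n (word, __ATOMIC_RELAXED);
  uint32_t new_word;
  uint8_t result;

  do
    {
      /* Extract the subword, apply the operation, insert it again.  */
      result = (uint8_t) ((old >> shift) + val);
      new_word = (old & ~mask) | ((uint32_t) result << shift);
    }
  while (!__atomic_compare_exchange_n (word, &old, new_word, 1,
                                       __ATOMIC_SEQ_CST, __ATOMIC_RELAXED));
  return result;
}
```

For HImode the table lookup could be replaced by computing the shift directly
from bit 1 of the address, which is the special case mentioned above.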

The atomic_{ior|xor|and|add|sub}_fetch{hi|qi}_hard patterns in the patch seem
to be working OK, but the atomic sequence code turns out rather big.  The
address calculation code could be moved out of the atomic insns so that it
could be CSE'd, but I guess that it would most likely increase register
pressure.  The extu.{b|w} insn in the sequences can definitely be done before
that in a separate insn, so that it can be eliminated by other passes.  Still,
because of the code size for HImode / QImode hard atomic sequences I think it
would be better to also have a copy of them in libgcc and emit function calls
when compiling with -Os.

Feedback appreciated :)

[Bug target/52941] SH Target: Add support for movco.l / movli.l atomics on SH4A
