On 7/22/24 11:07 AM, Roger Sayle wrote:
> Whilst 33-bit pseudo-rotations almost certainly don't occur frequently
> in real code, this provides a very useful building block. Conventionally,
> rotations require 2 cycles per bit; one cycle to shift the top-bit
> out of the source, and one cycle to shift this bit into the destination.
> The above pseudo rotation has twice the throughput, but leaves the
> upper bits in an unusual configuration. It turns out that masking the
> top-bits out with AND after the pseudo-rotation provides a fast form
> of lshr with high shift counts, and even ashr can be improved using
> either a sign-extension instruction or sign-extension sequence after
> a lshr.

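To make the rotate-then-mask idea concrete, here is a rough C model. It is only a sketch: the actual ARC instruction sequence isn't shown in this message, so rotl33_step() is my assumption of what one step of the 33-bit rotate through carry does, and all of the function names are made up for illustration. The XOR/subtract used for the ashr case is just one possible sign-extension sequence, not necessarily what the patch emits.

```c
#include <stdint.h>

/* Hypothetical model of one step of a 33-bit rotate left through carry:
   the register shifts left by one, the old carry enters at bit 0, and
   the old bit 31 becomes the new carry.  */
static uint32_t rotl33_step(uint32_t x, unsigned *carry)
{
    unsigned out = x >> 31;
    x = (x << 1) | (*carry & 1u);
    *carry = out;
    return x;
}

/* Logical shift right by n (1 <= n <= 31): rotate left through carry
   33 - n times starting with carry clear, then mask off everything
   above the low 32 - n bits.  Cheap when n is large.  */
static uint32_t lshr_via_rotl33(uint32_t x, unsigned n)
{
    unsigned carry = 0;
    for (unsigned i = 0; i < 33 - n; i++)
        x = rotl33_step(x, &carry);
    return x & ((UINT32_C(1) << (32 - n)) - 1);
}

/* Arithmetic shift right built on the lshr above, followed by a generic
   XOR/subtract sign extension from bit 31 - n (assumed sequence).  */
static int32_t ashr_via_lshr(uint32_t x, unsigned n)
{
    uint32_t r = lshr_via_rotl33(x, n);
    uint32_t sign = UINT32_C(1) << (31 - n);
    return (int32_t)((r ^ sign) - sign);
}
```

For a shift by 30 this is three rotate steps plus one AND, versus 30 single-bit shifts done the naive way.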
The H8 uses rotate through the carry in a variety of ways as well. For
example, a SImode shift by 15:
    shlr.w     // Move bit into the carry
    mov.w      // Move low half word into high half word
    xor.w      // Clear low half word
    rotxr.l    // Rotate right to restore carry bit

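For anyone who wants to convince themselves the four-instruction sequence is right, a small C model follows. The operands are elided above, so this sketch assumes the shift is a left shift, that shlr.w is applied to the high half word (dropping bit 16 of the source into the carry), and that rotxr.l rotates the full 32-bit register right by one through the carry. The function name is hypothetical.

```c
#include <stdint.h>

/* Model of the H8 sequence as a 32-bit left shift by 15 (assumed
   direction and operands, see the note above).  */
static uint32_t shl15_via_carry(uint32_t x)
{
    uint16_t hi = (uint16_t)(x >> 16);
    uint16_t lo = (uint16_t)x;

    unsigned carry = hi & 1u;   /* shlr.w on the high half: bit 16 -> carry */
    hi = lo;                    /* mov.w: low half word into high half word */
    lo = 0;                     /* xor.w: clear low half word */

    uint32_t r = ((uint32_t)hi << 16) | lo;

    /* rotxr.l: rotate right through carry, so the saved bit lands in
       bit 31 and everything else moves down one place.  */
    return ((uint32_t)carry << 31) | (r >> 1);
}
```

In effect the sequence computes a shift left by 16 followed by a shift right by 1, with the carry preserving the one bit that the shift by 16 would otherwise lose.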
> Unfortunately without real hardware or a simulator to test on, I can't
> be 100% confident in this code, but on paper, shifts should now be much
> faster. This patch has been tested on a cross-compiler to arc-linux
> hosted on x86_64 with no new failures in the compilation tests.

Now that Claudiu has left Synopsys, is anyone able to test these changes?
I see that Synopsys has QEMU for the arc. In theory that should be
sufficient if it has user mode emulation support. But I don't have the
time to really dig into that. Not sure if you want to take that on or not.

I can throw this patch into my tester, but I'm not sure it's going to
give significantly more coverage than you've done. I have a dummy
simulator that always returns 0. So it'll pretend to run all the
execution tests, but it really just verifies they compile, assemble and
link.

Claudiu is still listed as the maintainer for the port, so let's give
him time to chime in. He may also have suggestions on how to get things
set up for doing real execution tests going forward.

Jeff