On Wed, 06 Sep 2023 09:47:05 PDT (-0700), jeffreya...@gmail.com wrote:


On 9/6/23 10:22, Palmer Dabbelt wrote:
On Wed, 06 Sep 2023 09:07:33 PDT (-0700), christoph.muell...@vrull.eu
wrote:
From: Christoph Müllner <christoph.muell...@vrull.eu>

This patch implements the expansion of the strlen builtin for RV32/RV64
for xlen-aligned strings if Zbb or XTheadBb instructions are
available.
The inserted sequences are:

rv32gc_zbb (RV64 is similar):
      add     a3,a0,4
      li      a4,-1
.L1:  lw      a5,0(a0)
      add     a0,a0,4
      orc.b   a5,a5
      beq     a5,a4,.L1
      not     a5,a5
      ctz     a5,a5
      srl     a5,a5,0x3
      add     a0,a0,a5
      sub     a0,a0,a3

rv64gc_xtheadbb (RV32 is similar):
      add       a4,a0,8
.L2:  ld        a5,0(a0)
      add       a0,a0,8
      th.tstnbz a5,a5
      beqz      a5,.L2
      th.rev    a5,a5
      th.ff1    a5,a5
      srl       a5,a5,0x3
      add       a0,a0,a5
      sub       a0,a0,a4

This allows inlining calls to strlen(), with optimized code for
xlen-aligned strings, resulting in the following benefits over
a call to libc:
* no call/ret instructions
* no stack frame allocation
* no register saving/restoring
* no alignment test
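
For readers who want to follow the Zbb sequence above, here is a rough C model of it. It is only an illustrative sketch, not the code the patch emits: the names orc_b, ctz32 and strlen_aligned_sketch are hypothetical, orc.b and ctz are emulated in plain C, and the word is assembled byte by byte in little-endian order so the model behaves the same on any host. Like the inlined sequence, it assumes the string starts on a 4-byte boundary and reads whole words.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Software model of Zbb orc.b: each nonzero byte becomes 0xff, each zero byte stays 0x00. */
static uint32_t orc_b(uint32_t x)
{
    uint32_t r = 0;
    for (int i = 0; i < 4; i++)
        if ((x >> (8 * i)) & 0xff)
            r |= (uint32_t)0xff << (8 * i);
    return r;
}

/* Software model of ctz (count trailing zeros); the caller guarantees x != 0. */
static unsigned ctz32(uint32_t x)
{
    unsigned n = 0;
    while (!(x & 1)) {
        x >>= 1;
        n++;
    }
    return n;
}

/* Hypothetical helper mirroring the rv32gc_zbb loop; s must be 4-byte aligned. */
size_t strlen_aligned_sketch(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;
    uint32_t m;

    assert(((uintptr_t)s & 3) == 0);

    for (;;) {
        /* lw a5,0(a0): build the word in little-endian order, as RISC-V would load it. */
        uint32_t w = (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
                     ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
        p += 4;                 /* add a0,a0,4 */
        m = orc_b(w);           /* orc.b a5,a5 */
        if (m != UINT32_MAX)    /* beq a5,a4,.L1 keeps looping while no byte is NUL */
            break;
    }

    /* not + ctz + srl 3 give the index of the first NUL byte within the last word. */
    return (size_t)(p - (const unsigned char *)s) - 4 + (ctz32(~m) >> 3);
}
```

The final expression corresponds to the trailing add/sub pair: the pointer has already advanced past the word containing the NUL, so one word is subtracted before adding the byte index.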

The inlining mechanism is gated by a new switch ('-minline-strlen')
and by the variable 'optimize_size'.

Maybe this is more of a Jeff question, but this looks to me like
something that should be target-agnostic -- maybe we need some backend
work to actually emit the special instruction, but IIRC this is a
somewhat common flavor of instruction and is in other ISAs as well.  It
looks like there's already a strlen insn, so I guess the core issue is
why we need that unspec?

Sorry if I'm just missing something, though...

The generic strlen expansion in GCC doesn't really expand a strlen loop.
It really just calls into the target code and forces the target to
handle everything.

OK, that explains it.

We could have generic strlen expansion code that kicks in if the target
expander fails.  And we could probably create the necessary opcodes to
express the optimized end-of-string comparison instructions that exist
on various architectures.  I'm not sure it's worth that much effort
given targets are already doing their own strlen expansions.


If everyone does it this way then I don't think we need to worry about it.


jeff
