On 11/13/22 16:05, Christoph Muellner wrote:
From: Christoph Müllner <christoph.muell...@vrull.eu>

This patch implements expansions for the cmpstrsi and the cmpstrnsi
builtins using Zbb instructions (if available).
This allows inlining calls to strcmp() and strncmp().

The expansion basically emits a peeled comparison sequence (i.e. a peeled
comparison loop) which compares XLEN bits per step if possible.

The emitted sequence can be controlled by setting the maximum number
of compared bytes (-mstring-compare-inline-limit).

gcc/ChangeLog:

        * config/riscv/riscv-protos.h (riscv_expand_strn_compare): New
          prototype.
        * config/riscv/riscv-string.cc (GEN_EMIT_HELPER3): New helper
          macros.
        (GEN_EMIT_HELPER2): New helper macros.
        (expand_strncmp_zbb_sequence): New function.
        (riscv_emit_str_compare_zbb): New function.
        (riscv_expand_strn_compare): New function.
        * config/riscv/riscv.md (cmpstrnsi): Invoke expansion functions
          for strn_compare.
        (cmpstrsi): Invoke expansion functions for strn_compare.
        * config/riscv/riscv.opt: Add new parameter
          '-mstring-compare-inline-limit'.

Presumably the hybrid inline + out of line approach is to capture the fact that most strings compare unequal early, then punt out to the library if they don't follow that model?  It looks like we're structured for that case by peeling iterations rather than having a fully inlined approach.  Just want to confirm...
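For readers following along, here is a minimal C sketch of the hybrid shape being asked about. It is not the code the patch emits. The names hybrid_strcmp, finish_bytewise, has_nul_byte and SKETCH_INLINE_LIMIT are hypothetical; the limit stands in for -mstring-compare-inline-limit, and a portable bit trick stands in for the Zbb-based NUL detection. It assumes 8-byte words (RV64) and takes the word-wise path only when both pointers are word aligned.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for -mstring-compare-inline-limit (in bytes). */
#define SKETCH_INLINE_LIMIT 32u

/* Portable test for "this word contains a zero byte".  This is an
   assumption for illustration; the patch uses Zbb instructions instead. */
static int has_nul_byte(uint64_t w)
{
    return ((w - 0x0101010101010101ULL) & ~w & 0x8080808080808080ULL) != 0;
}

/* Resolve the result byte by byte once we know that a difference or a
   terminator lies within the next word. */
static int finish_bytewise(const char *a, const char *b)
{
    for (;; a++, b++) {
        unsigned char ca = (unsigned char)*a, cb = (unsigned char)*b;
        if (ca != cb)
            return (ca > cb) - (ca < cb);
        if (ca == 0)
            return 0;
    }
}

/* Compare a bounded number of words inline, then punt to the library. */
int hybrid_strcmp(const char *a, const char *b)
{
    /* Word-wise path only if both strings are 8-byte aligned. */
    if ((((uintptr_t)a | (uintptr_t)b) & 7u) != 0)
        return strcmp(a, b);

    for (size_t off = 0; off < SKETCH_INLINE_LIMIT; off += 8) {
        uint64_t wa, wb;
        memcpy(&wa, a + off, sizeof wa);   /* one aligned word per step */
        memcpy(&wb, b + off, sizeof wb);
        if (wa != wb || has_nul_byte(wa))
            return finish_bytewise(a + off, b + off);
    }

    /* No early mismatch within the limit: fall back to the library. */
    return strcmp(a + SKETCH_INLINE_LIMIT, b + SKETCH_INLINE_LIMIT);
}
```

Strings that differ early never leave the inline part; only long common prefixes pay for the library call. Note that the sketch returns -1/0/1 on the inline path, which satisfies strcmp()'s sign contract but need not match the value the real expansion produces.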


I was a bit worried about the "readahead" problem that arises when reading more than a byte at a time and a NUL is found in the first string.  If you're not careful, the readahead of the second string could fault.  But it looks like we avoid that by requiring word alignment on both strings.
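The alignment argument can be checked with a small C sketch. It assumes memory protection is enforced at page granularity, with a hypothetical 4096-byte page and an 8-byte word (RV64); load_crosses_page, SKETCH_PAGE_SIZE and SKETCH_WORD_SIZE are illustrative names, not anything from the patch. A word load that starts on a word boundary stays within the page that holds its first byte, so it cannot touch a page the byte-wise comparison would not have touched as well.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size */
#define SKETCH_WORD_SIZE 8u     /* XLEN / 8 on RV64 */

/* Nonzero if a word-sized load starting at ADDR touches two pages. */
static int load_crosses_page(uintptr_t addr)
{
    uintptr_t first_page = addr / SKETCH_PAGE_SIZE;
    uintptr_t last_page = (addr + SKETCH_WORD_SIZE - 1) / SKETCH_PAGE_SIZE;
    return first_page != last_page;
}
```

For example, a load at byte 4088 of a page ends at byte 4095 and stays inside it, while an unaligned load at byte 4092 would spill into the next page, which may be unmapped even though the string's NUL sits before the boundary.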


+
+/* Emit a string comparison sequence using Zbb instructions.
+
+   OPERANDS[0] is the target (result).
+   OPERANDS[1] is the first source.
+   OPERANDS[2] is the second source.
+   If NO_LENGTH is zero, then:
+   OPERANDS[3] is the length.
+   OPERANDS[4] is the alignment in bytes.
+   If NO_LENGTH is nonzero, then:
+   OPERANDS[3] is the alignment in bytes.

Ugh.  I guess it's inevitable unless we want to drop the array and pass each element individually (in which case we'd pass a NULL_RTX in the case we don't have a length argument).


I'd like to give others a chance to chime in.  Everything looks sensible, but I may have missed something, so give the other maintainers a couple of days to comment before committing.


Jeff
