On 11/14/22 14:49, Christoph Müllner wrote:


We can take this further, but then the following questions pop up:
* how much data processing per loop iteration?

I have no idea because I don't have any real data.  Last time I gathered any data on this issue was circa 1988 :-)


* what about unaligned strings?

I'd punt.  I don't think we can depend on having high-performance unaligned access.  You could do a dynamic check of alignment, but you'd really need to know that the strings are aligned often enough that the cost of the dynamic check is recovered.
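To make the trade-off concrete, here is a minimal C sketch of what a dynamic alignment check could look like.  It is not taken from the patch: the names strcmp_sketch and has_zero_byte are hypothetical, and it assumes that an aligned word load which runs past the terminating NUL stays within readable memory (true at the hardware level because an aligned word never crosses a page, but not something ISO C guarantees).  The test on the two pointers is the cost every call pays, and it only pays off when both strings are aligned often enough.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: nonzero if any byte of w is zero. */
static int has_zero_byte(unsigned long w)
{
    const unsigned long ones  = (unsigned long)-1 / 0xff;  /* 0x0101...01 */
    const unsigned long highs = ones << 7;                 /* 0x8080...80 */
    return ((w - ones) & ~w & highs) != 0;
}

/* Illustrative only: strcmp with a dynamic alignment check. */
int strcmp_sketch(const char *a, const char *b)
{
    /* Dynamic check: take the word-at-a-time path only when both
       pointers are word-aligned.  Unaligned inputs skip straight
       to the byte loop, having paid for the check and gained nothing. */
    if ((((uintptr_t)a | (uintptr_t)b) & (sizeof(unsigned long) - 1)) == 0) {
        for (;;) {
            unsigned long wa, wb;
            /* Assumption: reading a whole aligned word is safe even if
               the string ends inside it. */
            memcpy(&wa, a, sizeof wa);
            memcpy(&wb, b, sizeof wb);
            if (wa != wb || has_zero_byte(wa))
                break;  /* difference or terminator is in this word */
            a += sizeof wa;
            b += sizeof wb;
        }
    }

    /* Byte loop: resolves the final word, or handles the whole
       string when the alignment check failed. */
    while (*a != '\0' && *a == *b) {
        a++;
        b++;
    }
    return (unsigned char)*a - (unsigned char)*b;
}
```

Note that the fast path is all-or-nothing here: if either pointer is misaligned, the whole comparison runs bytewise.  Doing better than that means either unaligned loads or a peeled prologue to reach alignment, which only helps when both strings share the same misalignment.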



Happy to get suggestions/opinions for improvement.

I think this is pretty good as is, absent additional data indicating that handling unaligned cases or a different number of loop peels would be a notable improvement.

Jeff
