stevenschlansker commented on PR #21073: URL: https://github.com/apache/kafka/pull/21073#issuecomment-3635331075
I added a simple benchmark comparing TreeMap lookups with keys having shared prefixes of various lengths.

```
before:
Benchmark                               (bytes)   Mode  Cnt  Score   Error   Units
BytesCompareBenchmark.samePrefixLexico        8  thrpt   20  1.628 ± 0.027  ops/ms
BytesCompareBenchmark.samePrefixLexico       16  thrpt   20  1.187 ± 0.008  ops/ms
BytesCompareBenchmark.samePrefixLexico       32  thrpt   20  0.856 ± 0.007  ops/ms
BytesCompareBenchmark.samePrefixLexico      128  thrpt   20  0.313 ± 0.001  ops/ms
BytesCompareBenchmark.samePrefixLexico     1024  thrpt   20  0.042 ± 0.001  ops/ms

after:
Benchmark                               (bytes)   Mode  Cnt  Score   Error   Units
BytesCompareBenchmark.samePrefixLexico        8  thrpt   20  2.272 ± 0.061  ops/ms
BytesCompareBenchmark.samePrefixLexico       16  thrpt   20  1.622 ± 0.022  ops/ms
BytesCompareBenchmark.samePrefixLexico       32  thrpt   20  1.501 ± 0.008  ops/ms
BytesCompareBenchmark.samePrefixLexico      128  thrpt   20  0.970 ± 0.004  ops/ms
BytesCompareBenchmark.samePrefixLexico     1024  thrpt   20  0.304 ± 0.001  ops/ms
```

So it looks like, at least on my M3 Mac, using the compiler intrinsic is better across the board: about 40% faster for short values (8 bytes) and roughly 7x faster for very long equal runs (1024 bytes).
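For readers who want to see what is being compared, here is a minimal sketch of the two comparison styles. It is not the JMH benchmark from the PR. It assumes that "the compiler intrinsic" refers to `Arrays.compareUnsigned` (Java 9+), and the class name `PrefixCompareSketch` and the helper `key` are hypothetical names made up for illustration.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.TreeMap;

// Illustrative sketch only; names here are hypothetical, not taken from the PR.
final class PrefixCompareSketch {

    // Hand-written unsigned lexicographic comparison, one byte at a time.
    static int loopCompare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length;
    }

    // Same ordering via the JDK method (assumed to be the intrinsic in question).
    static int intrinsicCompare(byte[] a, byte[] b) {
        return Arrays.compareUnsigned(a, b);
    }

    // Hypothetical helper: a key of `length` bytes, all 0x7f except the last byte.
    static byte[] key(int length, int last) {
        byte[] k = new byte[length];
        Arrays.fill(k, (byte) 0x7f);
        k[length - 1] = (byte) last;
        return k;
    }

    public static void main(String[] args) {
        Comparator<byte[]> cmp = PrefixCompareSketch::intrinsicCompare;
        TreeMap<byte[], Integer> map = new TreeMap<>(cmp);
        for (int i = 0; i < 256; i++) {
            map.put(key(1024, i), i);
        }
        // Each comparison during the lookup scans 1023 equal bytes
        // before reaching the byte that differs.
        System.out.println(map.get(key(1024, 200)));
    }
}
```

Swapping `intrinsicCompare` for `loopCompare` in the comparator gives the same ordering, so the two variants can be timed against each other with whatever harness is at hand; the numbers above came from JMH.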
