Viktor Somogyi-Vass created KAFKA-10650:
-------------------------------------------

             Summary: Use Murmur3 hashing instead of MD5 in SkimpyOffsetMap
                 Key: KAFKA-10650
                 URL: https://issues.apache.org/jira/browse/KAFKA-10650
             Project: Kafka
          Issue Type: Improvement
          Components: core
            Reporter: Viktor Somogyi-Vass
            Assignee: Viktor Somogyi-Vass


The use of MD5 in SkimpyOffsetMap came up while testing Kafka for FIPS (Federal Information Processing Standards) verification. MD5 isn't a FIPS incompatibility here because it isn't used for cryptographic purposes, but I spent some time on it because it isn't ideal either. MD5 is relatively fast for a cryptographic hash, but there are much better-performing algorithms for hash-table use, which is how SkimpyOffsetMap uses it.

By switching to Murmur3 (which is already implemented in Streams) I measured a 3x faster {{put}} operation and an overall segment cleaning speedup of 30%, while preserving the same collision rate (both algorithms stayed within 0.0015 - 0.007, with a median of about 0.004).
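
A minimal Java sketch of the idea, for illustration only. It is not the SkimpyOffsetMap code or the Murmur3 class from Streams: the class {{HashSlotSketch}}, its method names, the synthetic {{key-N}} keys, and the collision-rate definition (fraction of keys whose slot is already taken) are all assumptions made for this example. It hashes a key with either MD5 or a hand-written 32-bit Murmur3 and maps the result to a slot in a fixed-size table.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical illustration class; not part of Kafka.
final class HashSlotSketch {

    // 32-bit Murmur3 (x86 variant), written out here so the sketch is self-contained.
    static int murmur3_32(byte[] data, int seed) {
        final int c1 = 0xcc9e2d51;
        final int c2 = 0x1b873593;
        int h = seed;
        int blocks = data.length / 4;
        for (int i = 0; i < blocks; i++) {
            int o = i * 4;
            // Read four bytes as a little-endian int.
            int k = (data[o] & 0xff)
                  | (data[o + 1] & 0xff) << 8
                  | (data[o + 2] & 0xff) << 16
                  | (data[o + 3] & 0xff) << 24;
            k *= c1;
            k = Integer.rotateLeft(k, 15);
            k *= c2;
            h ^= k;
            h = Integer.rotateLeft(h, 13);
            h = h * 5 + 0xe6546b64;
        }
        // Mix in the remaining 0 to 3 bytes.
        int tail = blocks * 4;
        int k = 0;
        switch (data.length & 3) {
            case 3: k ^= (data[tail + 2] & 0xff) << 16; // fall through
            case 2: k ^= (data[tail + 1] & 0xff) << 8;  // fall through
            case 1:
                k ^= (data[tail] & 0xff);
                k *= c1;
                k = Integer.rotateLeft(k, 15);
                k *= c2;
                h ^= k;
        }
        // Finalization mix.
        h ^= data.length;
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }

    // First four bytes of the MD5 digest, read as a big-endian int.
    static int md5_32(byte[] data) {
        try {
            byte[] d = MessageDigest.getInstance("MD5").digest(data);
            return (d[0] & 0xff) << 24 | (d[1] & 0xff) << 16
                 | (d[2] & 0xff) << 8 | (d[3] & 0xff);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Map a (possibly negative) hash to a slot index in [0, slots).
    static int slot(int hash, int slots) {
        return Math.floorMod(hash, slots);
    }

    // Fraction of synthetic keys that land in an already-occupied slot.
    static double collisionRate(int keys, int slots, boolean useMurmur) {
        boolean[] used = new boolean[slots];
        int collisions = 0;
        for (int i = 0; i < keys; i++) {
            byte[] key = ("key-" + i).getBytes(StandardCharsets.UTF_8);
            int s = slot(useMurmur ? murmur3_32(key, 0) : md5_32(key), slots);
            if (used[s]) {
                collisions++;
            }
            used[s] = true;
        }
        return (double) collisions / keys;
    }

    public static void main(String[] args) {
        System.out.printf("MD5 collision rate:     %.4f%n", collisionRate(10_000, 1_000_000, false));
        System.out.printf("Murmur3 collision rate: %.4f%n", collisionRate(10_000, 1_000_000, true));
    }
}
```

Running {{main}} prints the two rates for the synthetic keys. The numbers depend on the keys and the table size, and they are not the measurements quoted above.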