Maciej Bryński created KAFKA-6632:
-------------------------------------
Summary: Very slow hashCode methods in Kafka Connect types
Key: KAFKA-6632
URL: https://issues.apache.org/jira/browse/KAFKA-6632
Project: Kafka
Issue Type: Bug
Affects Versions: 1.0.0
Reporter: Maciej Bryński
The hashCode method of ConnectSchema (and Field) is called heavily in SMTs.
Example:
[https://github.com/apache/kafka/blob/e5d6c9a79a4ca9b82502b8e7f503d86ddaddb7fb/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/InsertField.java#L164]
Unfortunately, it uses Objects.hash, which is very slow.
I rewrote it as a hand-written implementation and got a roughly 6x speedup.
Microbenchmark results:
* Original ConnectSchema hashCode: 2995ms
* My implementation: 517ms
(100000000 iterations of computing hashCode on new
ConnectSchema(Schema.Type.STRING))
{code:java}
@Override
public int hashCode() {
    // Combine each member by hand (31 * result + h) instead of calling Objects.hash.
    int result = 5;
    result = 31 * result + type.hashCode();
    result = 31 * result + (optional ? 1 : 0);
    result = 31 * result + (defaultValue == null ? 0 : defaultValue.hashCode());
    if (fields != null) {
        for (Field f : fields) {
            result = 31 * result + f.hashCode();
        }
    }
    result = 31 * result + (keySchema == null ? 0 : keySchema.hashCode());
    result = 31 * result + (valueSchema == null ? 0 : valueSchema.hashCode());
    result = 31 * result + (name == null ? 0 : name.hashCode());
    result = 31 * result + (version == null ? 0 : version);
    result = 31 * result + (doc == null ? 0 : doc.hashCode());
    // Map.hashCode() tolerates null values and does not depend on iteration order,
    // so equal parameter maps always contribute the same value.
    result = 31 * result + (parameters == null ? 0 : parameters.hashCode());
    return result;
}{code}
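
For anyone who wants to reproduce the comparison without building Kafka, here is a minimal standalone sketch. SimpleSchema, hashWithObjectsHash, hashByHand and timeNanos are hypothetical names made up for this example; it is not the ConnectSchema code and not the benchmark I ran above. It only contrasts Objects.hash (which goes through a varargs Object[] on every call) with the same computation written out by hand. The plain timing loop is rough; a harness such as JMH would give more trustworthy numbers.

```java
import java.util.Objects;
import java.util.function.IntSupplier;

public class HashCodeSketch {

    // Hypothetical stand-in for a schema-like value type (NOT the real ConnectSchema).
    public static final class SimpleSchema {
        private final String type;
        private final boolean optional;
        private final String name;
        private final Integer version;

        public SimpleSchema(String type, boolean optional, String name, Integer version) {
            this.type = type;
            this.optional = optional;
            this.name = name;
            this.version = version;
        }

        // Variant 1: Objects.hash packs its arguments into an Object[] on each call.
        public int hashWithObjectsHash() {
            return Objects.hash(type, optional, name, version);
        }

        // Variant 2: the same arithmetic written out by hand, with no array.
        // Starting at 1 and using Boolean.hashCode mirrors what Objects.hash computes,
        // so both variants return identical values.
        public int hashByHand() {
            int result = 1;
            result = 31 * result + (type == null ? 0 : type.hashCode());
            result = 31 * result + Boolean.hashCode(optional);
            result = 31 * result + (name == null ? 0 : name.hashCode());
            result = 31 * result + (version == null ? 0 : version.hashCode());
            return result;
        }
    }

    // Naive timing loop; returns elapsed nanoseconds for the given number of calls.
    public static long timeNanos(IntSupplier hash, int iterations) {
        int sink = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += hash.getAsInt();
        }
        long elapsed = System.nanoTime() - start;
        // Use the accumulated value so the JIT cannot drop the loop as dead code.
        if (sink == 42) {
            System.out.println("sink=" + sink);
        }
        return elapsed;
    }

    public static void main(String[] args) {
        SimpleSchema schema = new SimpleSchema("STRING", false, null, null);
        int iterations = 10_000_000; // arbitrary count chosen for this sketch

        // Warm-up passes so both variants are JIT-compiled before measuring.
        timeNanos(schema::hashWithObjectsHash, iterations);
        timeNanos(schema::hashByHand, iterations);

        long viaObjectsHash = timeNanos(schema::hashWithObjectsHash, iterations);
        long byHand = timeNanos(schema::hashByHand, iterations);
        System.out.printf("Objects.hash: %d ms%n", viaObjectsHash / 1_000_000);
        System.out.printf("By hand:      %d ms%n", byHand / 1_000_000);
    }
}
```

Because both variants return the same value in this sketch, swapping one for the other would not change any hash-based lookups; only the cost per call differs. The implementation proposed above uses a different seed and boolean encoding, so its values differ from the current ones, which should be fine as long as nothing persists the hash.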