Lucas Brutschy created KAFKA-14415:
--------------------------------------

             Summary: `ThreadCache` is getting slower with every additional state store
                 Key: KAFKA-14415
                 URL: https://issues.apache.org/jira/browse/KAFKA-14415
             Project: Kafka
          Issue Type: Bug
            Reporter: Lucas Brutschy

There are a few lines in `ThreadCache` that I think should be optimized. `sizeBytes` is called at least once, and potentially many times, in every `put`, and it is linear in the number of caches (= number of state stores, so typically proportional to the number of tasks). That means that with every additional task, every `put` gets a little slower. The throughput is 30% higher if it is replaced by a constant-time update…

Compare the throughput of TIME_ROCKS on trunk (green graph):

[http://kstreams-benchmark-results.s3-website-us-west-2.amazonaws.com/experiments/stateheavy-3-5-3-4-0-51b7eb7937-jenkins-20221113214104-streamsbench/]

This is the throughput of TIME_ROCKS when a constant-time `sizeBytes` implementation is used:

[http://kstreams-benchmark-results.s3-website-us-west-2.amazonaws.com/experiments/stateheavy-3-5-LUCASCOMPARE-lucas-20221122140846-streamsbench/]

So the throughput is ~20% higher.

The same seems to apply to the MEM backend (initial throughput >8000 instead of 6000). However, I cannot run the same benchmark here because the memory fills up too quickly.

[http://kstreams-benchmark-results.s3-website-us-west-2.amazonaws.com/experiments/stateheavy-3-5-LUCASSTATE-lucas-20221121231632-streamsbench/]
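
To illustrate the constant-time `sizeBytes` idea described above, here is a minimal sketch. It is not the actual `ThreadCache` code: the class `SizeTrackedCaches`, its fields, and the methods `put`, `evict`, and `sizeBytesLinear` are hypothetical stand-ins, and per-store caches are reduced to plain byte counters. The sketch also assumes single-threaded access; a real implementation may need an atomic counter.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a thread-level cache holding one named cache per state store.
// Not thread-safe; illustrative only.
final class SizeTrackedCaches {
    // Bytes currently held per store (stand-in for the real per-store caches).
    private final Map<String, Long> bytesPerCache = new HashMap<>();
    // Running total, updated on every change so that reading it is O(1).
    private long totalBytes = 0L;

    // Linear-time variant: walks all caches on every call.
    long sizeBytesLinear() {
        long sum = 0L;
        for (long bytes : bytesPerCache.values()) {
            sum += bytes;
        }
        return sum;
    }

    // Constant-time variant: returns the maintained running total.
    long sizeBytes() {
        return totalBytes;
    }

    // Record that an entry of entryBytes was added to the named cache.
    void put(String cacheName, long entryBytes) {
        bytesPerCache.merge(cacheName, entryBytes, Long::sum);
        totalBytes += entryBytes;
    }

    // Record that up to entryBytes were evicted from the named cache.
    void evict(String cacheName, long entryBytes) {
        long current = bytesPerCache.getOrDefault(cacheName, 0L);
        long removed = Math.min(current, entryBytes); // never go below zero
        bytesPerCache.put(cacheName, current - removed);
        totalBytes -= removed;
    }
}
```

The trade-off in this sketch is that every code path that changes a per-store size must also adjust the running total. In exchange, the size check inside `put` no longer depends on the number of state stores.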