Hey all,

We have been running a Kafka Streams-based service in production for the last couple of years (we have 4 brokers on this specific cluster). This service, which does a lot of things, has 2 GlobalKTables (backed by 2 compacted topics).
When I restart my app and clean the local state, restoration of those topics begins. The weird thing I am noticing is that one topic takes *a lot* more time to restore even though it has *far* fewer messages.

Let's call the compacted topics A and B. Both are configured mostly the same:

Replication: 3
Number of partitions: 36
max.message.bytes: 25000000
segment.index.bytes: 2097152
min.cleanable.dirty.ratio: 0.1
cleanup.policy: compact
delete.retention.ms: 900000
segment.bytes: 52428800
segment.ms: 3600000

The exception is that for topic B we use:

cleanup.policy: compact,delete
retention.ms: 604800000

The behaviour I've noticed for a single partition:

Topic A has 62,552 totalRecordsToBeRestored and takes around 20s.
Topic B has 24,730,506 totalRecordsToBeRestored and takes around 1s.

It is worth mentioning that each record in B is much, much bigger (A holds an integer while B holds a big object).

My feeling is that the reason is that the data in B is always relatively "fresh", while the data in topic A is mostly stale. The business logic behind it means topic A is updated at a very low rate and a lot of keys will never be updated, so it is probably holding keys that haven't been updated since 2018. Topic B keeps getting updated every couple of milliseconds.

Another difference I just realized I should share is that topic A is "joined" by 6 other streams, while topic B is only joined by 2.

I find it hard to see how keeping "old" records could account for this, given the huge difference in the number of records and their size. So I guess I am either missing some basic concept about compacted topics and the way the broker stores and fetches the data, or we have some underlying problem that explains it.

Let me know if you need more info.

Thanks!
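P.S. To put the gap in numbers, here is a rough sketch of the per-partition restore rate implied by the figures above. It assumes the durations are approximate wall-clock readings and that totalRecordsToBeRestored reflects the number of records actually read, which I have not verified. The names (reported, restore_rate) are just for illustration.

```python
# Figures reported above for a single partition; durations are approximate.
reported = {
    "A": {"records": 62_552, "seconds": 20.0},
    "B": {"records": 24_730_506, "seconds": 1.0},
}


def restore_rate(records: int, seconds: float) -> float:
    """Average records restored per second over the whole restore."""
    return records / seconds


# Per-topic average rate, then how much faster B is than A per record.
rates = {topic: restore_rate(**stats) for topic, stats in reported.items()}
ratio = rates["B"] / rates["A"]

for topic, rate in rates.items():
    print(f"topic {topic}: ~{rate:,.0f} records/s")
print(f"B restores ~{ratio:,.0f}x faster per record than A")
```

So per record, B restores several thousand times faster than A, even though its records are much larger.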
-- 
Nitay Kufert
Backend Team Leader, ironSource