Hi,

Thanks for the great writeup. A few questions, though:

sm01 - Is there any evidence behind the new bounds? What data informed [200, 1000] and the floor of 200? Specifically, the measured ShareUpdate vs ShareSnapshot record sizes that make values <200 "mostly snapshots," and what motivated 1000 as the ceiling rather than higher?

sm02 - Guidance for choosing a value? Could the KIP offer a starting point as a function of observable behavior (records/sec, in-flight count, or __share_group_state write rate), plus which metrics to watch when tuning? Also, what's the rationale for keeping the default at 500 under the new bounds?

sm03 - Expected disk and recovery impact? Any rough before/after numbers for moving a high-throughput group from 500 → 1000: disk saved per day and added replay time on restart? A concrete example would help operators weigh the tradeoff.

Regards,
Sushant Mahajan

On Sat, 23 May 2026, 01:02 Muralidhar Basani via dev, <[email protected]> wrote:

> Hi all,
>
> I would like to start a discussion on KIP-1349, which allows configurable
> snapshot frequency of share groups.
>
> KIP :
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1349%3A+Configurable+snapshot+frequency+for+share+groups
>
> Thanks,
> Murali
