Hello all,

Thanks for the feedback.

DJ01/DJ02:

MetadataResponse bumps from v13 to v14. The PartitionMetadata struct gains a new
field, PartitionAgeMs (int64, default -1), computed server-side by the broker as
broker_current_time - partition_creation_time.
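
A minimal sketch of the broker-side computation, under stated assumptions. The class and method names are hypothetical and not part of the Kafka codebase. Treating a negative creation time as "unknown" and clamping the result at zero are my own assumptions, not something the KIP states:

```java
// Hypothetical helper for illustration only.
final class PartitionAge {
    // Default value of PartitionAgeMs when the age is not available.
    static final long UNKNOWN_AGE_MS = -1L;

    // Returns broker_current_time - partition_creation_time in milliseconds.
    // Assumption: a negative creationTimeMs means the creation time is unknown.
    static long compute(long brokerNowMs, long creationTimeMs) {
        if (creationTimeMs < 0) {
            return UNKNOWN_AGE_MS;
        }
        // Assumption: clamp at 0 so a clock adjustment cannot yield a
        // negative age that would be confused with the -1 sentinel.
        return Math.max(0L, brokerNowMs - creationTimeMs);
    }
}
```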

I also added the consumer heartbeat flow to the KIP. When MembershipManager detects a
newly assigned partition, it explicitly invalidates the metadata for the affected
topic and forces a fresh MetadataRequest before making the offset reset decision,
even if the topic ID is already in the cache.
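
The flow above can be sketched roughly as follows. This is not MembershipManager code: ResetFlowSketch, the fetchMetadata function, and the cache layout are hypothetical stand-ins. The sketch also assumes that an unknown age (-1) keeps the plain latest behaviour, which the KIP may define differently:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for the consumer-side flow.
final class ResetFlowSketch {
    enum Reset { EARLIEST, LATEST }

    // Cached metadata: topic -> (partition -> PartitionAgeMs).
    private final Map<String, Map<Integer, Long>> cache = new HashMap<>();
    // Stand-in for sending a MetadataRequest for a single topic.
    private final Function<String, Map<Integer, Long>> fetchMetadata;
    // Value of auto.offset.reset.latest.max.age, in milliseconds.
    private final long maxAgeMs;

    ResetFlowSketch(long maxAgeMs, Function<String, Map<Integer, Long>> fetchMetadata) {
        this.maxAgeMs = maxAgeMs;
        this.fetchMetadata = fetchMetadata;
    }

    // Called when a partition shows up in the assignment for the first time.
    Reset onNewlyAssigned(String topic, int partition) {
        // Invalidate the cached entry, even if the topic is already known.
        cache.remove(topic);
        // Force a fresh fetch before making the reset decision.
        Map<Integer, Long> fresh = fetchMetadata.apply(topic);
        cache.put(topic, fresh);
        long ageMs = fresh.getOrDefault(partition, -1L);
        // Partitions younger than the threshold resolve to earliest.
        // Assumption: an unknown age (-1) stays on latest.
        return (ageMs >= 0 && ageMs < maxAgeMs) ? Reset.EARLIEST : Reset.LATEST;
    }
}
```

With a 60-second threshold, a partition reported as 5 seconds old would resolve to EARLIEST and one reported as an hour old to LATEST. Each call triggers its own fetch.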

MB0:

The consumer learns the broker's maximum supported MetadataResponse version via the
ApiVersions negotiation at connection time. If the negotiated version is lower than
v14, the consumer knows the broker does not support PartitionAgeMs at all and can
throw an UnsupportedVersionException immediately, rather than silently falling back
to latest and risking data loss without any operator-visible signal.
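
A minimal sketch of that fail-fast check, under stated assumptions. PartitionAgeVersionCheck and ensureSupported are hypothetical names. The nested exception is a local stand-in so the snippet compiles on its own (the real client would use org.apache.kafka.common.errors.UnsupportedVersionException). Throwing only when the new config is actually set is also my assumption:

```java
// Hypothetical helper for illustration only.
final class PartitionAgeVersionCheck {
    // First MetadataResponse version that carries PartitionAgeMs.
    static final short MIN_VERSION_WITH_PARTITION_AGE = 14;

    // Local stand-in for Kafka's UnsupportedVersionException.
    static final class UnsupportedVersionException extends RuntimeException {
        UnsupportedVersionException(String message) {
            super(message);
        }
    }

    // brokerMaxMetadataVersion: the version learned via ApiVersions.
    // maxAgeConfigured: whether auto.offset.reset.latest.max.age is set (assumption).
    static void ensureSupported(short brokerMaxMetadataVersion, boolean maxAgeConfigured) {
        if (maxAgeConfigured && brokerMaxMetadataVersion < MIN_VERSION_WITH_PARTITION_AGE) {
            // Fail loudly instead of silently falling back to latest.
            throw new UnsupportedVersionException(
                "Broker supports MetadataResponse up to v" + brokerMaxMetadataVersion
                    + " but PartitionAgeMs requires v" + MIN_VERSION_WITH_PARTITION_AGE);
        }
    }
}
```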

MB1/MB2/MB3:

I have addressed these changes in the KIP.

Best Regards,
Jiunn-Yang

> On Apr 29, 2026, at 4:04 PM, Chia-Ping Tsai <[email protected]> wrote:
> 
> hi David
> 
> I agree with the direction of moving the 'age' resolution from the Heartbeat 
> API to the Metadata API to keep the control plane clean. The main trade-off, 
> as we noted before, is introducing inter-broker clock skew. The Group 
> Coordinator approach provided a single source of truth for time.
> 
> However, realistically, this time skew should be negligible. Given that the 
> max.age threshold will likely be configured in minutes or hours, a typical 
> NTP skew (in milliseconds) between brokers won't impact the fallback decision.
> 
> Best,
> Chia-Ping
> 
>> On Apr 29, 2026, at 3:29 PM, David Jacot via dev <[email protected]> wrote:
>> 
>> Hi all,
>> 
>> Thanks for the KIP!
>> 
>> Sorry, I haven't really followed the previous conversation but I took a
>> quick look at this one.
>> 
>> DJ01: I don't clearly understand the flow with the ConsumerGroupHeartbeat
>> API after reading the KIP. There is a new boolean; the KIP states that
>> partition ages are returned only when this boolean is set. Implicitly, this
>> means that when the consumer receives a new partition, it will issue a new
>> HB request with the boolean set to receive the ages. Is my understanding
>> correct? We should perhaps clarify the flow and also explain how it fits
>> into the existing flow (e.g. list offsets, fetch offsets, etc.).
>> DJ02: If my understanding is correct, I wonder if
>> the ConsumerGroupHeartbeat API is the right place for this given that a new
>> round trip is done anyway. Alternatively, it could simply include the
>> metadata. Generally, we should be rather cautious about not overloading
>> the ConsumerGroupHeartbeat API with unrelated concepts. The API is a
>> control plane API for assigning or revoking partitions. The fact that we
>> don't want to add it to the corresponding Streams API also suggests
>> something is not quite right. What would we do if we want to support
>> Streams in the future?
>> 
>> Best,
>> David
>> 
>>> On Wed, Apr 29, 2026 at 12:28 AM Muralidhar Basani via dev <
>>> [email protected]> wrote:
>>> 
>>> Hi Jiunn,
>>> 
>>> Thank you for this great KIP. Good to know about the gap.
>>> 
>>> mb-0 - Why is a new v2 version bump needed for the RequestPartitionAges
>>> field? Could a tagged field (e.g. PartitionAges on TopicPartitions in the
>>> response) be used here to avoid the version bump?
>>> 
>>> mb-1 - For the new config, is there a recommended value or a ConfigDef
>>> validator? Should it perhaps be based on metadata.max.age.ms? Sizing
>>> instructions can be part of the Javadocs, I guess.
>>> 
>>> mb-2 - (minor) As there are no changes to Kafka Streams, would it be better
>>> to add this new config, auto.offset.reset.latest.max.age, to the
>>> StreamsConfig block list (NON_CONFIGURABLE_CONSUMER_DEFAULT_CONFIGS) for a
>>> clear warning in case users configure it? This is the most familiar
>>> consumer config and users might easily configure it by mistake. Or maybe
>>> it's not worth adding.
>>> 
>>> mb-3 - (minor) The phrasing "the consumer falls back to earliest" reads as
>>> if the config were being changed per partition, which isn't supported. Maybe
>>> rephrase it to something like "the consumer resolves the initial position to
>>> the start offset for that partition", as if earliest were applied to that
>>> partition only while the auto.offset.reset config is unchanged.
>>> 
>>> Thanks,
>>> Murali
>>> 
>>>> On Tue, Apr 28, 2026 at 2:48 PM 黃竣陽 <[email protected]> wrote:
>>>> 
>>>> Hi chia,
>>>> 
>>>> I have updated the KIP to include this change.
>>>> 
>>>> Best Regards,
>>>> Jiunn-Yang
>>>> 
>>>>> On Apr 28, 2026, at 8:03 PM, Chia-Ping Tsai <[email protected]> wrote:
>>>>> 
>>>>> hi Jiunn-Yang
>>>>> 
>>>>> chia_0: Should we expose the partition creation time via the Admin API?
>>>>> I assume it would be valuable for users to diagnose and troubleshoot the
>>>>> behavior of auto.offset.reset.latest.max.age.
>>>>> 
>>>>> Best,
>>>>> Chia-Ping
>>>>> 
>>>>> On 2026/04/28 10:47:58 黃竣陽 wrote:
>>>>>> Hello everyone,
>>>>>> 
>>>>>> I would like to start a discussion on KIP-1327 Prevent Hot Data Loss on
>>>>>> Partition Expansion for Latest Policy
>>>>>> <https://urldefense.com/v3/__https://cwiki.apache.org/confluence/x/KY4mGQ__;!!Ayb5sqE7!qF4q1QzF1RRgP61D7A2xuEai1ky7fepKDKFFvpNBuePikH-ULmT87TvuuZzy5kau5E4y5zMZAmfQQiwZomM$>
>>>>>> 
>>>>>> This proposal aims to introduce auto.offset.reset.latest.max.age, a
>>>>>> consumer config that lets the latest reset policy distinguish newly
>>>>>> expanded (hot) partitions from long-existing (cold) ones. Partitions
>>>>>> younger than the configured threshold automatically fall back to
>>>>>> earliest, preventing silent data loss during topic expansion without
>>>>>> forcing a full historical reprocess.
>>>>>> 
>>>>>> Best regards,
>>>>>> Jiunn-Yang
>>>> 
>>>> 
>>> 
