Thanks, Greg! That helps a lot. This KIP seems familiar somehow; I should go
through it again. Much appreciated!
---- Replied Message ----
From: Greg Harris <greg.har...@aiven.io.INVALID>
Date: 10/25/2024 00:40
To: d...@kafka.apache.org
Cc: users@kafka.apache.org
Subject: Re: doc clarification about message format
Hey Xiang, 
Thanks for your questions! This is getting to the limit of my knowledge, 
but I'll answer as best I can. 
The partitionLeaderEpoch is set only once during a batch's lifetime (during
Produce), and is not mutated at any other time. It stays unchanged when the
data is fetched by other replicas and by consumers, and when partition
leadership changes.
I believe this field is a record of which partitionLeaderEpoch was active 
at the time the batch was produced, and can be different for different 
batches within a partition as leadership changes. I wouldn't call this 
"outdated", as I think there is an intentional use for this historical 
leadership data in the log [1]. 
[1] 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-101+-+Alter+Replication+Protocol+to+use+Leader+Epoch+rather+than+High+Watermark+for+Truncation
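
A minimal sketch of the CRC point discussed earlier in this thread, under stated assumptions. The byte offsets follow my reading of the v2 record batch header in DefaultRecordBatch; the helper names (make_batch, set_partition_leader_epoch, set_max_timestamp, is_valid) are hypothetical and are not Kafka APIs. Kafka uses CRC-32C; Python's standard library only ships plain CRC-32 (zlib.crc32), so that is used here as a stand-in. The idea is the same either way: the checksum covers the bytes from the attributes field to the end of the batch, so writing the partitionLeaderEpoch never invalidates it, while changing maxTimestamp requires recomputing it.

```python
import struct
import zlib

# Assumed v2 record batch header offsets (illustrative, taken from a reading
# of DefaultRecordBatch; verify against the Kafka source before relying on them).
PARTITION_LEADER_EPOCH_OFFSET = 12  # int32, sits before the CRC-covered region
CRC_OFFSET = 17                     # uint32
ATTRIBUTES_OFFSET = 21              # the checksum covers bytes from here to the end
MAX_TIMESTAMP_OFFSET = 35           # int64, inside the CRC-covered region
HEADER_SIZE = 61


def checksum(batch):
    # Stand-in checksum: Kafka uses CRC-32C, zlib.crc32 is plain CRC-32.
    return zlib.crc32(bytes(batch[ATTRIBUTES_OFFSET:]))


def stored_crc(batch):
    return struct.unpack_from(">I", batch, CRC_OFFSET)[0]


def make_batch(payload):
    # Hypothetical helper: zeroed header plus payload, with the CRC filled in.
    batch = bytearray(HEADER_SIZE) + bytearray(payload)
    struct.pack_into(">I", batch, CRC_OFFSET, checksum(batch))
    return batch


def set_partition_leader_epoch(batch, epoch):
    # Outside the covered region: a single 4-byte write, no CRC work.
    struct.pack_into(">i", batch, PARTITION_LEADER_EPOCH_OFFSET, epoch)


def set_max_timestamp(batch, timestamp):
    # Inside the covered region: the CRC must be recomputed over the batch.
    struct.pack_into(">q", batch, MAX_TIMESTAMP_OFFSET, timestamp)
    struct.pack_into(">I", batch, CRC_OFFSET, checksum(batch))


def is_valid(batch):
    return stored_crc(batch) == checksum(batch)
```

With these helpers, calling set_partition_leader_epoch on a batch leaves both stored_crc and is_valid unchanged, whereas overwriting the maxTimestamp bytes without the recompute step would make is_valid return False. The same layout choice is why a corrupted partitionLeaderEpoch would go undetected by the checksum.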
 
Thanks, 
Greg 
On Wed, Oct 23, 2024 at 8:07 PM Xiang Zhang <xiangzhang1...@gmail.com> 
wrote: 
> Thank you, Greg, for all the knowledge. Some follow-up questions:
>
> Does partitionLeaderEpoch always reflect the latest leader election, or is
> an old epoch allowed? If it is the first case, then I agree that
> partitionLeaderEpoch should not be included in the CRC computation. But
> that raises a new question for me: which roles check the checksum, and
> under what circumstances? I am asking because, after the record has been
> produced, any record in the broker log can end up with an outdated leader
> epoch field once a leader election happens, right? Do they get updated?
> 
> Sorry for all the questions. I have been using Kafka for several years and
> want to dive a little deeper into it. I have become more interested and am
> ready to find things out on my own, but I still look forward to your
> thoughts on this if the questions above make sense.
> 
> 
> Thanks, 
> Xiang
> 
> On Thu, Oct 24, 2024 at 00:25, Greg Harris <greg.har...@aiven.io.invalid> wrote:
> 
> > Hi Xiang, 
> > 
> > Thanks for your question! That sentence is a justification for why the 
> > partitionLeaderEpoch field is not included in the CRC. 
> > 
> > If you mutate fields which are included in a CRC, you need to recompute
> > the CRC value. See [1] for mutating the maxTimestamp. Compare that with
> > [2] for setting the partitionLeaderEpoch.
> > This makes setting the partitionLeaderEpoch faster than setting the max 
> > timestamp. And because setting the partitionLeaderEpoch happens on every 
> > Produce request, it was optimized in the protocol design. 
> > It does have the tradeoff that corruptions in the partitionLeaderEpoch
> > are not detected by the CRC, but someone decided this was worth the
> > optimization to the Produce flow.
> > 
> > I don't have more information on why this optimization was made for 
> > partitionLeaderEpoch and not maxTimestamp. 
> > 
> > Hope this helps, 
> > Greg 
> > 
> > [1]
> > https://github.com/apache/kafka/blob/2d896d9130f121e75ccba2d913bdffa358cf3867/clients/src/main/java/org/apache/kafka/common/record/DefaultRecordBatch.java#L371-L382
> > [2]
> > https://github.com/apache/kafka/blob/2d896d9130f121e75ccba2d913bdffa358cf3867/clients/src/main/java/org/apache/kafka/common/record/DefaultRecordBatch.java#L385-L387
> > 
> > 
> > On Tue, Oct 22, 2024 at 7:51 PM Xiang Zhang <xiangzhang1...@gmail.com> 
> > wrote: 
> > 
> > > Hi all, 
> > > 
> > > I am reading official doc here: 
> > > https://kafka.apache.org/documentation/#messageformat, and I could not 
> > > fully understand it. If someone can clarify it for me, it would be much 
> > > appreciated. The sentence is 
> > > 
> > > The partition leader epoch field is not included in the CRC
> > > computation to avoid the need to recompute the CRC when this field is
> > > assigned for every batch that is received by the broker.
> > > 
> > > I just don't really get what the highlighted part is trying to say.
> > > 
> > > Regards, 
> > > Xiang Zhang 
> > > 
> > 
>
