Thanks for updating the KIP.
(100) As stated previously, I personally don't think that adding Headers
to RecordMetadata is the right thing to do. To me, Headers store
"application metadata" of a record, but they are not "Kafka native"
record metadata (ie metadata Kafka an reason about). Headers are a black
box to Kafka, similar to key and value.
The JavaDocs of `RecordMetadata` might not be helpful, but I agree with
Andrew and Lianet that its original purpose does not fit to the idea to
add Headers to it.
(200) However, I don't understand the argument about `onComplete()`? Of
course, if the interceptor `onSend()` methods modified the headers,
`onComplete()` would see the modified record, no the original one. But
the same should be true for `onAcknowledgement()` (no matter if we pass
the Headers as parameter of via `RecordMetadata`), right?
Thus, I would personally argue, that extending `onComplete()` and pass
in the Headers might be a good thing to do, too.
[side remark] I personally don't agree to the comment on the linked PR
(even if this is not relevant to this KIP):
it is not recommended but allowable to create a new ProducerRecord in
Interceptor.
In the end, `onSend()` has return type `ProducerRecord`, and while I
agree that it should be used with care, it seems odd to call it an "edge
case", and position it at something that is only "tolerated"... In the
end, if we really want to consider modifying a record `onSend()` as bad
practice, we should rather change the interceptor interface and change
the `onSend()` return type to `void`.
(300) About Lianet's idea to just track "producer local timestamp" and
pass into the callback: I find it interesting, but I am not totally sure
if we might make it too complicated?
In general, I believe that we should actually have a much larger change
to the Kafka message format, and always store the producer provided
"create timestamp" plus the broker side "log append timestamp". For this
case, `RecordMetadata` would be extended to also provide both
timestamps. Of course, changing the message format is totally
out-of-scope for this KIP, and I don't propose to tackle it with this
KIP. However, if we would want to be forward looking (and optimistically
assume that we might change the message format at some point in the
future accordingly), we could actually deprecate the existing
`RecordMetadata#timestamp()` method, and add `#createTimestamp()` and
`#logAppendTimestamp()` methods, and clearly document their semantics
with regard to the corresponding "CreateTime" vs "AppendTime" topic config.
The disadvantage I see is, that we do something that might never happen
in the Kafka message format, and that we would limit the scope of this
KIP to the "measure latency" use-case only.
The KIP also mentioned tracing as use-case which could benefit to access
Headers in the callback what I find convincing. I don't see a good
reason why we would want to exclude this use-case? -- And frankly, there
could be even other use-case which might benefit from an easy access to
Headers in the callbacks.
(400) If we really go with overloading `onAcknowledgement()` (and maybe
also `onCompletion()`), and wondering if we should deprecate the
existing overloads? Given that this idea is current a "rejected
alternative" it ok that the KIP is not specific about it, however, if we
would actually do it this way, we should consider it.
-Matthias
On 7/30/24 12:06 PM, Lianet M. wrote:
Hello Rich, thanks for resurrecting the KIP, seems to fill a gap indeed.
LM1. Specifically related to motivation#1. ProducerRecord already has a
timestamp, passed into the RecordMetadata, that represents the creation
time provided on new ProducerRecord, so couldn't we reuse it to avoid the
extra complexity of having to "include a timestamp in the header when the
message is sent" to be able to compute latency properly. The challenge of
course is that that timestamp may be overwritten (and this is the root
cause of the gap), but that could be resolved just by keeping the original
time and making it available.
RecordMetadata would keep a timestamp (passed from the record creation,
never mutated), and the "effectiveTimestamp" (the one it currently has,
updated with the broker result based on configs). Main advantage would be
not having to add a header for calculating latency. The user simply creates
the record with a timestamp (known existing concept), and we make that
value accessible in the RecordMetadata (where it exists already at some
point, but it's mutated). Thoughts?
LM2. Regardless of the point above, if we think having the headers
available on the onAcknowledgement would be helpful, I definitely see the
case for both alternatives (headers in RecordMetadata and as param). I
share Andrew's feeling because Headers are indeed part of the
ProducerRecord. But then headers will in practice simply contain info
related to the record, so it seems sensible to expect to find headers in
the RecordMetadata as you suggest, ok with me.
Thanks!
Lianet
On Mon, Jul 29, 2024 at 9:41 PM Rich C. <chenjy.r...@gmail.com> wrote:
Hi Andrew,
Thanks for the feedback. I have updated KIP-512 and addressed AS2, AS3, and
AS4. For AS1, let's wait for further responses from the community.
Regards,
Rich
On Mon, Jul 29, 2024 at 5:59 AM Andrew Schofield <
andrew_schofi...@live.com>
wrote:
Hi,
Thanks for adding the detail. It seems quite straightforward to
implement in the producer code.
AS1: Personally, and of course this is a matter of taste and just one
opinion, I don’t like adding Headers to RecordMetadata. It seems to me
that RecordMetadata is information about the record that’s been produced
whereas the Headers are really part of the record itself. So, I prefer
the
alternative which overloads ProducerInterceptor.onAcknowledgement.
AS2: ProducerBatch and FutureRecordMetadata are both internal classes
and do not need to be documented in the KIP.
AS3: This KIP is adding rather than replcaing the constructor for
RecordMetadata.
You should define the value for the Headers if an existing constructor
without headers is used.
AS4: You should add a method `Headers headers()` to RecordMetadata.
I wonder what other community members think about whether it’s a good
idea to extend RecordMetadata with the headers.
Thanks,
Andrew
On 29 Jul 2024, at 05:36, Rich C. <chenjy.r...@gmail.com> wrote:
Hi all,
Thank you for the positive feedback. I added proposal changes to
KIP-512
and included a FAQ section to address some concerns.
Hi Andrew, yes, this KIP focuses on
`ProducerInterceptor.onAcknowledgement`. I added FAQ#3 to explain that.
Hi Matthias, for your question about "RecordMetadata being Kafka
metadata" in
this thread
<
https://lists.apache.org/list?dev@kafka.apache.org:lte=1M:make%20Record%20Headers%20available%20in%20onAcknowledgement
,
I added FAQ#2 to explain that. If I have missed any documentation
regarding
the design of RecordMetadata, please let me know.
Regards,
Rich
On Fri, Jul 26, 2024 at 4:00 PM Andrew Schofield <
andrew_schofi...@live.com>
wrote:
Hi Rich,
Thanks for resurrecting this KIP. It seems like a useful idea to me
and
I’d be interested in seeing the proposed public interfaces.
I note that you specifically called out the
ProducerInterceptor.onAcknowledgement
method, as opposed to the producer Callback.onCompletion method.
Thanks,
Andrew
On 26 Jul 2024, at 04:54, Rich C. <chenjy.r...@gmail.com> wrote:
Hi Kevin,
Thanks for your support.
Hi Matthias,
I apologize for the confusion. I've deleted the Public Interface
sections
for now. I think we should focus on discussing its necessity with the
community. I'll let it sit for a few more days, and if there are no
objections, I will propose changes over the weekend and share them
here
again.
Regards,
Rich
On Thu, Jul 25, 2024 at 5:51 PM Matthias J. Sax <mj...@apache.org>
wrote:
Rich,
thanks for resurrecting this KIP. I was not part of the original
discussion back in the day, but personally agree with your
assessment
that making headers available in the callbacks would make
developer's
life much simpler.
For the KIP itself, starting with "Public Interface" section,
everything
is formatted as "strike through". Can you fix this? It's confusing
as
it's apparently not correctly formatted, but unclear which (if any)
parts should be formatted like this. In general, wiki pages have
history, so strike-through should be used rather rarely but the wiki
page should just contain the latest proposal. (If one want to see
the
history, it's there anyway).
-Matthias
On 7/23/24 6:36 AM, Kevin Lam wrote:
Hi,
Thanks for starting the discussion. Latency Measurement and Tracing
Completeness are both good reasons to support this feature, and
would
be
interested to see this move forward.
On Mon, Jul 22, 2024 at 11:15 PM Rich C. <chenjy.r...@gmail.com>
wrote:
Hi Everyone,
I hope this email finds you well.
I would like to start a discussion on KIP-512. The initial version
of
KIP-512 was created in 2019, and I have resurrected it in 2024
with
more
details about the motivation behind it.
You can view the current version of the KIP here: KIP-512: Make
Record
Headers Available in onAcknowledgement.
<
https://cwiki.apache.org/confluence/display/KAFKA/KIP-512%3A+make+Record+Headers+available+in+onAcknowledgement
Let's focus on discussing the necessity of this feature first. If
we
agree
on its importance, we can then move on to discussing the proposed
changes.
Looking forward to your feedback.
Best regards,
Rich