A few questions:

1. If we update the producers to only support V1, doesn't that mean people get stuck on the current version of the producer until they can be sure all their consumers have been upgraded? Is that going to be a problem for anyone, and does it potentially hold back important fixes/enhancements (e.g., the upcoming client-side network request timeouts) because they have to wait for all the consumers to upgrade first?
2. Why minute granularity specifically? That seems fairly specific to LI's example workload (which, by the way, is a very useful reference point to include in the KIP -- thanks for that). If someone has the memory and can support it, why not let them go all the way to per-record timestamps, by making the interval configurable so they can set it really small (or to 0)? It seems to me that some people might be willing to pay that cost for the application-side simplicity if they want to consume from specific timestamps. With the proposal as is, seeking to a timestamp still requires client-side logic to filter out messages that might have earlier timestamps due to the granularity chosen. The obvious tradeoff here is yet another config option -- I'm loath to add yet more configs, although this is one that we can choose a reasonable default for (like 1 minute) and people can easily change with no impact if they need to adjust it.

3. Why not respect the timestamp passed in, as long as it's not some sentinel value that indicates the broker should fill it in? When you do know they will be ordered, as they should be with MirrorMaker (which is specifically mentioned), this seems really useful. (More about this in the questions below...)

4. I think one obvious answer to (3) is that accepting clients' timestamps could break seeking by timestamp since they may not be properly ordered. However, I think this can break anyway under normal operations -- any imperfection in clock synchronization could result in older timestamps being applied to new messages on a new leader broker compared to messages stored previously by the old leader. I think it's all too common to assume that NTP will take care of it and then run into weird bugs because NTP isn't always perfect, sometimes does unexpected things, and, of course, does what you tell it to and so is subject to user error.
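To make the filtering point in (2) concrete, here's a minimal self-contained sketch (plain Java; `seekOffset`, `consumeFrom`, and the data layout are all illustrative, not the real Kafka API) of why seeking by timestamp against a coarse-grained index still needs client-side filtering:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Illustrative sketch only -- not the actual Kafka client or broker code.
// With a time index kept at coarse (e.g. minute) granularity, seeking to a
// timestamp lands at the start of the enclosing interval, so the consumer
// still has to skip records whose timestamps precede the target.
public class TimestampSeek {

    // timeIndex maps an interval-start timestamp to the first offset in that
    // interval. floorKey finds the latest indexed interval at or before the
    // target; records at that offset may still predate targetTs.
    static long seekOffset(TreeMap<Long, Long> timeIndex, long targetTs) {
        Long key = timeIndex.floorKey(targetTs);
        return key == null ? 0L : timeIndex.get(key);
    }

    // Client-side filtering: from startOffset onward, drop records older
    // than the target timestamp. Each record is modeled as {offset, timestamp}.
    static List<long[]> consumeFrom(long[][] log, long startOffset, long targetTs) {
        List<long[]> out = new ArrayList<>();
        for (long[] rec : log) {
            if (rec[0] >= startOffset && rec[1] >= targetTs) {
                out.add(rec);
            }
        }
        return out;
    }
}
```

With a one-minute index, seeking to t=70s lands at the offset indexed for the 60s interval, whose first records can still be older than 70s; the application has to filter those out itself, which is the extra client-side logic a configurable (smaller) interval would reduce.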
It would be reasonable to argue that the leader change issue is much less likely to be a problem than respecting timestamps from producers, where, if applications actually filled them in, you'd receive a jumbled mess of timestamps and trying to do the binary search in the index wouldn't necessarily give you correct results. However, we could allow clients to fill that info in but discourage it where it might cause issues (i.e., normal applications), and it seems like a significant win for MirrorMaker. I actually think not accepting timestamps is probably the better choice, but a) it seems awkward in the protocol spec because we have to include the field in the produce requests since we don't want to have to fully decode them on the broker, and b) losing that info during mirroring seems like it breaks the goal of fixing log retention (at least for mirrors) as well as the goal of improving searching by timestamp (at least for mirrors).

5. You never actually specified the granularity (or format, but I assume Unix epoch) of the timestamp. Milliseconds? Microseconds? Nanoseconds? This definitely needs to eventually make it into the protocol docs.

6. Re: the rejected alternative. Are there any other options for changing the config format that might make it a bit lighter weight? For example, do we need a full int64? Could we do something relative instead that wouldn't require as many bytes? Without (5) being specified, it's actually difficult to evaluate some of these options.

7. Any plan to expose this in the client APIs? This is related to (4). If the timestamps are not exposed anywhere in the API as being associated with messages, then we can reasonably treat them as an internal implementation detail and be very clear about what looking up an offset for a timestamp means. If they are exposed, then they're more like message metadata that applications are probably going to want preserved, and thus will want the broker to respect the timestamp.
For example, if I saw that consumers could get the timestamp, I'd want to be able to assign it at the producer so that even if I have significant batching I still get an accurate timestamp -- basically, I might replace some cases where I use a timestamp embedded in the message with the timestamp provided by Kafka. (That said, we should consider the impact that including it in the protocol can have; non-Java clients are likely to expose it if it is available, whether it's actually a good idea to or not.)

-Ewen

On Thu, Sep 3, 2015 at 3:47 PM, Jiangjie Qin <j...@linkedin.com.invalid> wrote:

> Hi Guozhang,
>
> I checked the code again. Actually the CRC check probably won't fail. The newly
> added timestamp field might be treated as keyLength instead, so we are
> likely to receive an IllegalArgumentException when we try to read the key.
> I'll update the KIP.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> On Thu, Sep 3, 2015 at 12:48 PM, Jiangjie Qin <j...@linkedin.com> wrote:
>
> > Hi, Guozhang,
> >
> > Thanks for reading the KIP. By "old consumer", I meant the
> > ZookeeperConsumerConnector in trunk now, i.e. without this bug fixed. If we
> > fix the ZookeeperConsumerConnector then it will throw an exception
> > complaining about the unsupported version when it sees message format V1.
> > What I was trying to say is that if we have some ZookeeperConsumerConnector
> > running without the fix, the consumer will complain about a CRC mismatch
> > instead of an unsupported version.
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
> > On Thu, Sep 3, 2015 at 12:15 PM, Guozhang Wang <wangg...@gmail.com> wrote:
> >
> >> Thanks for the write-up Jiangjie.
> >>
> >> One comment about the migration plan: "For old consumers, if they see the
> >> new protocol the CRC check will fail"...
> >>
> >> Do you mean this bug in the old consumer cannot be fixed in a
> >> backward-compatible way?
> >>
> >> Guozhang
> >>
> >>
> >> On Thu, Sep 3, 2015 at 8:35 AM, Jiangjie Qin <j...@linkedin.com.invalid> wrote:
> >>
> >> > Hi,
> >> >
> >> > We just created KIP-31 to propose a message format change in Kafka.
> >> >
> >> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-31+-+Message+format+change+proposal
> >> >
> >> > As a summary, the motivations are:
> >> > 1. Avoid server side message re-compression
> >> > 2. Honor time-based log roll and retention
> >> > 3. Enable offset search by timestamp at a finer granularity.
> >> >
> >> > Feedback and comments are welcome!
> >> >
> >> > Thanks,
> >> >
> >> > Jiangjie (Becket) Qin
> >> >
> >>
> >>
> >>
> >> --
> >> -- Guozhang
> >>

--
Thanks,
Ewen