Hi Colin,

The first thing that comes to mind is AlwaysStickyPartitioner, but maybe I
could think about it a bit more and come up with something better.

The only reason why I hesitate about the RoundRobin part is that it doesn't
really distinguish that this partitioner will have this behavior on keyed
values. I know this is the case for the RoundRobinPartitioner, but unless
one is familiar with the code, it might not be obvious on first glance.

I can definitely include this in the KIP once we get a name settled.

Thanks,
Justine

On Fri, Jul 12, 2019 at 11:45 AM Colin McCabe <cmcc...@apache.org> wrote:

> On Fri, Jul 12, 2019, at 09:02, Justine Olshan wrote:
> > Hello all,
> >
> > Jun, thanks for taking a look at my KIP! We were also concerned about
> > batches containing a single record so we kept this in mind for the
> > implementation. The decision to switch the sticky partition actually
> > involves returning from the record accumulator and assigning the new
> > partition before the new batch is created. That way all of the records
> will
> > go to this new partition's batch. If you would like to get a better look
> at
> > how this works, please check out the PR:
> > https://github.com/apache/kafka/pull/6997/files. The most important
> lines
> > are in the append method of the RecordAccumulator and doSend in
> > KafkaProducer.
>
> Thanks for the explanation.
>
> >
> > Colin, I think this makes sense to me except for the name
> > StickyRoundRobinPartitioner seems to not really explain the behavior of
> > what would be implemented. Perhaps a name indicating the sticky behavior
> is
> > always used, or that it will be used on keys is more descriptive. Calling
> > it "RoundRobin" seems a bit misleading to me.
>
> Hmm, what name would you propose here?
>
> Keep in mind we don't have to implement the new configurable partitioner
> in the initial PR :)
>
> best,
> Colin
>
> >
> > Thanks again for reviewing,
> > Justine
> >
> > On Thu, Jul 11, 2019 at 6:07 PM Jun Rao <j...@confluent.io> wrote:
> >
> > > Hi, Justine,
> > >
> > > Thanks for the KIP. Nice writeup and great results. Just one comment.
> > >
> > > 100. To add a record to the accumulator, the producer needs to know the
> > > partition id. The decision of whether the record can be added to the
> > > current batch is only made after the accumulator.append() call. So,
> when a
> > > batch is full, it seems that the KIP will try to append the next
> record to
> > > the same partition, which will trigger the creation of a new batch
> with a
> > > single record. After that, new records will be routed to a new
> partition.
> > > If the producer doesn't come back to the first partition in time, the
> > > producer will send a single record batch. In the worse case, it can be
> that
> > > every other batch has only a single record. Is this correct? If so,
> could
> > > we avoid that?
> > >
> > > Jun
> > >
> > > On Thu, Jul 11, 2019 at 5:23 PM Colin McCabe <cmcc...@apache.org>
> wrote:
> > >
> > > > Hi Justine,
> > > >
> > > > I agree that we shouldn't change RoundRobinPartitioner, since its
> > > behavior
> > > > is already specified.
> > > >
> > > > However, we could add a new, separate StickyRoundRobinPartitioner
> class
> > > to
> > > > KIP-480 which just implemented the sticky behavior regardless of
> whether
> > > > the key was null.  That seems pretty easy to add (and it wouldn't
> have to
> > > > implemented right away in the first PR, of course.)  It would be an
> > > option
> > > > for people who wanted to configure this behavior.
> > > >
> > > > best,
> > > > Colin
> > > >
> > > >
> > > > On Wed, Jul 10, 2019, at 08:48, Justine Olshan wrote:
> > > > > Hi M,
> > > > >
> > > > > I'm a little confused by what you mean by extending the behavior
> on to
> > > > the
> > > > > RoundRobinPartitioner.
> > > > > The sticky partitioner plans to remove the round-robin behavior
> from
> > > > > records with no keys. Instead of sending them to each partition in
> > > order,
> > > > > it sends them all to the same partition until the batch is sent.
> > > > > I don't think you can have both round-robin and sticky partition
> > > > behavior.
> > > > >
> > > > > Thank you,
> > > > > Justine Olshan
> > > > >
> > > > > On Wed, Jul 10, 2019 at 1:54 AM M. Manna <manme...@gmail.com>
> wrote:
> > > > >
> > > > > > Thanks for the comments Colin.
> > > > > >
> > > > > > My only concern is that this KIP is addressing a good feature and
> > > > having
> > > > > > that extended to RoundRobinPartitioner means 1 less KIP in the
> > > future.
> > > > > >
> > > > > > Would it be appropriate to extend the support to
> > > RoundRobinPartitioner
> > > > too?
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > On Tue, 9 Jul 2019 at 17:24, Colin McCabe <cmcc...@apache.org>
> > > wrote:
> > > > > >
> > > > > > > Hi M,
> > > > > > >
> > > > > > > The RoundRobinPartitioner added by KIP-369 doesn't interact
> with
> > > this
> > > > > > > KIP.  If you configure your producer to use
> RoundRobinPartitioner,
> > > > then
> > > > > > the
> > > > > > > DefaultPartitioner will not be used.  And the "sticky"
> behavior is
> > > > > > > implemented only in the DefaultPartitioner.
> > > > > > >
> > > > > > > regards,
> > > > > > > Colin
> > > > > > >
> > > > > > >
> > > > > > > On Tue, Jul 9, 2019, at 05:12, M. Manna wrote:
> > > > > > > > Hello Justine,
> > > > > > > >
> > > > > > > > I have one item I wanted to discuss.
> > > > > > > >
> > > > > > > > We are currently in review stage for KAFKA-3333 where we can
> > > choose
> > > > > > > always
> > > > > > > > RoundRobin regardless of null/usable key.
> > > > > > > >
> > > > > > > > If I understood this KIP motivation correctly, you are still
> > > > honouring
> > > > > > > how
> > > > > > > > the hashing of key works for DefaultPartitioner. Would you
> say
> > > that
> > > > > > > having
> > > > > > > > an always "Round-Robin" partitioning with "Sticky" assignment
> > > > > > (efficient
> > > > > > > > batching of records for a partition) doesn't deviate from
> your
> > > > original
> > > > > > > > intention?
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > >
> > > > > > > > On Tue, 9 Jul 2019 at 01:00, Justine Olshan <
> > > jols...@confluent.io>
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hello all,
> > > > > > > > >
> > > > > > > > > If there are no more comments or concerns, I would like to
> > > start
> > > > the
> > > > > > > vote
> > > > > > > > > on this tomorrow afternoon.
> > > > > > > > >
> > > > > > > > > However, if there are still topics to discuss, feel free to
> > > bring
> > > > > > them
> > > > > > > up
> > > > > > > > > now.
> > > > > > > > >
> > > > > > > > > Thank you,
> > > > > > > > > Justine
> > > > > > > > >
> > > > > > > > > On Tue, Jul 2, 2019 at 4:25 PM Justine Olshan <
> > > > jols...@confluent.io>
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hello again,
> > > > > > > > > >
> > > > > > > > > > Another update to the interface has been made to the KIP.
> > > > > > > > > > Please let me know if you have any feedback!
> > > > > > > > > >
> > > > > > > > > > Thank you,
> > > > > > > > > > Justine
> > > > > > > > > >
> > > > > > > > > > On Fri, Jun 28, 2019 at 2:52 PM Justine Olshan <
> > > > > > jols...@confluent.io
> > > > > > > >
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > >> Hi all,
> > > > > > > > > >> I made some changes to the KIP.
> > > > > > > > > >> The idea is to clean up the code, make behavior more
> > > explicit,
> > > > > > > provide
> > > > > > > > > >> more flexibility, and to keep default behavior the same.
> > > > > > > > > >>
> > > > > > > > > >> Now we will change the partition in onNewBatch, and
> specify
> > > > the
> > > > > > > > > >> conditions for this function call (non-keyed values, no
> > > > explicit
> > > > > > > > > >> partitions) in willCallOnNewBatch.
> > > > > > > > > >> This clears up some of the issues with the
> implementation.
> > > I'm
> > > > > > > happy to
> > > > > > > > > >> hear further opinions and discuss this change!
> > > > > > > > > >>
> > > > > > > > > >> Thank you,
> > > > > > > > > >> Justine
> > > > > > > > > >>
> > > > > > > > > >> On Thu, Jun 27, 2019 at 2:53 PM Colin McCabe <
> > > > cmcc...@apache.org>
> > > > > > > > > wrote:
> > > > > > > > > >>
> > > > > > > > > >>> On Thu, Jun 27, 2019, at 01:31, Ismael Juma wrote:
> > > > > > > > > >>> > Thanks for the KIP Justine. It looks pretty good. A
> few
> > > > > > comments:
> > > > > > > > > >>> >
> > > > > > > > > >>> > 1. Should we favor partitions that are not under
> > > > replicated?
> > > > > > > This is
> > > > > > > > > >>> > something that Netflix did too.
> > > > > > > > > >>>
> > > > > > > > > >>> This seems like it could lead to cascading failures,
> right?
> > > > If a
> > > > > > > > > >>> partition becomes under-replicated because there is too
> > > much
> > > > > > > traffic,
> > > > > > > > > the
> > > > > > > > > >>> producer stops sending to it, which puts even more
> load on
> > > > the
> > > > > > > > > remaining
> > > > > > > > > >>> partitions, which are even more likely to fail then,
> etc.
> > > It
> > > > > > also
> > > > > > > will
> > > > > > > > > >>> create unbalanced load patterns on the consumers.
> > > > > > > > > >>>
> > > > > > > > > >>> >
> > > > > > > > > >>> > 2. If there's no measurable performance difference, I
> > > agree
> > > > > > with
> > > > > > > > > >>> Stanislav
> > > > > > > > > >>> > that Optional would be better than Integer.
> > > > > > > > > >>> >
> > > > > > > > > >>> > 3. We should include the javadoc for the newly
> introduced
> > > > > > method
> > > > > > > that
> > > > > > > > > >>> > specifies it and its parameters. In particular, it
> would
> > > > good
> > > > > > to
> > > > > > > > > >>> specify if
> > > > > > > > > >>> > it gets called when an explicit partition id has been
> > > > provided.
> > > > > > > > > >>>
> > > > > > > > > >>> Agreed.
> > > > > > > > > >>>
> > > > > > > > > >>> best,
> > > > > > > > > >>> Colin
> > > > > > > > > >>>
> > > > > > > > > >>> >
> > > > > > > > > >>> > Ismael
> > > > > > > > > >>> >
> > > > > > > > > >>> > On Mon, Jun 24, 2019, 2:04 PM Justine Olshan <
> > > > > > > jols...@confluent.io>
> > > > > > > > > >>> wrote:
> > > > > > > > > >>> >
> > > > > > > > > >>> > > Hello,
> > > > > > > > > >>> > > This is the discussion thread for KIP-480: Sticky
> > > > > > Partitioner.
> > > > > > > > > >>> > >
> > > > > > > > > >>> > >
> > > > > > > > > >>> > >
> > > > > > > > > >>>
> > > > > > > > >
> > > > > > >
> > > > > >
> > > >
> > >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-480%3A+Sticky+Partitioner
> > > > > > > > > >>> > >
> > > > > > > > > >>> > > Thank you,
> > > > > > > > > >>> > > Justine Olshan
> > > > > > > > > >>> > >
> > > > > > > > > >>> >
> > > > > > > > > >>>
> > > > > > > > > >>
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Reply via email to