Hi Ismael,

Yes, it makes sense to do benchmark. My concern was based on the
observation in KAFKA-3994 where we saw GC problem when creating new lists
in the purgatory.

Thanks,

Jiangjie (Becket) Qin


On Fri, Mar 10, 2017 at 8:54 AM, Ismael Juma <ism...@juma.me.uk> wrote:

> Hi Becket,
>
> Sorry for the delay and thanks for your comments. Comments inline.
>
> On Wed, Mar 1, 2017 at 8:59 PM, Becket Qin <becket....@gmail.com> wrote:
>
> > > The difference is that the user chooses the value type. They are free
> to
> > > choose a mutable or immutable type. A generic interceptor cannot mutate
> > the
> > > value because it doesn't know the type (and it could be immutable). One
> > > could write an interceptor that checked the type of the value at
> runtime
> > > and did things based on that. But again, the user who creates the
> record
> > is
> > > in control.
> > >
> > But there is no generic interceptor, right? The interceptor always takes
> > specific K, V type.
> >
>
> No, generic interceptors exist, are very useful and we use them at
> Confluent.
>
> If we want to let the users control the mutability, users can always call
> > headers.close() before calling producer.send() and that will force the
> > interceptor to create new ProducerRecord object.
> >
>
> If we expect users to call `close()`, then this is an important point to
> document because interceptors would have to catch an exception or check if
> the record has already been closed. I'm not sure if we should allow or
> encourage that.
>
>
> > If we do not allow the users to add headers on existing ProducerRecord
> > objects, each interceptor who wants to add headers will have to create a
> > new ProducerRecord. So we will have to create NUM_INTERCEPTOR times of
> > ProducerRecord, if a producer is sending 100K messages per second, it
> would
> > be a lot of new objects, right?
> >
>
> Note we are allocating one Header instance per header. If you are concerned
> about the number of ProducerRecord instances, then it seems like you should
> also be concerned about the number of Header instances, which is likely to
> be a multiple of the former. In addition, if we use a Map to hold the
> headers, we'll also need to allocate the map entry instances. This is not
> to say that we should be unnecessarily wasteful, but it's worth
> understanding what we're trying to achieve to make sure that we are not
> solving the wrong bottleneck.
>
> It may make sense to do some benchmarking during the implementation phase
> to verify that the solution achieves whatever performance goal we're aiming
> for.
>
> Thanks,
> Ismael
>

Reply via email to