Why shouldn't it be 5 minutes? ;-)
It is a finger-in-the-air number. The testing I did shows that there isn't
much, if any, overhead when checkpointing a single store on the commit
interval. The default commit interval is 30 seconds, so it could possibly
be set to that. However, I'd prefer to be a little conservative, so
5 minutes seemed reasonable.
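
To make the arithmetic concrete, here is a minimal Python sketch (not Kafka
Streams code). It assumes, as the KIP text quoted further down says, that a
checkpoint can only be written on a commit, so the effective checkpoint
interval rounds up to a multiple of the commit interval. The function name
and its parameters are hypothetical, and requiring a positive commit
interval is a simplification for this sketch.

```python
def effective_checkpoint_interval_ms(commit_interval_ms: int,
                                     checkpoint_interval_ms: int) -> int:
    """Round the requested checkpoint interval up to a whole number of commits.

    Assumption (from the KIP text quoted in this thread): checkpoints are only
    written as part of a commit, so the actual interval is a multiple of the
    commit interval and never smaller than it.
    """
    if commit_interval_ms <= 0 or checkpoint_interval_ms <= 0:
        # Simplification for this sketch: both values must be positive.
        raise ValueError("both intervals must be > 0")
    # Integer ceiling division: number of commits between two checkpoints.
    commits_per_checkpoint = -(-checkpoint_interval_ms // commit_interval_ms)
    return commits_per_checkpoint * commit_interval_ms


# 30 s commit interval with a 5 minute checkpoint interval:
# a checkpoint on every 10th commit.
default_case = effective_checkpoint_interval_ms(30_000, 300_000)

# A checkpoint interval that is not a multiple of the commit interval
# is rounded up to the next commit.
uneven_case = effective_checkpoint_interval_ms(10_000, 25_000)
```

With the values discussed here, a 30 second commit interval and a 5 minute
checkpoint interval line up exactly, so the store would be checkpointed on
every tenth commit.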


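For reference, the SimpleBenchmark figures quoted below can be compared with
a few lines of Python. This is only a sketch over the numbers as posted: it
recomputes records per second from the record count and the latency column
(taken to be milliseconds, which matches the posted rec-sec values), and the
variable names are illustrative.

```python
from statistics import mean

RECORDS = 10_000_000

# Latency column (ms) of the five posted runs for each setting, as posted.
latency_no_checkpoint_ms = [34798, 35942, 34677, 34677, 31192]
latency_checkpoint_ms = [36997, 32087, 32895, 33476, 33196]


def records_per_second(latency_ms: int) -> float:
    """Throughput implied by processing RECORDS records in latency_ms."""
    return RECORDS / latency_ms * 1000


mean_no_checkpoint = mean(records_per_second(l) for l in latency_no_checkpoint_ms)
mean_checkpoint = mean(records_per_second(l) for l in latency_checkpoint_ms)

# Relative difference of the checkpointing runs versus the baseline.
relative_difference = (mean_checkpoint - mean_no_checkpoint) / mean_no_checkpoint

print(f"no checkpoint: {mean_no_checkpoint:.0f} rec/sec")
print(f"checkpoint:    {mean_checkpoint:.0f} rec/sec")
print(f"difference:    {relative_difference:+.2%}")
```

The two means differ by less than 2%, with the checkpointing runs slightly
ahead, which is well within what run-to-run noise could explain for five
runs each.
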
On Thu, 9 Feb 2017 at 10:25 Michael Noll <mich...@confluent.io> wrote:

> Damian,
>
> could you elaborate briefly why the default value should be 5 minutes?
> What are the considerations, assumptions, etc. that go into picking this
> value?
>
> Right now, in the KIP and in this discussion, "5 mins" looks like a magic
> number to me. :-)
>
> -Michael
>
>
>
> On Thu, Feb 9, 2017 at 11:03 AM, Damian Guy <damian....@gmail.com> wrote:
>
> > I've run the SimpleBenchmark with checkpointing on and off to see what the
> > impact is. It appears that there is very little impact, if any. The numbers
> > with checkpointing on actually look better, but that is likely largely due
> > to external influences.
> >
> > In any case, I'm going to suggest we go with a default checkpoint interval
> > of 5 minutes. I've updated the KIP with this.
> >
> > commit every 10 seconds (no checkpoint)
> > Streams Performance [records/latency/rec-sec/MB-sec source+store]:
> > 10000000/34798/287372.83751939767/29.570664980746017
> > Streams Performance [records/latency/rec-sec/MB-sec source+store]:
> > 10000000/35942/278226.0308274442/28.62945857214401
> > Streams Performance [records/latency/rec-sec/MB-sec source+store]:
> > 10000000/34677/288375.58035585546/29.673847218617528
> > Streams Performance [records/latency/rec-sec/MB-sec source+store]:
> > 10000000/34677/288375.58035585546/29.673847218617528
> > Streams Performance [records/latency/rec-sec/MB-sec source+store]:
> > 10000000/31192/320595.02436522185/32.98922800718133
> >
> >
> > checkpoint every 10 seconds (same as commit interval)
> > Streams Performance [records/latency/rec-sec/MB-sec source+store]:
> > 10000000/36997/270292.185852907/27.81306592426413
> > Streams Performance [records/latency/rec-sec/MB-sec source+store]:
> > 10000000/32087/311652.69423754164/32.069062237043035
> > Streams Performance [records/latency/rec-sec/MB-sec source+store]:
> > 10000000/32895/303997.5680194558/31.281349749202004
> > Streams Performance [records/latency/rec-sec/MB-sec source+store]:
> > 10000000/33476/298721.4720994145/30.738439479029754
> > Streams Performance [records/latency/rec-sec/MB-sec source+store]:
> > 10000000/33196/301241.1133871551/30.99771056753826
> >
> > On Wed, 8 Feb 2017 at 09:02 Damian Guy <damian....@gmail.com> wrote:
> >
> > > Matthias,
> > >
> > > Fair point. I'll update the KIP.
> > > Thanks
> > >
> > > On Wed, 8 Feb 2017 at 05:49 Matthias J. Sax <matth...@confluent.io> wrote:
> > >
> > > Damian,
> > >
> > > I am not strict about it either. However, if there is no advantage in
> > > disabling it, we might not want to allow it. This would have the
> > > advantage of guarding users against accidentally switching it off.
> > >
> > > -Matthias
> > >
> > >
> > > On 2/3/17 2:03 AM, Damian Guy wrote:
> > > > Hi Matthias,
> > > >
> > > > It possibly doesn't make sense to disable it, but then I'm sure someone
> > > > will come up with a reason they don't want it!
> > > > I'm happy to change it such that the checkpoint interval must be > 0.
> > > >
> > > > Cheers,
> > > > Damian
> > > >
> > > > On Fri, 3 Feb 2017 at 01:29 Matthias J. Sax <matth...@confluent.io> wrote:
> > > >
> > > >> Thanks Damian.
> > > >>
> > > >> One more question: "Checkpointing is disabled if the checkpoint interval
> > > >> is set to a value <=0."
> > > >>
> > > >>
> > > >> Does it make sense to disable checkpointing? What's the tradeoff here?
> > > >>
> > > >>
> > > >> -Matthias
> > > >>
> > > >>
> > > >> On 2/2/17 1:51 AM, Damian Guy wrote:
> > > >>> Hi Matthias,
> > > >>>
> > > >>> Thanks for the comments.
> > > >>>
> > > >>> 1. TBD - I need to do some performance tests and try to work out a
> > > >>> sensible default.
> > > >>> 2. Yes, you are correct. It could be a multiple of the commit.interval.ms.
> > > >>> But that would also mean that if you change the commit interval (say you
> > > >>> lower it), then you might also need to change the checkpoint setting (i.e.,
> > > >>> you still only want to checkpoint every n minutes).
> > > >>>
> > > >>> On Wed, 1 Feb 2017 at 23:46 Matthias J. Sax <matth...@confluent.io> wrote:
> > > >>>
> > > >>>> Thanks for the KIP Damian.
> > > >>>>
> > > >>>> I am wondering about two things:
> > > >>>>
> > > >>>> 1. what should be the default value for the new parameter?
> > > >>>> 2. why is the new parameter provided in ms?
> > > >>>>
> > > >>>> About (2): because
> > > >>>>
> > > >>>> "the minimum checkpoint interval will be the value of
> > > >>>> commit.interval.ms. In effect the actual checkpoint interval will be a
> > > >>>> multiple of the commit interval"
> > > >>>>
> > > >>>> it might be easier to just use a parameter that is
> > > >>>> "number-of-commit-intervals".
> > > >>>>
> > > >>>>
> > > >>>> -Matthias
> > > >>>>
> > > >>>>
> > > >>>> On 2/1/17 7:29 AM, Damian Guy wrote:
> > > >>>>> Thanks for the comments Eno.
> > > >>>>> As for exactly once, I don't believe this matters as we are just restoring
> > > >>>>> the change-log, i.e., the result of the aggregations that previously ran
> > > >>>>> etc. So once initialized the state store will be in the same state as it
> > > >>>>> was before.
> > > >>>>> Having the checkpoint in a Kafka topic is not ideal as the state is per
> > > >>>>> Kafka Streams instance. So each instance would need to start with a unique
> > > >>>>> id that is persistent.
> > > >>>>>
> > > >>>>> Cheers,
> > > >>>>> Damian
> > > >>>>>
> > > >>>>> On Wed, 1 Feb 2017 at 13:20 Eno Thereska <eno.there...@gmail.com> wrote:
> > > >>>>>
> > > >>>>>> As a follow up to my previous comment, have you thought about writing the
> > > >>>>>> checkpoint to a topic instead of a local file? That would have the
> > > >>>>>> advantage that all metadata continues to be managed by Kafka, as well as
> > > >>>>>> fit with EoS. The potential disadvantage would be higher latency; however,
> > > >>>>>> if it is periodic as you mention, I'm not sure that would be a show stopper.
> > > >>>>>>
> > > >>>>>> Thanks
> > > >>>>>> Eno
> > > >>>>>>> On 1 Feb 2017, at 12:58, Eno Thereska <eno.there...@gmail.com> wrote:
> > > >>>>>>>
> > > >>>>>>> Thanks Damian, this is a good idea and will reduce the restore time.
> > > >>>>>>> Looking forward, with exactly once and support for transactions in Kafka, I
> > > >>>>>>> believe we'll have to add some support for rolling back checkpoints, e.g.,
> > > >>>>>>> when a transaction is aborted. We need to be aware of that and ideally
> > > >>>>>>> anticipate those needs a bit in the KIP.
> > > >>>>>>>
> > > >>>>>>> Thanks
> > > >>>>>>> Eno
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>>> On 1 Feb 2017, at 10:18, Damian Guy <damian....@gmail.com> wrote:
> > > >>>>>>>>
> > > >>>>>>>> Hi all,
> > > >>>>>>>>
> > > >>>>>>>> I would like to start the discussion on KIP-116:
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-116+-+Add+State+Store+Checkpoint+Interval+Configuration
> > > >>>>>>>>
> > > >>>>>>>> Thanks,
> > > >>>>>>>> Damian
> > > >>>>>>>
> > > >>>>>>
> > > >>>>>>
> > > >>>>>
> > > >>>>
> > > >>>>
> > > >>>
> > > >>
> > > >>
> > > >
> > >
> > >
> >
>
