Re: [DISCUSS] KIP-607: Add Metrics to Record the Memory Used by RocksDB to Kafka Streams

Guozhang Wang Wed, 13 May 2020 15:25:41 -0700

Hi Bruno,

Sounds good to me.


I think I'm just a bit more curious to see its impact on performance: as
long as we have one INFO level rocksDB metrics, then we'd have to turn on
the scheduled rocksdb metrics recorder whereas previously, we can decide to
not turn on the recorder at all if all are set as DEBUG and we configure at
INFO level in production. But this is an implementation detail anyways and
maybe the impact is negligible after all. We can check and re-discuss this
afterwards :)


Guozhang


On Wed, May 13, 2020 at 9:34 AM Sophie Blee-Goldman <sop...@confluent.io>
wrote:

> Thanks Bruno! I took a look at the revised KIP and it looks good to me.
>
> Sophie
>
> On Wed, May 13, 2020 at 6:59 AM Bruno Cadonna <br...@confluent.io> wrote:
>
> > Hi John,
> >
> > Thank you for the feedback!
> >
> > I agree and I will change the KIP as I stated in my previous e-mail to
> > Guozhang.
> >
> > Best,
> > Bruno
> >
> > On Tue, May 12, 2020 at 3:07 AM John Roesler <vvcep...@apache.org>
> wrote:
> > >
> > > Thanks, all.
> > >
> > > If you don’t mind, I’ll pitch in a few cents’ worth.
> > >
> > > In my life I’ve generally found more granular metrics to be more
> useful,
> > as long as there’s a sane way to roll them up. It does seem nice to see
> it
> > on the per-store level. For roll-up purposes, the task and thread tags
> > should be sufficient.
> > >
> > > I think the only reason we make some metrics Debug is that _recording_
> > them can be expensive. If there’s no added expense, I think we can just
> > register store-level metrics at Info level.
> > >
> > > Thanks for the KIP, Bruno!
> > > -John
> > >
> > > On Mon, May 11, 2020, at 17:32, Guozhang Wang wrote:
> > > > Hello Sophie / Bruno,
> > > >
> > > > I've also thought about the leveling question, and one motivation I
> > had for
> > > > setting it in instance-level is that we want to expose it in INFO
> > level:
> > > > today our report leveling is not very finer grained --- which I think
> > is
> > > > sth. worth itself --- such that one have to either turn on all DEBUG
> > > > metrics recording or none of them. If we can allow users to e.g.
> > specify
> > > > "turn on streams-metrics and stream-state-metrics, but not others"
> and
> > then
> > > > I think it should be just at store-level. However, right now if we
> > want to
> > > > set it as store-level then it would be DEBUG and not exposed by
> > default.
> > > >
> > > > So it seems we have several options in addition to the proposed one:
> > > >
> > > > a) we set it at store-level as INFO; but then one can argue why this
> is
> > > > INFO while others (bytes-written, etc) are DEBUG.
> > > > b) we set it at store-level as DEBUG, believing that we do not
> usually
> > need
> > > > to turn it on.
> > > > c) maybe, we can set it at task-level (? I'm not so sure myself about
> > > > this.. :P) as INFO.
> > > >
> > > >
> > > > Guozhang
> > > >
> > > >
> > > >
> > > >
> > > > On Mon, May 11, 2020 at 12:29 PM Sophie Blee-Goldman <
> > sop...@confluent.io>
> > > > wrote:
> > > >
> > > > > Hey Bruno,
> > > > >
> > > > > Thanks for the KIP! I have one high-level concern, which is that we
> > should
> > > > > consider
> > > > > reporting these metrics on the per-store level rather than
> > instance-wide. I
> > > > > know I was
> > > > > the one who first proposed making it instance-wide, so bear with
> me:
> > > > >
> > > > > While I would still argue that the instance-wide memory usage is
> > probably
> > > > > the most *useful*,
> > > > > exposing them at the store-level does not prevent users from
> > monitoring the
> > > > > instance-wide
> > > > > memory. They should be able to roll up all the store-level metrics
> > on an
> > > > > instance to
> > > > > compute the total off-heap memory. But rolling it up for the users
> > does
> > > > > prevent them from
> > > > > using this to debug rare cases where one store may be using
> > significantly
> > > > > more memory than
> > > > > expected.
> > > > >
> > > > > It's also worth considering that some users may be using the
> bounded
> > memory
> > > > > config setter
> > > > > to put a cap on the off-heap memory of the entire process, in which
> > case
> > > > > the memory usage
> > > > > metric for any one store should reflect the memory usage of the
> > entire
> > > > > instance. In that case
> > > > > any effort to roll up the memory usages ourselves would just be
> > wasted.
> > > > >
> > > > > Sorry for the reversal, but after a second thought I'm pretty
> > strongly in
> > > > > favor of reporting these
> > > > > at the store level.
> > > > >
> > > > > Best,
> > > > > Sophie
> > > > >
> > > > > On Wed, May 6, 2020 at 8:41 AM Bruno Cadonna <br...@confluent.io>
> > wrote:
> > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > I'd like to discuss KIP-607 that aims to add RocksDB memory usage
> > > > > > metrics to Kafka Streams.
> > > > > >
> > > > > >
> > > > > >
> > > > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-607%3A+Add+Metrics+to+Record+the+Memory+Used+by+RocksDB+to+Kafka+Streams
> > > > > >
> > > > > > Best,
> > > > > > Bruno
> > > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > > -- Guozhang
> > > >
> >
>


-- 
-- Guozhang

Re: [DISCUSS] KIP-607: Add Metrics to Record the Memory Used by RocksDB to Kafka Streams

Reply via email to