Hi Bruno,

I just read the KIP. I think this feature is great. As far as I know, most
Kafka users monitor the host resources, JVM resources, and Kafka metrics
only, not RocksDB for configuring the statistics feature is a little bit
tiresome. Since RocksDB impacts the performance of Kafka Streams, I believe
providing these metrics with other metrics in one place is much better.

However, I am a little bit not assured for how much information should be
provided to the users with the documentation - how the user can control the
 RocksDB may on the boundary of the scope. How do you think?

+1. I entirely agree to Bill's comments; I also think `rocksdb-store-id` is
better than `rocksdb-state-id` and metrics on total compactions and an
average number of compactions is also needed.

Regards,
Dongjin

On Tue, May 21, 2019 at 2:48 AM John Roesler <j...@confluent.io> wrote:

> Hi Bruno,
>
> Looks really good overall. This is going to be an awesome addition.
>
> My only thought was that we have "bytes-flushed-(rate|total) and
> flush-time-(avg|min|max)" metrics, and the description states that
> these are specifically for Memtable flush operations. What do you
> think about calling it "memtable-bytes-flushed... (etc)"? On one hand,
> I could see this being redundant, since that's the only thing that
> gets "flushed" inside of Rocks. But on the other, we have an
> independent "flush" operation in streams, which might be confusing.
> Plus, it might help people who are looking at the full "menu" of
> hundreds of metrics. They can't read and remember every description
> while trying to understand the full list of metrics, so going for
> maximum self-documentation in the name seems nice.
>
> But that's a minor thought. Modulo the others' comments, this looks good
> to me.
>
> Thanks,
> -John
>
> On Mon, May 20, 2019 at 11:07 AM Bill Bejeck <bbej...@gmail.com> wrote:
> >
> > Hi Bruno,
> >
> > Thanks for the KIP, this will be a useful addition.
> >
> > Overall the KIP looks good to me, and I have two minor comments.
> >
> > 1. For the tags should, I'm wondering if rocksdb-state-id should be
> > rocksdb-store-id
> > instead?
> >
> > 2. With the compaction metrics, would it be possible to add total
> > compactions and an average number of compactions?  I've taken a look at
> the
> > available RocksDB metrics, and I'm not sure.  But users can control how
> > many L0 files it takes to trigger compaction so if it is possible; it may
> > be useful.
> >
> > Thanks,
> > Bill
> >
> >
> > On Mon, May 20, 2019 at 9:15 AM Bruno Cadonna <br...@confluent.io>
> wrote:
> >
> > > Hi Sophie,
> > >
> > > Thank you for your comments.
> > >
> > > It's a good idea to supplement the metrics with configuration option
> > > to change the metrics. I also had some thoughts about it. However, I
> > > think I need some experimentation to get this right.
> > >
> > > I added the block cache hit rates for index and filter blocks to the
> > > KIP. As far as I understood, they should stay at zero, if users do not
> > > configure RocksDB to include index and filter blocks into the block
> > > cache. Did you also understand this similarly? I guess also in this
> > > case some experimentation would be good to be sure.
> > >
> > > Best,
> > > Bruno
> > >
> > >
> > > On Sat, May 18, 2019 at 2:29 AM Sophie Blee-Goldman <
> sop...@confluent.io>
> > > wrote:
> > > >
> > > > Actually I wonder if it might be useful to users to be able to break
> up
> > > the
> > > > cache hit stats by type? Some people may choose to store index and
> filter
> > > > blocks alongside data blocks, and it would probably be very helpful
> for
> > > > them to know who is making more effective use of the cache in order
> to
> > > tune
> > > > how much of it is allocated to each. I'm not sure how common this
> really
> > > is
> > > > but I think it would be invaluable to those who do. RocksDB
> performance
> > > can
> > > > be quite opaque..
> > > >
> > > > Cheers,
> > > > Sophie
> > > >
> > > > On Fri, May 17, 2019 at 5:01 PM Sophie Blee-Goldman <
> sop...@confluent.io
> > > >
> > > > wrote:
> > > >
> > > > > Hey Bruno!
> > > > >
> > > > > This all looks pretty good to me, but one suggestion I have is to
> > > > > supplement each of the metrics with some info on how the user can
> > > control
> > > > > them. In other words, which options could/should they set in
> > > > > RocksDBConfigSetter should they discover a particular bottleneck?
> > > > >
> > > > > I don't think this necessarily needs to go into the KIP, but I do
> > > think it
> > > > > should be included in the docs somewhere (happy to help build up
> the
> > > list
> > > > > of associated options when the time comes)
> > > > >
> > > > > Thanks!
> > > > > Sophie
> > > > >
> > > > > On Fri, May 17, 2019 at 2:54 PM Bruno Cadonna <br...@confluent.io>
> > > wrote:
> > > > >
> > > > >> Hi all,
> > > > >>
> > > > >> this KIP describes the extension of the Kafka Streams' metrics to
> > > include
> > > > >> RocksDB's internal statistics.
> > > > >>
> > > > >> Please have a look at it and let me know what you think. Since I
> am
> > > not a
> > > > >> RocksDB expert, I am thankful for any additional pair of eyes that
> > > > >> evaluates this KIP.
> > > > >>
> > > > >>
> > > > >>
> > >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-471:+Expose+RocksDB+Metrics+in+Kafka+Streams
> > > > >>
> > > > >> Best regards,
> > > > >> Bruno
> > > > >>
> > > > >
> > >
>


-- 
*Dongjin Lee*

*A hitchhiker in the mathematical world.*
*github:  <http://goog_969573159/>github.com/dongjinleekr
<https://github.com/dongjinleekr>linkedin: kr.linkedin.com/in/dongjinleekr
<https://kr.linkedin.com/in/dongjinleekr>speakerdeck: speakerdeck.com/dongjin
<https://speakerdeck.com/dongjin>*

Reply via email to