Thanks for the KIP, Bruno. Having sensible, easy to access RocksDB memory reporting will be a welcomed addition.
FWIW I also agree to have the metrics reported on a store level. I'm glad you changed the KIP to that effect. -Bill On Wed, May 13, 2020 at 6:24 PM Guozhang Wang <wangg...@gmail.com> wrote: > Hi Bruno, > > Sounds good to me. > > I think I'm just a bit more curious to see its impact on performance: as > long as we have one INFO level rocksDB metrics, then we'd have to turn on > the scheduled rocksdb metrics recorder whereas previously, we can decide to > not turn on the recorder at all if all are set as DEBUG and we configure at > INFO level in production. But this is an implementation detail anyways and > maybe the impact is negligible after all. We can check and re-discuss this > afterwards :) > > > Guozhang > > > On Wed, May 13, 2020 at 9:34 AM Sophie Blee-Goldman <sop...@confluent.io> > wrote: > > > Thanks Bruno! I took a look at the revised KIP and it looks good to me. > > > > Sophie > > > > On Wed, May 13, 2020 at 6:59 AM Bruno Cadonna <br...@confluent.io> > wrote: > > > > > Hi John, > > > > > > Thank you for the feedback! > > > > > > I agree and I will change the KIP as I stated in my previous e-mail to > > > Guozhang. > > > > > > Best, > > > Bruno > > > > > > On Tue, May 12, 2020 at 3:07 AM John Roesler <vvcep...@apache.org> > > wrote: > > > > > > > > Thanks, all. > > > > > > > > If you don’t mind, I’ll pitch in a few cents’ worth. > > > > > > > > In my life I’ve generally found more granular metrics to be more > > useful, > > > as long as there’s a sane way to roll them up. It does seem nice to see > > it > > > on the per-store level. For roll-up purposes, the task and thread tags > > > should be sufficient. > > > > > > > > I think the only reason we make some metrics Debug is that > _recording_ > > > them can be expensive. If there’s no added expense, I think we can just > > > register store-level metrics at Info level. > > > > > > > > Thanks for the KIP, Bruno! > > > > -John > > > > > > > > On Mon, May 11, 2020, at 17:32, Guozhang Wang wrote: > > > > > Hello Sophie / Bruno, > > > > > > > > > > I've also thought about the leveling question, and one motivation I > > > had for > > > > > setting it in instance-level is that we want to expose it in INFO > > > level: > > > > > today our report leveling is not very finer grained --- which I > think > > > is > > > > > sth. worth itself --- such that one have to either turn on all > DEBUG > > > > > metrics recording or none of them. If we can allow users to e.g. > > > specify > > > > > "turn on streams-metrics and stream-state-metrics, but not others" > > and > > > then > > > > > I think it should be just at store-level. However, right now if we > > > want to > > > > > set it as store-level then it would be DEBUG and not exposed by > > > default. > > > > > > > > > > So it seems we have several options in addition to the proposed > one: > > > > > > > > > > a) we set it at store-level as INFO; but then one can argue why > this > > is > > > > > INFO while others (bytes-written, etc) are DEBUG. > > > > > b) we set it at store-level as DEBUG, believing that we do not > > usually > > > need > > > > > to turn it on. > > > > > c) maybe, we can set it at task-level (? I'm not so sure myself > about > > > > > this.. :P) as INFO. > > > > > > > > > > > > > > > Guozhang > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, May 11, 2020 at 12:29 PM Sophie Blee-Goldman < > > > sop...@confluent.io> > > > > > wrote: > > > > > > > > > > > Hey Bruno, > > > > > > > > > > > > Thanks for the KIP! I have one high-level concern, which is that > we > > > should > > > > > > consider > > > > > > reporting these metrics on the per-store level rather than > > > instance-wide. I > > > > > > know I was > > > > > > the one who first proposed making it instance-wide, so bear with > > me: > > > > > > > > > > > > While I would still argue that the instance-wide memory usage is > > > probably > > > > > > the most *useful*, > > > > > > exposing them at the store-level does not prevent users from > > > monitoring the > > > > > > instance-wide > > > > > > memory. They should be able to roll up all the store-level > metrics > > > on an > > > > > > instance to > > > > > > compute the total off-heap memory. But rolling it up for the > users > > > does > > > > > > prevent them from > > > > > > using this to debug rare cases where one store may be using > > > significantly > > > > > > more memory than > > > > > > expected. > > > > > > > > > > > > It's also worth considering that some users may be using the > > bounded > > > memory > > > > > > config setter > > > > > > to put a cap on the off-heap memory of the entire process, in > which > > > case > > > > > > the memory usage > > > > > > metric for any one store should reflect the memory usage of the > > > entire > > > > > > instance. In that case > > > > > > any effort to roll up the memory usages ourselves would just be > > > wasted. > > > > > > > > > > > > Sorry for the reversal, but after a second thought I'm pretty > > > strongly in > > > > > > favor of reporting these > > > > > > at the store level. > > > > > > > > > > > > Best, > > > > > > Sophie > > > > > > > > > > > > On Wed, May 6, 2020 at 8:41 AM Bruno Cadonna <br...@confluent.io > > > > > wrote: > > > > > > > > > > > > > Hi all, > > > > > > > > > > > > > > I'd like to discuss KIP-607 that aims to add RocksDB memory > usage > > > > > > > metrics to Kafka Streams. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-607%3A+Add+Metrics+to+Record+the+Memory+Used+by+RocksDB+to+Kafka+Streams > > > > > > > > > > > > > > Best, > > > > > > > Bruno > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > -- Guozhang > > > > > > > > > > > > > -- > -- Guozhang >