Thanks, all. If you don’t mind, I’ll pitch in a few cents’ worth.
In my life I’ve generally found more granular metrics to be more useful, as long as there’s a sane way to roll them up. It does seem nice to see it on the per-store level. For roll-up purposes, the task and thread tags should be sufficient. I think the only reason we make some metrics Debug is that _recording_ them can be expensive. If there’s no added expense, I think we can just register store-level metrics at Info level. Thanks for the KIP, Bruno! -John On Mon, May 11, 2020, at 17:32, Guozhang Wang wrote: > Hello Sophie / Bruno, > > I've also thought about the leveling question, and one motivation I had for > setting it in instance-level is that we want to expose it in INFO level: > today our report leveling is not very finer grained --- which I think is > sth. worth itself --- such that one have to either turn on all DEBUG > metrics recording or none of them. If we can allow users to e.g. specify > "turn on streams-metrics and stream-state-metrics, but not others" and then > I think it should be just at store-level. However, right now if we want to > set it as store-level then it would be DEBUG and not exposed by default. > > So it seems we have several options in addition to the proposed one: > > a) we set it at store-level as INFO; but then one can argue why this is > INFO while others (bytes-written, etc) are DEBUG. > b) we set it at store-level as DEBUG, believing that we do not usually need > to turn it on. > c) maybe, we can set it at task-level (? I'm not so sure myself about > this.. :P) as INFO. > > > Guozhang > > > > > On Mon, May 11, 2020 at 12:29 PM Sophie Blee-Goldman <sop...@confluent.io> > wrote: > > > Hey Bruno, > > > > Thanks for the KIP! I have one high-level concern, which is that we should > > consider > > reporting these metrics on the per-store level rather than instance-wide. I > > know I was > > the one who first proposed making it instance-wide, so bear with me: > > > > While I would still argue that the instance-wide memory usage is probably > > the most *useful*, > > exposing them at the store-level does not prevent users from monitoring the > > instance-wide > > memory. They should be able to roll up all the store-level metrics on an > > instance to > > compute the total off-heap memory. But rolling it up for the users does > > prevent them from > > using this to debug rare cases where one store may be using significantly > > more memory than > > expected. > > > > It's also worth considering that some users may be using the bounded memory > > config setter > > to put a cap on the off-heap memory of the entire process, in which case > > the memory usage > > metric for any one store should reflect the memory usage of the entire > > instance. In that case > > any effort to roll up the memory usages ourselves would just be wasted. > > > > Sorry for the reversal, but after a second thought I'm pretty strongly in > > favor of reporting these > > at the store level. > > > > Best, > > Sophie > > > > On Wed, May 6, 2020 at 8:41 AM Bruno Cadonna <br...@confluent.io> wrote: > > > > > Hi all, > > > > > > I'd like to discuss KIP-607 that aims to add RocksDB memory usage > > > metrics to Kafka Streams. > > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-607%3A+Add+Metrics+to+Record+the+Memory+Used+by+RocksDB+to+Kafka+Streams > > > > > > Best, > > > Bruno > > > > > > > > -- > -- Guozhang >