Hello Bruno,

I've read through the aggregation section and it looks good to me. I have a few minor comments about the wiki page itself:

1) "A state store might consist of multiple state stores" -> Do you mean that a `logical` state store may consist of multiple `physical` store instances?

2) The "Hit Rates" calculation seems to refer to the `Hit Ratio` (which is a percentage) rather than the `Hit Rate`?

And a couple of further meta comments:

1) For the memtable / block cache, do you think we should expose the hit-ratio instead of the hit-rate? I feel it is more useful for users debugging the root cause of unexpectedly slow performance. And for block cache misses, would it be easy to provide a "target read" metric showing where a read is served from (which level, and whether from the OS cache or from SST files), similar to Fig. 11 in http://cidrdb.org/cidr2017/papers/p82-dong-cidr17.pdf?

2) As @Patrik mentioned, is there a good way to expose the total memory and disk usage of each state store as well? I think it would also be very helpful for users to understand their capacity needs and read / write amplifications.

Guozhang

On Fri, Jun 14, 2019 at 6:55 AM Bruno Cadonna <br...@confluent.io> wrote:

> Hi,
>
> I decided to go for the option in which metrics are exposed for each
> logical state store. I revisited the KIP correspondingly and added a
> section on how to aggregate metrics over multiple physical RocksDB
> instances within one logical state store. Would be great, if you could
> take a look and give feedback. If nobody has complaints about the
> chosen option I would proceed with voting on this KIP since this was
> the last open question.
>
> Best,
> Bruno
>
> On Fri, Jun 7, 2019 at 9:38 PM Patrik Kleindl <pklei...@gmail.com> wrote:
> >
> > Hi Sophie
> > This will be a good change, I have been thinking about proposing
> > something similar or even passing the properties per store.
> > RocksDB should probably know how much memory was reserved but maybe
> > does not expose it.
> > We are limiting it already as you suggested but this is a rather
> > crude tool.
> > Especially in a larger topology with mixed loads per topic it would
> > be helpful to get more insights which store puts a lot of load on
> > memory.
> > Regarding the limiting capability, I think I remember reading that
> > those only affect some parts of the memory and others can still
> > exceed this limit. I'll try to look up the difference.
> > Best regards
> > Patrik
> >
> > > On 07.06.2019 at 21:03, Sophie Blee-Goldman <sop...@confluent.io> wrote:
> > >
> > > Hi Patrik,
> > >
> > > As of 2.3 you will be able to use the RocksDBConfigSetter to
> > > effectively bound the total memory used by RocksDB for a single app
> > > instance. You should already be able to limit the memory used per
> > > rocksdb store, though as you mention there can be a lot of them.
> > > I'm not sure you can monitor the memory usage if you are not
> > > limiting it though.
> > >
> > >> On Fri, Jun 7, 2019 at 2:06 AM Patrik Kleindl <pklei...@gmail.com> wrote:
> > >>
> > >> Hi
> > >> Thanks Bruno for the KIP, this is a very good idea.
> > >>
> > >> I have one question, are there metrics available for the memory
> > >> consumption of RocksDB?
> > >> As they are running outside the JVM we have run into issues
> > >> because they were using all the other memory.
> > >> And with multiple streams applications on the same machine, each
> > >> with several KTables and 10+ partitions per topic the number of
> > >> stores can get out of hand pretty easily.
> > >> Or did I miss something obvious how those can be monitored better?
> > >>
> > >> best regards
> > >>
> > >> Patrik
> > >>
> > >>> On Fri, 17 May 2019 at 23:54, Bruno Cadonna <br...@confluent.io> wrote:
> > >>>
> > >>> Hi all,
> > >>>
> > >>> this KIP describes the extension of the Kafka Streams' metrics to
> > >>> include RocksDB's internal statistics.
> > >>>
> > >>> Please have a look at it and let me know what you think. Since I
> > >>> am not a RocksDB expert, I am thankful for any additional pair of
> > >>> eyes that evaluates this KIP.
> > >>>
> > >>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-471:+Expose+RocksDB+Metrics+in+Kafka+Streams
> > >>>
> > >>> Best regards,
> > >>> Bruno

--
-- Guozhang
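
P.S. To make the hit-rate vs. hit-ratio distinction from my comments concrete, here is a minimal Python sketch. It assumes each physical instance exposes plain hit and miss counters over some sampling window; the names `CacheCounters`, `hit_rate`, and `hit_ratio` are hypothetical and are not part of RocksDB, Kafka Streams, or the KIP. It sums the counters across the physical instances of one logical store before dividing, which is only one possible aggregation and not necessarily what the KIP specifies.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class CacheCounters:
    """Hypothetical hit/miss counters of one physical store instance."""
    hits: int
    misses: int


def hit_rate(instances: Iterable[CacheCounters], window_seconds: float) -> float:
    """Hits per second over the sampling window (a rate, unbounded above)."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    return sum(c.hits for c in instances) / window_seconds


def hit_ratio(instances: Iterable[CacheCounters]) -> float:
    """Fraction of lookups served from the cache (a ratio in [0, 1]).

    Counters are summed over all instances first, so a busy instance
    weighs more than an idle one. Returns 0.0 when there were no lookups.
    """
    items: List[CacheCounters] = list(instances)
    hits = sum(c.hits for c in items)
    lookups = sum(c.hits + c.misses for c in items)
    return hits / lookups if lookups else 0.0


if __name__ == "__main__":
    # Two physical instances of one logical store, sampled over 10 seconds.
    segments = [CacheCounters(hits=900, misses=100), CacheCounters(hits=1, misses=9)]
    print("hit rate (hits/s):", hit_rate(segments, window_seconds=10.0))
    print("hit ratio:", hit_ratio(segments))
```

Summing before dividing means the result differs from averaging the per-instance ratios: in the example the second instance has a poor ratio but very little traffic, so it barely moves the aggregated ratio.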