One more thing for this KIP: currently RocksDBWindowStore serializes the key/value before putting it into the in-memory cache. I think we should defer this serialization/deserialization until the entry actually needs to be flushed to the db. For a simple countByKey over 100 records, this triggers 100 serializations/deserializations even if everything stays in memory.
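
To make the idea concrete, here is a rough sketch of what I mean, not the actual RocksDBWindowStore code. The class name DeferredSerializationCache, its methods, and the list standing in for RocksDB are all made up for illustration; the only point is that put/get work on objects and the serializers run only in flush().

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: a cache that keeps deserialized objects in memory
// and serializes them only when they are flushed to the backing store.
public class DeferredSerializationCache<K, V> {
    private final Map<K, V> dirty = new LinkedHashMap<>();
    private final Function<K, byte[]> keySerializer;
    private final Function<V, byte[]> valueSerializer;
    // Stand-in for the underlying db (assumption: not a real RocksDB handle).
    private final List<byte[][]> backingStore = new ArrayList<>();
    private int serializationCalls = 0;

    public DeferredSerializationCache(Function<K, byte[]> keySerializer,
                                      Function<V, byte[]> valueSerializer) {
        this.keySerializer = keySerializer;
        this.valueSerializer = valueSerializer;
    }

    // Reads and writes touch only in-memory objects; no serializer is invoked.
    public void put(K key, V value) { dirty.put(key, value); }

    public V get(K key) { return dirty.get(key); }

    // Serialization happens here, once per dirty entry.
    public void flush() {
        for (Map.Entry<K, V> e : dirty.entrySet()) {
            byte[] k = keySerializer.apply(e.getKey());
            byte[] v = valueSerializer.apply(e.getValue());
            serializationCalls += 2; // one key, one value
            backingStore.add(new byte[][] {k, v});
        }
        dirty.clear();
    }

    public int serializationCalls() { return serializationCalls; }

    public int flushedEntries() { return backingStore.size(); }

    public static void main(String[] args) {
        DeferredSerializationCache<String, Long> cache = new DeferredSerializationCache<>(
            s -> s.getBytes(StandardCharsets.UTF_8),
            n -> Long.toString(n).getBytes(StandardCharsets.UTF_8));

        // Count 100 records that all share the same key.
        for (int i = 0; i < 100; i++) {
            Long current = cache.get("a");
            cache.put("a", current == null ? 1L : current + 1);
        }
        System.out.println(cache.serializationCalls()); // 0: nothing serialized yet

        cache.flush();
        System.out.println(cache.serializationCalls()); // 2: one key and one value
    }
}
```

In this toy version the 100 updates to a single key cost two serializer calls in total, instead of one round trip per record.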
If we move this internal cache from RocksDBStore to a global place, I hope we can reduce the time spent on serialization.

On Mon, Jun 6, 2016 at 11:07 AM, Ismael Juma <ism...@juma.me.uk> wrote:

> On Mon, Jun 6, 2016 at 6:48 PM, Guozhang Wang <wangg...@gmail.com> wrote:
> >
> > About using Instrumentation.getObjectSize, yeah we were worried a lot about
> > its efficiency as well as accuracy when discussing internally, but not a
> > better solution was proposed. So if people have better ideas, please throw
> > them here, as it is also the purpose for us to call out such KIP discussion
> > threads.
> >
>
> Note that this requires a Java agent to be configured. A few links:
>
> https://github.com/apache/spark/blob/b0ce0d13127431fa7cd4c11064762eb0b12e3436/core/src/main/scala/org/apache/spark/util/SizeEstimator.scala
> https://github.com/apache/cassandra/blob/3dcbe90e02440e6ee534f643c7603d50ca08482b/src/java/org/apache/cassandra/utils/ObjectSizes.java
> https://github.com/jbellis/jamm
> http://openjdk.java.net/projects/code-tools/jol/
> https://github.com/DimitrisAndreou/memory-measurer
>
> OK, maybe that's more than what you wanted. :)
>
> Ismael