[ https://issues.apache.org/jira/browse/HIVE-9693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14372501#comment-14372501 ]
Alan Gates commented on HIVE-9693: ---------------------------------- HiveConf: METASTORE_HBASE_AGGREGATE_STATS_CACHE_MAX_VARIANCE 0.01 seems like a really low default for this. I was thinking we should start closer to 0.1 or 0.2. AggregateStatsCache: For these classes that won't be used outside of the o.a.h.h.metastore.hbase package I've been marking both classes and methods as package scoped instead of public. This way it's clear to people that this is not a public API. I'm nervous about the read/write lock on AggregateStatsList. I see why you're doing it, but I'm nervous that the readers could spend a long time in there comparing and block writers. One thing we could do is make it so that the readers only hold the lock for long enough to copy the list and then release the lock. Or we could add a timeout to the write lock so we bound the amount of time the writer will wait, and if it can't write in that amount of time it just drops it. Line 46, we know a priori how big the hash table can get from the config value, so we should use that in the construction of the hash table as that will avoid possibly expensive resizings later. findBestMatch() This should also check whether an AggregateStats entry is past it's ttl. If so, it should reject it even if it's a match. This avoids way old data getting used in the case where the cache is large relative to the system usage and the cleaner hasn't been kicked off recently. line 149, bogus javadoc, there's no param candidateList Why have both a cleaner thread and evictOneNode? It's simpler to spawn the cleaner and add the node, even if that puts the cache a few over. Then the cleaner should first remove all timed out nodes, and if it is still above the cleaning threshold it should start evicting nodes on an LRU basis. Actually, once it starts cleaning it should make sure it goes a ways below the threshold so it doesn't have to start cleaning again immediately. Thus I'd add a {{private float cleanUntil = 0.8f}} maxFull and cleanUntil should be configurable. You'll want it for testing purposes if nothing else. In evictOneNode, if I understand the javadoc on ConcurrentHashMap correctly you could get a key from the keyset that returns null on get, as some other thread could have already removed the key. Moving evictOneNode into the cleaner thread would get rid of this issue as well since it would guarantee only one thread was ever removing entries from the list. line 323, the cast is unnecessary. HBaseReadWrite line 1419, 1456 bogus javadoc, param name is tblName not tableName HBaseUtils You should break the ColumnStatsAggregator interface and implementing classes into separate classes. This is an area where we'll have more work to do (especially in combining NDVs) and it will be convenient to have it in separate code rather than letting this class get huge (or more huge, anyway). You could even put it in an hbase.aggregator package if you wanted. BitVector line 60, bogus javadoc, no param index BloomFilter Are there not existing implementations of a bloom filter? Do we have to implement one ourselves? I haven't reviewed the tests yet, as you said you were still adding a bunch. > Introduce a stats cache for HBase metastore [hbase-metastore branch] > --------------------------------------------------------------------- > > Key: HIVE-9693 > URL: https://issues.apache.org/jira/browse/HIVE-9693 > Project: Hive > Issue Type: Sub-task > Components: Metastore > Reporter: Vaibhav Gumashta > Assignee: Vaibhav Gumashta > Attachments: HIVE-9693.1.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)