Are you running pre-0.12 or with hive.metastore.try.direct.sql = false; Work done on https://issues.apache.org/jira/browse/HIVE-4051 should alleviate some of your problems.
On Mon, May 25, 2015 at 8:19 PM, Sivaramakrishnan Narayanan < tarb...@gmail.com> wrote: > Apologies if this has been discussed in the past - my searches did not pull > up any relevant threads. If there are better solutions available out of the > box, please let me know! > > Problem statement > -------------------------- > > We have a setup where a single metastoredb is used by Hive, Presto and > SparkSQL. In addition, there are 1000s of hive queries submitted in batch > form from multiple machines. Oftentimes, the metastoredb ends up being > remote (in a different region in AWS etc) and round-trip latency is high. > We've seen single thrift calls getting translated into lots of small SQL > calls by datanucleus and the roundtrip latency ends up killing performance. > Furthermore, any of these systems may create / modify a hive table and this > should be reflected in the other system. Example, I may create a table in > hive and query it using Presto or vice versa. In our setup, there may be > multiple thrift metastore servers pointing to the same metastore db. > > Investigation > ------------------- > > Basically, we've been looking at caching to solve this problem (will come > to invalidation in a bit). I looked briefly at DN's support for caching - > these two parameters seem to be switched off by default. > > METASTORE_CACHE_LEVEL2("datanucleus.cache.level2", false), > METASTORE_CACHE_LEVEL2_TYPE("datanucleus.cache.level2.type", "none"), > > Furthermore, my reading of > http://www.datanucleus.org/products/datanucleus/jdo/cache.html suggests > that there is no sophistication in invalidation - seems like only > time-based invalidation is supported and it can't work across multiple PMFs > (therefore, multiple thrift metastore servers) > > Solution Outline > ----------------------- > > - Every table / partition will have an additional property called > 'version' > - Any call that modifies table or partition will bump up version of the > table / partition > - Guava based cache of thrift objects that come from metastore calls > - We fire a single SQL matching versions before returning from cache > - It is conceivable to have a mode wherein invalidation based on version > happens in a background thread (for higher performance, lower fidelity) > - Not proposing any locking (not shooting for world peace here :) ) > - We could extend HiveMetaStore class or create a new server altogether > > Is this something that would be interesting to the community? Is this > problem already solved and should I spend my time watching GoT instead? > > Thanks > Siva >