Hey!

We have a catalog with fairly a lot of databases and tables.

Where we do a simple query (select * from table limit 5;) on an ideal
cluster, it takes around 20seconds, sometimes longer (usually first run
takes 40s+)

Looking at the hive-metastore logs during most of the query time the logs
show "metastore.Hivemetastore: 13:  get_materialized_views_for_rewriting:
db = <Database name>" for each database. When these calls finish, the query
executes pretty quickly.

My interpretation of this is that most of the time is spent on analyzing
the metastore and building a query plan, perhaps some sort of Metastore
in-memory cache is building built  (but it happens on every call). But I am
not really sure how to debug it further, nor could find in the code where
this is happening.

Any advice on what's causing it or how to troubleshoot?

p.s
The metadata (i.e original tables and databases) is actually coming from a
very old Hive version (1.1), and we migrated it to the newest version of
the Metastore using the upgrade tool. In the original Hive version
materialized views were not even supported.

Cheers,
Eugene

-- 

*Eugene Miretsky*
Managing Partner |  Badal.io | Book a meeting /w me!
<http://calendly.com/eugene-badal>
mobile:  416-568-9245
email:     eug...@badal.io <zb...@badal.io>

Reply via email to