[ https://issues.apache.org/jira/browse/HIVE-6157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13875228#comment-13875228 ]
Sergey Shelukhin commented on HIVE-6157:
----------------------------------------

Patch coming today, barring any surprises.

> Fetching column stats slower than the 101 during rush hour
> ----------------------------------------------------------
>
>                 Key: HIVE-6157
>                 URL: https://issues.apache.org/jira/browse/HIVE-6157
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.13.0
>            Reporter: Gunther Hagleitner
>            Assignee: Sergey Shelukhin
>
> "hive.stats.fetch.column.stats" controls whether the column stats for a table
> are fetched during explain (in Tez: during query planning). On my setup (1
> table, 4000 partitions, 24 columns) the time spent in semantic analyze goes
> from ~1 second to ~66 seconds when turning the flag on. 65 seconds spent
> fetching column stats...
> The reason is probably that the APIs force you to make separate metastore
> calls for each column in each partition. That's probably the first thing that
> has to change. The question is if in addition to that we need to cache this
> in the client or store the stats as a single blob in the database to further
> cut down on the time. However, the way it stands right now column stats seem
> unusable.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
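
A rough illustration of the call pattern the description blames, using the reported figures (4000 partitions, 24 columns, about 65 seconds spent fetching stats). This is only a toy model: FakeMetastore and both of its methods are hypothetical stand-ins that count round trips, not the Hive metastore client API, and the batched call is an assumption about what a fix might look like rather than the actual patch.

```python
# Toy model only: FakeMetastore and its methods are hypothetical stand-ins,
# not the Hive metastore client API.

class FakeMetastore:
    """Counts round trips instead of talking to a real metastore."""

    def __init__(self):
        self.calls = 0

    def get_column_stats(self, partition, column):
        # One round trip per (partition, column) pair.
        self.calls += 1
        return {"partition": partition, "column": column}

    def get_column_stats_bulk(self, partitions, columns):
        # One round trip for the whole request (hypothetical batched call).
        self.calls += 1
        return [{"partition": p, "column": c} for p in partitions for c in columns]


partitions = [f"part={i}" for i in range(4000)]  # 4000 partitions, as reported
columns = [f"col{i}" for i in range(24)]         # 24 columns, as reported

# Current pattern described in the issue: one call per column per partition.
per_pair = FakeMetastore()
for p in partitions:
    for c in columns:
        per_pair.get_column_stats(p, c)

# Assumed alternative: a single batched request for the same data.
bulk = FakeMetastore()
stats = bulk.get_column_stats_bulk(partitions, columns)

# Average cost per call implied by the reported ~65 seconds of stats fetching.
ms_per_call = 65.0 / per_pair.calls * 1000

print(per_pair.calls, bulk.calls, round(ms_per_call, 2))
```

Under these assumptions the per-column, per-partition pattern needs 96,000 round trips, so the reported 65 seconds works out to well under a millisecond per call. That would suggest the sheer number of calls, not the cost of any single one, is what makes the total so large.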