[ 
https://issues.apache.org/jira/browse/HIVE-19605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16484623#comment-16484623
 ] 

Todd Lipcon commented on HIVE-19605:
------------------------------------

It seems this table can also be queried as part of a get_table call. Oddly, 
the query being generated is:

SELECT 'org.apache.hadoop.hive.metastore.model.MTableColumnStatistics' AS NUCLEUS_TYPE,
  `A0`.`AVG_COL_LEN`,`A0`.`COLUMN_NAME`,`A0`.`COLUMN_TYPE`,`A0`.`DB_NAME`,
  `A0`.`BIG_DECIMAL_HIGH_VALUE`,`A0`.`BIG_DECIMAL_LOW_VALUE`,
  `A0`.`DOUBLE_HIGH_VALUE`,`A0`.`DOUBLE_LOW_VALUE`,`A0`.`LAST_ANALYZED`,
  `A0`.`LONG_HIGH_VALUE`,`A0`.`LONG_LOW_VALUE`,`A0`.`MAX_COL_LEN`,
  `A0`.`NUM_DISTINCTS`,`A0`.`NUM_FALSES`,`A0`.`NUM_NULLS`,`A0`.`NUM_TRUES`,
  `A0`.`TABLE_NAME`,`A0`.`CS_ID`
FROM `TAB_COL_STATS` `A0` WHERE `A0`.`DB_NAME` = '';

(note the empty db_name).

Given the lack of an index, this takes 450ms on the HMS instance I am testing 
(with the MySQL query cache disabled).
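
For illustration only, here is a minimal sketch of why the missing index matters. It uses Python's built-in sqlite3 module as a stand-in for the MySQL-backed metastore, so the plan output will not match MySQL's EXPLAIN. The cut-down table definition, the index name TAB_COL_STATS_IDX, and the lookup values are assumptions made for the example, not the actual metastore schema or fix. It compares the query plan for a lookup on (CAT_NAME, DB_NAME, TABLE_NAME) before and after adding a composite index.

```python
import sqlite3

# Hypothetical index name, chosen for this example only.
INDEX_NAME = "TAB_COL_STATS_IDX"


def query_plan(conn, sql, params=()):
    """Return the human-readable 'detail' column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]


conn = sqlite3.connect(":memory:")

# Cut-down stand-in for TAB_COL_STATS; the real table has many more columns.
conn.execute(
    """
    CREATE TABLE TAB_COL_STATS (
        CS_ID       INTEGER PRIMARY KEY,
        CAT_NAME    TEXT,
        DB_NAME     TEXT,
        TABLE_NAME  TEXT,
        COLUMN_NAME TEXT
    )
    """
)

# Lookup keyed on the tuple named in the issue description.
lookup = (
    "SELECT CS_ID, COLUMN_NAME FROM TAB_COL_STATS "
    "WHERE CAT_NAME = ? AND DB_NAME = ? AND TABLE_NAME = ?"
)
args = ("hive", "default", "t1")  # made-up values

# Without an index the only available plan is a full table scan.
plan_before = query_plan(conn, lookup, args)

conn.execute(
    f"CREATE INDEX {INDEX_NAME} "
    "ON TAB_COL_STATS (CAT_NAME, DB_NAME, TABLE_NAME)"
)

# With the composite index the planner can seek directly to matching rows.
plan_after = query_plan(conn, lookup, args)

print("before:", plan_before)
print("after: ", plan_after)
```

Note that a composite index is generally only usable when the query constrains its leading column(s), so a query that filters on DB_NAME alone, like the one above, may not benefit from an index that starts with CAT_NAME.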

> TAB_COL_STATS table has no index on db/table name
> -------------------------------------------------
>
>                 Key: HIVE-19605
>                 URL: https://issues.apache.org/jira/browse/HIVE-19605
>             Project: Hive
>          Issue Type: Bug
>          Components: Metastore
>            Reporter: Todd Lipcon
>            Priority: Major
>
> The TAB_COL_STATS table is missing an index on (CAT_NAME, DB_NAME, 
> TABLE_NAME). The getTableColumnStatistics call queries based on this tuple. 
> This makes those queries take a significant amount of time in large 
> metastores since they do a full table scan.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
