[ https://issues.apache.org/jira/browse/HIVE-19996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kevin Minder reassigned HIVE-19996:
-----------------------------------

    Assignee: Kevin Minder

> Beeline performance poor with drivers having slow DatabaseMetaData.getPrimaryKeys impl
> --------------------------------------------------------------------------------------
>
>                 Key: HIVE-19996
>                 URL: https://issues.apache.org/jira/browse/HIVE-19996
>             Project: Hive
>          Issue Type: Improvement
>          Components: Beeline
>    Affects Versions: 1.2.1
>         Environment: Issue detected using Beeline with HBase Phoenix thin driver and a result set with many columns.
>            Reporter: Kevin Minder
>            Assignee: Kevin Minder
>            Priority: Major
>         Attachments: HIVE-19996.1.patch
>
>
> Beeline performance is rather poor for table output format when two
> conditions occur for the same result set.
> # The result set has a large number of columns.
> # The driver being used has a slow implementation of
> DatabaseMetaData.getPrimaryKeys.
> Testing has shown that for a query with ~100 columns using the HBase
> Phoenix thin driver, the execution time can be cut from ~30 seconds to
> ~2 seconds by using CSV output format instead of table output format.
> For example:
> {{select * from system.catalog;}}
> This is due to how primary keys are detected. Currently the Rows
> implementation makes a metadata call for every column to determine whether
> it is a primary key for display purposes. I propose optimizing this so that
> a metadata call is made only once for each unique table in the result set's
> columns.
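
The proposed optimization can be sketched roughly as below. This is only an illustration of per-table caching and is not taken from the attached patch: `PrimaryKeyCache`, `KeyLoader`, and `fromMetaData` are hypothetical names, and using the bare table name as the cache key is an assumption (a real implementation may need to include catalog and schema in the key). The only JDBC call relied on is `DatabaseMetaData.getPrimaryKeys(catalog, schema, table)`, whose result set exposes a `COLUMN_NAME` column.

```java
import java.sql.DatabaseMetaData;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper: remembers the primary-key columns of each table. */
class PrimaryKeyCache {
    /** Hypothetical abstraction over the slow metadata call, so it can be stubbed. */
    interface KeyLoader {
        Set<String> load(String table) throws SQLException;
    }

    private final KeyLoader loader;
    // Assumption: the table name alone identifies the table.
    private final Map<String, Set<String>> keysByTable = new HashMap<>();

    PrimaryKeyCache(KeyLoader loader) {
        this.loader = loader;
    }

    /** Loader backed by the standard JDBC call DatabaseMetaData.getPrimaryKeys. */
    static KeyLoader fromMetaData(DatabaseMetaData meta, String catalog, String schema) {
        return table -> {
            Set<String> keys = new HashSet<>();
            try (ResultSet rs = meta.getPrimaryKeys(catalog, schema, table)) {
                while (rs.next()) {
                    keys.add(rs.getString("COLUMN_NAME"));
                }
            }
            return keys;
        };
    }

    /** Calls the loader at most once per table, however many columns are checked. */
    boolean isPrimaryKey(String table, String column) throws SQLException {
        Set<String> keys = keysByTable.get(table);
        if (keys == null) {
            keys = loader.load(table);
            keysByTable.put(table, keys);
        }
        return keys.contains(column);
    }
}
```

With a cache like this, a result set with ~100 columns drawn from a single table would trigger one `getPrimaryKeys` call instead of ~100. The table name for each column could come from `ResultSetMetaData.getTableName(int)`, assuming the driver populates it.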