[
https://issues.apache.org/jira/browse/HIVE-7195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14030131#comment-14030131
]
Mithun Radhakrishnan commented on HIVE-7195:
--------------------------------------------
[~sershe]: I'm sorry, I haven't found the time to port my patch to 13 and raise
a JIRA. My work was primarily in the PartitionPruner code. It ensures
that {{listPartitions(db, table, -1)}} isn't called (during plan optimization)
when the query is metadata-only. I can post the 12-patch in a JIRA,
for whatever that's worth.
Incidentally, I've raised HIVE-7223 to discuss the idea of using
{{PartitionSpecs}}. [~alangates] suggested that we explore whether a PartitionSpec
abstraction could also represent lighter Partition-groups that share common
attributes (StorageDescs, etc.). I'm still thinking that through. (If only Thrift
supported polymorphism. :])
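To make the Partition-group idea a bit more concrete, here is a rough sketch of what grouping could look like. Everything in it is hypothetical: {{SimplePartition}} and {{PartitionGroup}} are simplified stand-ins invented for illustration, not the Thrift-generated metastore classes, and the two string fields merely approximate a shared StorageDesc. It is not what HIVE-7223 proposes, only one way the commonality might be factored out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PartitionGroupSketch {

    // Hypothetical, simplified partition: NOT the Thrift-generated Partition class.
    static final class SimplePartition {
        final String location;     // typically unique per partition
        final String inputFormat;  // typically shared by many partitions
        final String serde;        // typically shared by many partitions

        SimplePartition(String location, String inputFormat, String serde) {
            this.location = location;
            this.inputFormat = inputFormat;
            this.serde = serde;
        }
    }

    // Hypothetical lighter representation: shared attributes are stored once,
    // and only the per-partition locations are listed.
    static final class PartitionGroup {
        final String inputFormat;
        final String serde;
        final List<String> locations = new ArrayList<>();

        PartitionGroup(String inputFormat, String serde) {
            this.inputFormat = inputFormat;
            this.serde = serde;
        }
    }

    // Groups partitions whose (assumed) storage attributes are identical,
    // preserving the order in which each distinct combination first appears.
    static List<PartitionGroup> group(List<SimplePartition> partitions) {
        Map<List<String>, PartitionGroup> groups = new LinkedHashMap<>();
        for (SimplePartition p : partitions) {
            List<String> key = Arrays.asList(p.inputFormat, p.serde);
            groups.computeIfAbsent(key, k -> new PartitionGroup(p.inputFormat, p.serde))
                  .locations.add(p.location);
        }
        return new ArrayList<>(groups.values());
    }

    public static void main(String[] args) {
        List<SimplePartition> parts = Arrays.asList(
            new SimplePartition("/warehouse/t/ds=1", "orc", "OrcSerde"),
            new SimplePartition("/warehouse/t/ds=2", "orc", "OrcSerde"),
            new SimplePartition("/warehouse/t/ds=3", "text", "LazySimpleSerDe"));
        for (PartitionGroup g : group(parts)) {
            System.out.println(g.inputFormat + "/" + g.serde + " -> " + g.locations);
        }
    }
}
```

With many partitions that differ only in location, the shared attributes would be serialized once per group rather than once per partition.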
> Improve Metastore performance
> -----------------------------
>
> Key: HIVE-7195
> URL: https://issues.apache.org/jira/browse/HIVE-7195
> Project: Hive
> Issue Type: Improvement
> Reporter: Brock Noland
> Priority: Critical
>
> Even with direct SQL, which significantly improves MS performance, some
> operations take a considerable amount of time when there are many partitions
> on a table. Specifically, I believe the issues are:
> * When a client gets all partitions, we do not send it an iterator; we
> create a collection of all the data and then pass the whole object over the
> network.
> * Operations which require looking up data on the NN can still be slow, since
> there is no cache of that information and the lookups are done serially.
> * Perhaps a tangent, but our client timeout is quite dumb. The client will
> time out and the server has no idea the client is gone. We should use
> deadlines, i.e. pass the timeout to the server so it can calculate when the
> client has expired.
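
The deadline suggestion in the description's last bullet could look roughly like the sketch below. It is illustrative only: {{Deadline}} and {{processUntilDeadline}} are hypothetical names, not existing metastore APIs, and the clock is injected as a {{LongSupplier}} purely so the behaviour can be exercised without real waiting. The assumption is that the client sends its timeout with the request and the server checks it between units of work.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

public class DeadlineSketch {

    // Hypothetical server-side deadline, derived from a client-supplied timeout.
    static final class Deadline {
        private final long expiresAtNanos;

        Deadline(long timeoutMillis, long nowNanos) {
            this.expiresAtNanos = nowNanos + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        }

        // Subtraction-based comparison stays correct if nanoTime values wrap around.
        boolean expired(long nowNanos) {
            return nowNanos - expiresAtNanos >= 0;
        }
    }

    // Hypothetical work loop: handles items one at a time and stops as soon as
    // the client's deadline has passed, instead of finishing work nobody awaits.
    // Returns the number of items processed.
    static int processUntilDeadline(int totalItems, Deadline deadline, LongSupplier clockNanos) {
        int done = 0;
        while (done < totalItems && !deadline.expired(clockNanos.getAsLong())) {
            done++; // placeholder for one unit of real work, e.g. loading one partition
        }
        return done;
    }

    public static void main(String[] args) {
        Deadline deadline = new Deadline(50, System.nanoTime());
        int processed = processUntilDeadline(1000, deadline, System::nanoTime);
        System.out.println("processed " + processed + " items before the deadline check stopped the loop");
    }
}
```

Checking between units of work keeps the sketch simple; it does not interrupt a single long-running call, which would need a separate mechanism.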