[jira] [Commented] (HIVE-7195) Improve Metastore performance

Mithun Radhakrishnan (JIRA) Wed, 11 Jun 2014 16:01:33 -0700

    [ 
https://issues.apache.org/jira/browse/HIVE-7195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14028545#comment-14028545
 ]


Mithun Radhakrishnan commented on HIVE-7195:
--------------------------------------------

[~sershe]: listPartitions() and related calls do have a max_parts parameter. 
I'm exploring the possibility of reducing the Thrift traffic for partition 
operations, for a given number of partitions. That would free us up to transfer 
metadata for more partitions, without fear of the metastore keeling over from 
heap fragmentation, etc.

One way of doing that is to reduce redundancy when specifying multiple 
partitions. Abstracting how partitions are specified makes it possible to vary 
and extend this.
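
To make the idea concrete, here is a minimal sketch of one possible way to cut 
redundancy: state the fields that all partitions share once, and send only what 
differs per partition. The comment above does not say which fields would be 
shared; the common root location, the class names (FullPartition, 
SharedRootSpec) and the example path are all assumptions for illustration, and 
none of this is the metastore's actual Thrift API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: these types are illustrative, NOT the Hive metastore Thrift API.
final class PartitionSpecSketch {

    // Verbose form: database, table and full location are repeated per partition.
    static final class FullPartition {
        final String dbName;
        final String tableName;
        final String location;
        final List<String> values;

        FullPartition(String dbName, String tableName, String location, List<String> values) {
            this.dbName = dbName;
            this.tableName = tableName;
            this.location = location;
            this.values = values;
        }
    }

    // Compact form: shared fields appear once; each partition keeps only what differs.
    static final class SharedRootSpec {
        final String dbName;
        final String tableName;
        final String rootLocation;
        final List<List<String>> values = new ArrayList<List<String>>();
        final List<String> relativePaths = new ArrayList<String>();

        SharedRootSpec(String dbName, String tableName, String rootLocation) {
            this.dbName = dbName;
            this.tableName = tableName;
            this.rootLocation = rootLocation;
        }
    }

    // Factor the shared database, table and root location out of a partition list.
    static SharedRootSpec compact(String rootLocation, List<FullPartition> parts) {
        if (parts.isEmpty()) {
            throw new IllegalArgumentException("no partitions given");
        }
        FullPartition first = parts.get(0);
        SharedRootSpec spec = new SharedRootSpec(first.dbName, first.tableName, rootLocation);
        String prefix = rootLocation + "/";
        for (FullPartition p : parts) {
            if (!p.dbName.equals(spec.dbName) || !p.tableName.equals(spec.tableName)
                    || !p.location.startsWith(prefix)) {
                throw new IllegalArgumentException("partition does not share the common fields: "
                        + p.location);
            }
            spec.values.add(p.values);
            spec.relativePaths.add(p.location.substring(prefix.length()));
        }
        return spec;
    }

    // Rebuild the verbose form on the receiving side.
    static List<FullPartition> expand(SharedRootSpec spec) {
        List<FullPartition> parts = new ArrayList<FullPartition>();
        for (int i = 0; i < spec.relativePaths.size(); i++) {
            parts.add(new FullPartition(spec.dbName, spec.tableName,
                    spec.rootLocation + "/" + spec.relativePaths.get(i), spec.values.get(i)));
        }
        return parts;
    }

    public static void main(String[] args) {
        String root = "hdfs://nn/warehouse/db1.db/t1"; // made-up path for illustration
        List<FullPartition> parts = Arrays.asList(
                new FullPartition("db1", "t1", root + "/dt=2014-06-10", Arrays.asList("2014-06-10")),
                new FullPartition("db1", "t1", root + "/dt=2014-06-11", Arrays.asList("2014-06-11")));
        SharedRootSpec spec = compact(root, parts);
        System.out.println(spec.relativePaths);
        System.out.println(expand(spec).get(1).location);
    }
}
```

Because callers only see compact() and expand(), the wire representation could 
later be swapped for a different one without touching them, which is the point 
of abstracting how partitions are specified.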

> Improve Metastore performance
> -----------------------------
>
>                 Key: HIVE-7195
>                 URL: https://issues.apache.org/jira/browse/HIVE-7195
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Brock Noland
>            Priority: Critical
>
> Even with direct SQL, which significantly improves MS performance, some 
> operations take a considerable amount of time when there are many partitions 
> on a table. Specifically, I believe the issues are:
> * When a client gets all partitions, we do not send it an iterator; we 
> create a collection of all the data and then pass the whole object over the 
> network
> * Operations which require looking up data on the NN can still be slow since 
> there is no cache of information and it's done in a serial fashion
> * Perhaps a tangent, but our client timeout is quite dumb. The client will 
> time out and the server has no idea the client is gone. We should use 
> deadlines, i.e. pass the timeout to the server so it can calculate when the 
> client has expired.
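
The deadline idea in the last bullet can be sketched roughly as follows. This 
is only an illustration under assumptions: the names (Deadline, 
fetchUntilDeadline) are hypothetical, the clock is injected so the behaviour 
can be checked deterministically, and nothing here reflects how the metastore 
actually handles timeouts.

```java
import java.util.function.LongSupplier;

// Hypothetical sketch of deadline propagation; not the metastore's timeout handling.
final class DeadlineSketch {

    // The server turns the timeout sent by the client into an absolute deadline.
    static final class Deadline {
        private final long expiresAtNanos;

        Deadline(long nowNanos, long clientTimeoutMillis) {
            this.expiresAtNanos = nowNanos + clientTimeoutMillis * 1000000L;
        }

        boolean isExpired(long nowNanos) {
            // Subtraction keeps the comparison valid even if nanoTime wraps around.
            return nowNanos - expiresAtNanos >= 0;
        }
    }

    // Does the work one item at a time and stops as soon as the deadline has passed.
    // Returns the number of items completed; a real server would presumably abort
    // the request and release its resources at that point.
    static int fetchUntilDeadline(int totalItems, Deadline deadline,
                                  LongSupplier nanoClock, Runnable fetchOne) {
        int done = 0;
        while (done < totalItems && !deadline.isExpired(nanoClock.getAsLong())) {
            fetchOne.run();
            done++;
        }
        return done;
    }

    public static void main(String[] args) {
        // Simulated clock: each fetch "costs" 40 ms against a 100 ms client timeout.
        final long[] now = {0L};
        LongSupplier clock = () -> now[0];
        Deadline deadline = new Deadline(clock.getAsLong(), 100);
        int done = fetchUntilDeadline(10, deadline, clock, () -> now[0] += 40000000L);
        System.out.println("items completed before the deadline: " + done);
    }
}
```

In a real server the clock would be System::nanoTime; the point is only that 
the server, knowing the client's timeout, can stop working on a request whose 
client has already given up.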



--
This message was sent by Atlassian JIRA
(v6.2#6252)
