[jira] [Commented] (HIVE-2213) Optimize get_partition_names_ps()

[email protected] (JIRA) Fri, 10 Jun 2011 14:04:53 -0700

    [ 
https://issues.apache.org/jira/browse/HIVE-2213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13047452#comment-13047452
 ]

[email protected] commented on HIVE-2213:
-----------------------------------------------------

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/878/#review804
-----------------------------------------------------------

You can do this here or in a separate JIRA, but can you update 
get_partitions_ps() using a similar technique?

trunk/common/src/java/org/apache/hadoop/hive/common/FileUtils.java
<https://reviews.apache.org/r/878/#comment1753>

    Can you refactor with the above function since they are similar?

trunk/common/src/java/org/apache/hadoop/hive/common/FileUtils.java
<https://reviews.apache.org/r/878/#comment1754>

    Same here

trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java
<https://reviews.apache.org/r/878/#comment1755>

    To be consistent with the other method, maybe call this 
listPartitionNamesPs?

trunk/metastore/src/java/org/apache/hadoop/hive/metastore/Warehouse.java
<https://reviews.apache.org/r/878/#comment1756>

    Combine with above

- Paul

On 2011-06-10 07:05:56, Sohan Jain wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/878/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-06-10 07:05:56)
bq.  
bq.  
bq.  Review request for hive and Paul Yang.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  If a table has a large number of partitions, get_partition_names_ps() may 
take a long time to execute, because we get all of the partition names from the 
database. This is not very memory efficient, and the operation can be pushed 
down to the JDO layer without getting all of the names first.
bq.  
bq.  
bq.  This addresses bug HIVE-2213.
bq.      https://issues.apache.org/jira/browse/HIVE-2213
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    trunk/common/src/java/org/apache/hadoop/hive/common/FileUtils.java 1134205 
bq.    trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 1134205 
bq.    trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 1134205 
bq.    trunk/metastore/src/java/org/apache/hadoop/hive/metastore/RawStore.java 1134205 
bq.    trunk/metastore/src/java/org/apache/hadoop/hive/metastore/Warehouse.java 1134205 
bq.    trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java 1134205 
bq.  
bq.  Diff: https://reviews.apache.org/r/878/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  Passes previous test cases for get_partition_names_ps() in 
TestHiveMetaStore.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Sohan
bq.  
bq.

> Optimize get_partition_names_ps()
> ---------------------------------
>
>                 Key: HIVE-2213
>                 URL: https://issues.apache.org/jira/browse/HIVE-2213
>             Project: Hive
>          Issue Type: Improvement
>          Components: Metastore
>            Reporter: Sohan Jain
>            Assignee: Sohan Jain
>         Attachments: HIVE-2213.1.patch
>
>
> If a table has a large number of partitions, get_partition_names_ps() may 
> take a long time to execute, because we get all of the partition names from 
> the database.  This is not very memory efficient, and the operation can be 
> pushed down to the JDO layer without getting all of the names first.
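
To illustrate the idea in the description, here is a minimal sketch (not the code in HIVE-2213.1.patch) of turning a partial partition specification into a single name pattern, so that the match could be evaluated by the store instead of being applied after every partition name has been loaded. The class PartialSpecMatcher, its methods, and the "[^/]*" wildcard are hypothetical names chosen for this example. The sketch assumes partition names have the form key=value/key=value, and it skips the path escaping that real partition names go through. The in-memory filter() only stands in for the datastore query, and the quoting produced by Pattern.quote is Java-specific, so a real datastore may need a different escaping scheme.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of Hive.
class PartialSpecMatcher {

    // Builds a regex for partition names of the form "k1=v1/k2=v2".
    // An empty or missing value in partialVals matches any value for that key.
    static String buildPattern(List<String> partKeys, List<String> partialVals) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < partKeys.size(); i++) {
            if (i > 0) {
                sb.append('/');
            }
            sb.append(Pattern.quote(partKeys.get(i))).append('=');
            String val = i < partialVals.size() ? partialVals.get(i) : "";
            sb.append(val.isEmpty() ? "[^/]*" : Pattern.quote(val));
        }
        return sb.toString();
    }

    // Stand-in for the pushed-down query: keeps names that match the pattern,
    // stopping once max names have been collected (a negative max means no limit).
    static List<String> filter(List<String> names, String regex, int max) {
        Pattern pattern = Pattern.compile(regex);
        List<String> out = new ArrayList<>();
        for (String name : names) {
            if (max >= 0 && out.size() >= max) {
                break;
            }
            if (pattern.matcher(name).matches()) {
                out.add(name);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("ds", "hr");
        List<String> names = Arrays.asList(
            "ds=2011-06-10/hr=11", "ds=2011-06-10/hr=12", "ds=2011-06-09/hr=11");
        // Partial spec: ds is fixed, hr is left unspecified.
        String regex = buildPattern(keys, Arrays.asList("2011-06-10"));
        System.out.println(filter(names, regex, -1));
    }
}
```

In a store that supports string matching in its query language, the pattern would be passed as a query parameter, so only the matching names are returned to the caller.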

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
