Hive runs out of memory with a large number of partitions
---------------------------------------------------------

                 Key: HIVE-2575
                 URL: https://issues.apache.org/jira/browse/HIVE-2575
             Project: Hive
          Issue Type: Bug
            Reporter: Jonathan Chang


When a query needs to fetch a large number of partitions (say ~10k), it 
takes several minutes just to generate the query plan, and the client often 
runs out of memory.

A quick investigation shows that the partition pruner is relatively fast, 
but the actual fetch of the partitions is quite slow, with most of the time 
spent in DataNucleus-generated code.  It also looks like the amount of 
data that has to be pulled and stored for each Partition object is quite 
large.
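
For anyone who wants to reproduce the split between pruning time and fetch 
time, here is a minimal sketch of the kind of per-phase timing described 
above. It is not Hive code: prune_partitions and fetch_partition are 
hypothetical stand-ins for the partition pruner and the metastore fetch, and 
the numbers it prints say nothing about Hive itself. Real calls would have 
to be substituted for the stand-ins.

```python
import time
from contextlib import contextmanager

# Accumulated wall-clock seconds per phase name.
timings = {}


@contextmanager
def timed(phase):
    """Add the elapsed time of the enclosed block to timings[phase]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        timings[phase] = timings.get(phase, 0.0) + elapsed


def prune_partitions(all_names, predicate):
    # Hypothetical stand-in for the partition pruner: filters names only.
    return [name for name in all_names if predicate(name)]


def fetch_partition(name):
    # Hypothetical stand-in for fetching one full Partition object.
    # The payload is made up; it only mimics a bulky per-partition record.
    return {
        "name": name,
        "columns": ["col%d" % i for i in range(50)],
        "parameters": {},
    }


def profile_plan(all_names, predicate):
    """Time pruning and fetching separately and return the fetched objects."""
    with timed("prune"):
        names = prune_partitions(all_names, predicate)
    with timed("fetch"):
        partitions = [fetch_partition(name) for name in names]
    return partitions


# 20,000 candidate partitions, half of which survive pruning (~10k fetched).
all_names = ["ds=%05d" % i for i in range(20000)]
partitions = profile_plan(all_names, lambda name: int(name[3:]) % 2 == 0)

for phase, seconds in timings.items():
    print("%s: %.4fs" % (phase, seconds))
```

Keeping the two phases under separate timers is what lets the slow step be 
attributed to the fetch rather than to the pruner.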

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
