Christoph Lipka created HIVE-10891:
--------------------------------------

             Summary: Limited fetch on partitioned table can eat up all heap
                 Key: HIVE-10891
                 URL: https://issues.apache.org/jira/browse/HIVE-10891
             Project: Hive
          Issue Type: Bug
          Components: Physical Optimizer
    Affects Versions: 1.1.0
            Reporter: Christoph Lipka


When running a query like
{code}
select *
from partitioned_table
where not_the_partition_key_column = "xyz"
limit 100
{code}
it is executed in memory. For all but the smallest tables this quickly 
consumes the entire heap and crashes the server.
If the limit clause is omitted, an mr-job is started and the query is executed 
without memory issues. One can also work around this problem by extending the 
query with a condition on the partition key, for example
{code}
select *
from partitioned_table a
where a.not_the_partition_key_column = "xyz"
and a.partition_key_column = (select b.partition_key_column from partitioned_table b)
limit 100
{code}
In this case Hive also creates an mr-job.
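Another possible workaround, not verified in this report, assumes that the 
in-memory execution comes from the conversion of the limited query into a 
local fetch task. If that assumption holds, disabling fetch task conversion 
for the session should make Hive compile the original query to an mr-job 
as well:
{code}
-- Assumption: the limit query is being converted to a local fetch task.
-- 'none' disables fetch task conversion for the current session.
set hive.fetch.task.conversion=none;

select *
from partitioned_table
where not_the_partition_key_column = "xyz"
limit 100;
{code}
Running explain on the query before and after changing the setting should 
show whether a map-reduce stage is planned or only a fetch stage.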



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
