Christoph Lipka created HIVE-10891:
--------------------------------------

             Summary: Limited fetch on partitioned table can eat up all heap
                 Key: HIVE-10891
                 URL: https://issues.apache.org/jira/browse/HIVE-10891
             Project: Hive
          Issue Type: Bug
          Components: Physical Optimizer
    Affects Versions: 1.1.0
            Reporter: Christoph Lipka

When running a query like
{code}
select * from partitioned_table where not_the_partition_key_column = "xyz" limit 100
{code}
the query is executed in memory. For all but the smallest tables, this quickly consumes the entire heap and crashes the server. If the limit clause is omitted, an MR job is started and the query runs without memory issues.

One can also work around this problem by extending the where clause to also reference the partition key, like
{code}
select * from partitioned_table a where a.not_the_partition_key_column = "xyz" and a.partition_key_column = (select b.partition_key_column from partitioned_table b) limit 100
{code}
In this case Hive also creates an MR job.
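
A possible alternative workaround, not verified against this report: assuming the in-memory execution comes from Hive's fetch task conversion (the hive.fetch.task.conversion setting, which accepts none, minimal and more), restricting that setting for the session should make Hive start an MR job for this query. A minimal sketch, reusing the table and column names from the queries above:
{code}
-- Assumption: the heap usage is caused by fetch task conversion.
-- 'minimal' restricts conversion to SELECT *, filters on partition columns, and LIMIT,
-- so a filter on a non-partition column should no longer be run as a fetch task.
set hive.fetch.task.conversion=minimal;

-- Inspect the plan to see whether the query is still planned as a fetch-only
-- stage or now includes a map reduce stage.
explain select * from partitioned_table where not_the_partition_key_column = "xyz" limit 100;
{code}
Setting the value to none would disable the conversion entirely, at the cost of starting an MR job even for trivial queries.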