[ 
https://issues.apache.org/jira/browse/HIVE-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siying Dong updated HIVE-2068:
------------------------------

    Status: Patch Available  (was: Open)

It looks like a simple "... limit ..." query depends on the order in which 
files are listed, which is not deterministic. I modified the test case to 
always load the same 3 files so that the results are deterministic.

> Speed up query "select xx,xx from xxx LIMIT xxx" if no filtering or 
> aggregation
> -------------------------------------------------------------------------------
>
>                 Key: HIVE-2068
>                 URL: https://issues.apache.org/jira/browse/HIVE-2068
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Siying Dong
>            Assignee: Siying Dong
>         Attachments: HIVE-2068.1.patch, HIVE-2068.2.patch, HIVE-2068.3.patch, 
> HIVE-2068.4.patch, HIVE-2068.5.patch, HIVE-2068.6.patch
>
>
> Currently, "select xx,xx from xxx where ...(only partition conditions) LIMIT 
> xxx" starts a MapReduce job whose input is the whole table or partition. 
> The latency can be huge if the table or partition is big. We could reduce 
> the number of input files to speed up such queries.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
