[jira] [Updated] (HIVE-4160) Vectorized Query Execution in Hive

Jitendra Nath Pandey (JIRA) Mon, 18 Mar 2013 16:41:16 -0700

     [ 
https://issues.apache.org/jira/browse/HIVE-4160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Jitendra Nath Pandey updated HIVE-4160:
---------------------------------------

    Attachment: Hive-Vectorized-Query-Execution-Design.docx

The attached document outlines the design. Comments and feedback are welcome.
We will keep updating the document with more detail as we add more data types,
operators and expressions. We will also add the vectorized iterator design to
the document.
                
> Vectorized Query Execution in Hive
> ----------------------------------
>
>                 Key: HIVE-4160
>                 URL: https://issues.apache.org/jira/browse/HIVE-4160
>             Project: Hive
>          Issue Type: New Feature
>            Reporter: Jitendra Nath Pandey
>            Assignee: Jitendra Nath Pandey
>         Attachments: Hive-Vectorized-Query-Execution-Design.docx
>
>
>   The Hive query execution engine currently processes one row at a time. A
> single row of data goes through all the operators before the next row can be
> processed. This mode of processing is very inefficient in terms of CPU usage.
> Research has demonstrated that it yields very low instructions per cycle
> [MonetDB]. Hive also currently relies heavily on lazy deserialization, and
> data columns go through a layer of object inspectors that identify the column
> type, deserialize the data and determine the appropriate expression routines
> in the inner loop. These layers of virtual method calls further slow down
> processing.
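
To make the contrast in the description concrete, here is a minimal Java sketch of the two processing styles. It is not taken from the attached design document: the class name VectorizedSketch, the RowExpression interface, the method names and the batch size of 1024 are all hypothetical and chosen only for illustration. The row-at-a-time path pays one interface call plus unboxing per row, while the batch path runs a tight loop over primitive arrays.

```java
final class VectorizedSketch {
    // Assumed batch size, for illustration only.
    static final int BATCH_SIZE = 1024;

    // Hypothetical per-row expression: one virtual call per row.
    interface RowExpression {
        long evaluate(Object[] row);
    }

    // Row-at-a-time: each row flows through the expression individually,
    // with boxed values and an interface call in the inner loop.
    static long sumRowAtATime(Object[][] rows, RowExpression expr) {
        long total = 0;
        for (Object[] row : rows) {
            total += expr.evaluate(row);
        }
        return total;
    }

    // Batch-at-a-time: add n values of two primitive columns, starting at
    // offset, into a reusable output buffer. No per-row calls or boxing.
    static void addBatch(long[] a, long[] b, int offset, int n, long[] out) {
        for (int i = 0; i < n; i++) {
            out[i] = a[offset + i] + b[offset + i];
        }
    }

    // Drive the batch expression over whole columns, BATCH_SIZE rows at a time.
    static long sumVectorized(long[] a, long[] b) {
        long[] out = new long[BATCH_SIZE];
        long total = 0;
        for (int offset = 0; offset < a.length; offset += BATCH_SIZE) {
            int n = Math.min(BATCH_SIZE, a.length - offset);
            addBatch(a, b, offset, n, out);
            for (int i = 0; i < n; i++) {
                total += out[i];
            }
        }
        return total;
    }

    public static void main(String[] args) {
        int n = 3000;
        long[] a = new long[n];
        long[] b = new long[n];
        Object[][] rows = new Object[n][];
        for (int i = 0; i < n; i++) {
            a[i] = i;
            b[i] = 2L * i;
            rows[i] = new Object[] {Long.valueOf(a[i]), Long.valueOf(b[i])};
        }
        // Both paths compute the same sum of (a + b) over all rows.
        long rowSum = sumRowAtATime(rows, r -> ((Long) r[0]) + ((Long) r[1]));
        long vecSum = sumVectorized(a, b);
        System.out.println(rowSum == vecSum);
    }
}
```

Both methods return the same result; the difference is only in how much per-row overhead sits in the inner loop, which is the cost the issue description attributes to the current engine.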

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
