[ https://issues.apache.org/jira/browse/HIVE-4160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Eric Hanson updated HIVE-4160:
------------------------------
Attachment: Hive-Vectorized-Query-Execution-Design-rev3.pdf
Adding a PDF of the design doc per request.
> Vectorized Query Execution in Hive
> ----------------------------------
>
> Key: HIVE-4160
> URL: https://issues.apache.org/jira/browse/HIVE-4160
> Project: Hive
> Issue Type: New Feature
> Reporter: Jitendra Nath Pandey
> Assignee: Jitendra Nath Pandey
> Attachments: Hive-Vectorized-Query-Execution-Design.docx,
> Hive-Vectorized-Query-Execution-Design-rev2.docx,
> Hive-Vectorized-Query-Execution-Design-rev3.docx,
> Hive-Vectorized-Query-Execution-Design-rev3.docx,
> Hive-Vectorized-Query-Execution-Design-rev3.pdf
>
>
> The Hive query execution engine currently processes one row at a time: a
> single row of data passes through all the operators before the next row can
> be processed. This mode of processing is very inefficient in terms of CPU
> usage. Research has demonstrated that it yields very low instructions per
> cycle [MonetDB]. In addition, Hive currently relies heavily on lazy
> deserialization, and data columns go through a layer of object inspectors
> that identify the column type, deserialize the data, and determine the
> appropriate expression routines in the inner loop. These layers of virtual
> method calls further slow down processing.
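
The contrast described above can be illustrated with a minimal Java sketch. It is not taken from the attached design document or from Hive's code base: the class, interface, and method names (VectorizedSketch, RowExpression, AddConstantRow, addConstantVectorized) are hypothetical and chosen only for illustration. The first path evaluates an expression once per row through an interface call with boxed values; the second applies the same operation to a batch of primitive column values in one tight loop.

```java
import java.util.Arrays;

public class VectorizedSketch {

    /** Row-at-a-time style (hypothetical): one virtual call per row per expression. */
    interface RowExpression {
        Object evaluate(Object[] row);
    }

    /** Hypothetical expression: column 0 plus a constant, operating on boxed values. */
    static final class AddConstantRow implements RowExpression {
        private final long constant;

        AddConstantRow(long constant) {
            this.constant = constant;
        }

        @Override
        public Object evaluate(Object[] row) {
            // The cast, unboxing, and re-boxing all happen inside the inner loop.
            return ((Long) row[0]) + constant;
        }
    }

    /** Drives the expression over the rows one at a time. */
    static long[] rowAtATime(Object[][] rows, RowExpression expr) {
        long[] out = new long[rows.length];
        for (int i = 0; i < rows.length; i++) {
            out[i] = (Long) expr.evaluate(rows[i]);
        }
        return out;
    }

    /**
     * Vectorized style (hypothetical): the column arrives as a primitive array,
     * and the type is resolved once, outside the loop.
     */
    static long[] addConstantVectorized(long[] column, int size, long constant) {
        long[] out = new long[size];
        for (int i = 0; i < size; i++) {
            out[i] = column[i] + constant;
        }
        return out;
    }

    public static void main(String[] args) {
        Object[][] rows = { {1L}, {2L}, {3L} };
        long[] viaRows = rowAtATime(rows, new AddConstantRow(10));
        long[] viaBatch = addConstantVectorized(new long[] {1, 2, 3}, 3, 10);
        System.out.println(Arrays.toString(viaRows));  // [11, 12, 13]
        System.out.println(Arrays.toString(viaBatch)); // [11, 12, 13]
    }
}
```

Both paths produce the same result. The difference is where the per-value overhead sits: the row path pays for an interface dispatch and a type cast on every row, while the batch path keeps the loop body to a single primitive addition.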