[ https://issues.apache.org/jira/browse/HIVE-4478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13696152#comment-13696152 ]

Hudson commented on HIVE-4478:
------------------------------

Integrated in Hive-trunk-h0.21 #2168 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/2168/])
    HIVE-4478. In ORC remove ispresent stream from columns that contain no null 
values in a stripe. (Prasanth Jayachandran via omalley) (Revision 1497912)

     Result = FAILURE
omalley : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1497912
Files : 
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/orc/OutStream.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java
* /hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestFileDump.java
* /hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestOrcNullOptimization.java
* /hive/trunk/ql/src/test/resources/orc-file-dump.out

                
> In ORC, add boolean noNulls flag to column stripe metadata
> ----------------------------------------------------------
>
>                 Key: HIVE-4478
>                 URL: https://issues.apache.org/jira/browse/HIVE-4478
>             Project: Hive
>          Issue Type: Sub-task
>          Components: File Formats
>    Affects Versions: 0.12.0
>            Reporter: Eric Hanson
>            Assignee: Prasanth J
>             Fix For: 0.12.0
>
>         Attachments: HIVE-4478.1.patch.txt, HIVE-4478.2.git.patch.txt
>
>
> Currently, the stripe metadata for ORC contains the min and max value for 
> each column in the stripe. This will be used for stripe elimination. However, 
> an additional piece of metadata for each column in each stripe, noNulls 
> (true/false), is needed to help speed up vectorized query execution by as 
> much as 30%. 
> The vectorized QE code has a Boolean flag for each column vector called 
> noNulls. If this is true, all the null-checking logic is skipped for that 
> column for a VectorizedRowBatch when an operation is performed on that 
> column. For simple filters and arithmetic expressions, this can save on the 
> order of 30% of the time.
> Once this noNulls stripe metadata is available, the vectorized iterator 
> (reader) for ORC can be updated to skip loading the isNull bitmap entirely 
> and to set the noNulls flag for each column vector efficiently.
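
The noNulls fast path described above can be sketched roughly as follows. This
is a minimal illustration, not Hive's implementation: LongColumn, loadNulls and
addScalar are hypothetical names invented for this example, and the stripe-level
flag is passed in as a plain boolean rather than read from real ORC metadata.

```java
import java.util.Arrays;

// Hypothetical sketch; none of these names are taken from the Hive code base.
class NoNullsSketch {

    /** Simplified stand-in for a column vector holding long values. */
    static final class LongColumn {
        final long[] vector;     // the values
        final boolean[] isNull;  // per-row null bitmap, ignored when noNulls is true
        boolean noNulls = true;  // true means no row in this batch is null

        LongColumn(int size) {
            vector = new long[size];
            isNull = new boolean[size];
        }
    }

    /**
     * Assumed reader step: if the stripe metadata says the column has no nulls,
     * set the flag and skip the bitmap; otherwise copy the decoded bitmap.
     */
    static void loadNulls(LongColumn col, boolean stripeHasNoNulls, boolean[] decodedIsNull) {
        if (stripeHasNoNulls) {
            col.noNulls = true;  // bitmap is never touched
            return;
        }
        col.noNulls = false;
        System.arraycopy(decodedIsNull, 0, col.isNull, 0, col.isNull.length);
    }

    /** Adds a scalar to the first n rows, checking for nulls only when needed. */
    static void addScalar(LongColumn in, long scalar, LongColumn out, int n) {
        out.noNulls = in.noNulls;
        if (in.noNulls) {
            // Fast path: a tight loop with no per-row null check.
            for (int i = 0; i < n; i++) {
                out.vector[i] = in.vector[i] + scalar;
            }
        } else {
            // Slow path: propagate nulls row by row.
            for (int i = 0; i < n; i++) {
                out.isNull[i] = in.isNull[i];
                if (!in.isNull[i]) {
                    out.vector[i] = in.vector[i] + scalar;
                }
            }
        }
    }

    public static void main(String[] args) {
        LongColumn in = new LongColumn(3);
        in.vector[0] = 1;
        in.vector[1] = 2;
        in.vector[2] = 3;
        LongColumn out = new LongColumn(3);

        loadNulls(in, true, null);  // stripe reports no nulls, so no bitmap is needed
        addScalar(in, 10, out, 3);
        System.out.println(Arrays.toString(out.vector) + " noNulls=" + out.noNulls);
    }
}
```

The saving comes from two places in this sketch: the reader never materializes
the bitmap for a null-free stripe, and each operator runs the branch-free loop.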

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
