[jira] [Commented] (HIVE-11394) Enhance EXPLAIN display for vectorization

Hive QA (JIRA) Fri, 23 Dec 2016 03:09:53 -0800

    [ 
https://issues.apache.org/jira/browse/HIVE-11394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15772567#comment-15772567
 ]


Hive QA commented on HIVE-11394:
--------------------------------



Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12844550/HIVE-11394.094.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2715/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2715/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2715/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2016-12-23 11:08:58.263
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-2715/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2016-12-23 11:08:58.266
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 7befe8e HIVE-15487: LLAP: Improvements to random selection while 
scheduling
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 7befe8e HIVE-15487: LLAP: Improvements to random selection while 
scheduling
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2016-12-23 11:08:59.249
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
error: patch failed: 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java:2479
error: 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java: patch 
does not apply
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12844550 - PreCommit-HIVE-Build

> Enhance EXPLAIN display for vectorization
> -----------------------------------------
>
>                 Key: HIVE-11394
>                 URL: https://issues.apache.org/jira/browse/HIVE-11394
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>            Reporter: Matt McCline
>            Assignee: Matt McCline
>            Priority: Critical
>             Fix For: 2.2.0
>
>         Attachments: HIVE-11394.01.patch, HIVE-11394.02.patch, 
> HIVE-11394.03.patch, HIVE-11394.04.patch, HIVE-11394.05.patch, 
> HIVE-11394.06.patch, HIVE-11394.07.patch, HIVE-11394.08.patch, 
> HIVE-11394.09.patch, HIVE-11394.091.patch, HIVE-11394.092.patch, 
> HIVE-11394.093.patch, HIVE-11394.094.patch
>
>
> Add detail to the EXPLAIN output showing why a Map and Reduce work is not 
> vectorized.
> New syntax is: EXPLAIN VECTORIZATION \[ONLY\] 
> \[SUMMARY|OPERATOR|EXPRESSION|DETAIL\]
> The ONLY option suppresses most non-vectorization elements.
> SUMMARY shows vectorization information for the PLAN (is vectorization 
> enabled) and a summary of Map and Reduce work.
> OPERATOR shows vectorization information for operators.  E.g. Filter 
> Vectorization.  It includes all information of SUMMARY, too.
> EXPRESSION shows vectorization information for expressions.  E.g. 
> predicateExpression.  It includes all information of SUMMARY and OPERATOR, 
> too.
> DETAIL shows very vectorization information.
> It includes all information of SUMMARY, OPERATOR, and EXPRESSION too.
> The optional clause defaults are not ONLY and SUMMARY.
> ---------------------------------------------------------------------------------------------------
> Here are some examples:
> EXPLAIN VECTORIZATION example:
> (Note the PLAN VECTORIZATION, Map Vectorization, Reduce Vectorization 
> sections)
> Since SUMMARY is the default, it is the output of EXPLAIN VECTORIZATION 
> SUMMARY.
> Under Reducer 3’s "Reduce Vectorization:" you’ll see
> notVectorizedReason: Aggregation Function UDF avg parameter expression for 
> GROUPBY operator: Data type struct<count:bigint,sum:double,input:int> of 
> Column\[VALUE._col2\] not supported
> For Reducer 2’s "Reduce Vectorization:" you’ll see "groupByVectorOutput:": 
> "false" which says a node has a GROUP BY with an AVG or some other aggregator 
> that outputs a non-PRIMITIVE type (e.g. STRUCT) and all downstream operators 
> are row-mode.  I.e. not vector output.
> If "usesVectorUDFAdaptor:": "false" were true, it would say there was at 
> least one vectorized expression is using VectorUDFAdaptor.
> And, "allNative:": "false" will be true when all operators are native.  
> Today, GROUP BY and FILE SINK are not native.  MAP JOIN and REDUCE SINK are 
> conditionally native.  FILTER and SELECT are native.
> {code}
> PLAN VECTORIZATION:
>   enabled: true
>   enabledConditionsMet: [hive.vectorized.execution.enabled IS true]
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-1
>     Tez
> ...
>       Edges:
>         Reducer 2 <- Map 1 (SIMPLE_EDGE)
>         Reducer 3 <- Reducer 2 (SIMPLE_EDGE)
> ...
>       Vertices:
>         Map 1 
>             Map Operator Tree:
>                 TableScan
>                   alias: alltypesorc
>                   Statistics: Num rows: 12288 Data size: 36696 Basic stats: 
> COMPLETE Column stats: COMPLETE
>                   Select Operator
>                     expressions: cint (type: int)
>                     outputColumnNames: cint
>                     Statistics: Num rows: 12288 Data size: 36696 Basic stats: 
> COMPLETE Column stats: COMPLETE
>                     Group By Operator
>                       keys: cint (type: int)
>                       mode: hash
>                       outputColumnNames: _col0
>                       Statistics: Num rows: 5775 Data size: 17248 Basic 
> stats: COMPLETE Column stats: COMPLETE
>                       Reduce Output Operator
>                         key expressions: _col0 (type: int)
>                         sort order: +
>                         Map-reduce partition columns: _col0 (type: int)
>                         Statistics: Num rows: 5775 Data size: 17248 Basic 
> stats: COMPLETE Column stats: COMPLETE
>             Execution mode: vectorized, llap
>             LLAP IO: all inputs
>             Map Vectorization:
>                 enabled: true
>                 enabledConditionsMet: 
> hive.vectorized.use.vectorized.input.format IS true
>                 groupByVectorOutput: true
>                 inputFileFormats: 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
>                 allNative: false
>                 usesVectorUDFAdaptor: false
>                 vectorized: true
>         Reducer 2 
>             Execution mode: vectorized, llap
>             Reduce Vectorization:
>                 enabled: true
>                 enableConditionsMet: hive.vectorized.execution.reduce.enabled 
> IS true, hive.execution.engine tez IN [tez, spark] IS true
>                 groupByVectorOutput: false
>                 allNative: false
>                 usesVectorUDFAdaptor: false
>                 vectorized: true
>             Reduce Operator Tree:
>               Group By Operator
>                 keys: KEY._col0 (type: int)
>                 mode: mergepartial
>                 outputColumnNames: _col0
>                 Statistics: Num rows: 5775 Data size: 17248 Basic stats: 
> COMPLETE Column stats: COMPLETE
>                 Group By Operator
>                   aggregations: sum(_col0), count(_col0), avg(_col0), 
> std(_col0)
>                   mode: hash
>                   outputColumnNames: _col0, _col1, _col2, _col3
>                   Statistics: Num rows: 1 Data size: 172 Basic stats: 
> COMPLETE Column stats: COMPLETE
>                   Reduce Output Operator
>                     sort order: 
>                     Statistics: Num rows: 1 Data size: 172 Basic stats: 
> COMPLETE Column stats: COMPLETE
>                     value expressions: _col0 (type: bigint), _col1 (type: 
> bigint), _col2 (type: struct<count:bigint,sum:double,input:int>), _col3 
> (type: struct<count:bigint,sum:double,variance:double>)
>         Reducer 3 
>             Execution mode: llap
>             Reduce Vectorization:
>                 enabled: true
>                 enableConditionsMet: hive.vectorized.execution.reduce.enabled 
> IS true, hive.execution.engine tez IN [tez, spark] IS true
>                 notVectorizedReason: Aggregation Function UDF avg parameter 
> expression for GROUPBY operator: Data type 
> struct<count:bigint,sum:double,input:int> of Column[VALUE._col2] not supported
>                 vectorized: false
>             Reduce Operator Tree:
>               Group By Operator
>                 aggregations: sum(VALUE._col0), count(VALUE._col1), 
> avg(VALUE._col2), std(VALUE._col3)
>                 mode: mergepartial
>                 outputColumnNames: _col0, _col1, _col2, _col3
>                 Statistics: Num rows: 1 Data size: 32 Basic stats: COMPLETE 
> Column stats: COMPLETE
>                 File Output Operator
>                   compressed: false
>                   Statistics: Num rows: 1 Data size: 32 Basic stats: COMPLETE 
> Column stats: COMPLETE
>                   table:
>                       input format: 
> org.apache.hadoop.mapred.SequenceFileInputFormat
>                       output format: 
> org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
>                       serde: 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>   Stage: Stage-0
>     Fetch Operator
>       limit: -1
>       Processor Tree:
>         ListSink 
> {code}
> EXPLAIN VECTORIZATION OPERATOR
> Notice the added  TableScan Vectorization, Select Vectorization, Group By 
> Vectorization, Map Join Vectorizatin, Reduce Sink Vectorization sections in 
> this example.
> Notice the nativeConditionsMet detail on why Reduce Vectorization is native.
> {code}
> PLAN VECTORIZATION:
>   enabled: true
>   enabledConditionsMet: [hive.vectorized.execution.enabled IS true]
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-1
>     Tez
> #### A masked pattern was here ####
>       Edges:
>         Map 2 <- Map 1 (BROADCAST_EDGE)
>         Reducer 3 <- Map 2 (SIMPLE_EDGE)
> #### A masked pattern was here ####
>       Vertices:
>         Map 1 
>             Map Operator Tree:
>                 TableScan
>                   alias: a
>                   Statistics: Num rows: 3 Data size: 294 Basic stats: 
> COMPLETE Column stats: NONE
>                   TableScan Vectorization:
>                       native: true
>                       projectedOutputColumns: [0, 1]
>                   Filter Operator
>                     Filter Vectorization:
>                         className: VectorFilterOperator
>                         native: true
> predicate: c2 is not null (type: boolean)
>                     Statistics: Num rows: 3 Data size: 294 Basic stats: 
> COMPLETE Column stats: NONE
>                     Select Operator
>                       expressions: c1 (type: int), c2 (type: char(10))
>                       outputColumnNames: _col0, _col1
>                       Select Vectorization:
>                           className: VectorSelectOperator
>                           native: true
>                           projectedOutputColumns: [0, 1]
>                       Statistics: Num rows: 3 Data size: 294 Basic stats: 
> COMPLETE Column stats: NONE
>                       Reduce Output Operator
>                         key expressions: _col1 (type: char(20))
>                         sort order: +
>                         Map-reduce partition columns: _col1 (type: char(20))
>                         Reduce Sink Vectorization:
>                             className: VectorReduceSinkStringOperator
>                             native: true
>                             nativeConditionsMet: 
> hive.vectorized.execution.reducesink.new.enabled IS true, 
> hive.execution.engine tez IN [tez, spark] IS true, Not ACID UPDATE or DELETE 
> IS true, No buckets IS true, No TopN IS true, Uniform Hash IS true, No 
> DISTINCT columns IS true, BinarySortableSerDe for keys IS true, 
> LazyBinarySerDe for values IS true
>                         Statistics: Num rows: 3 Data size: 294 Basic stats: 
> COMPLETE Column stats: NONE
>                         value expressions: _col0 (type: int)
>             Execution mode: vectorized, llap
>             LLAP IO: all inputs
>             Map Vectorization:
>                 enabled: true
>                 enabledConditionsMet: 
> hive.vectorized.use.vectorized.input.format IS true
>                 groupByVectorOutput: true
>                 inputFileFormats: 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
>                 allNative: true
>                 usesVectorUDFAdaptor: false
>                 vectorized: true
>         Map 2 
>             Map Operator Tree:
>                 TableScan
>                   alias: b
>                   Statistics: Num rows: 3 Data size: 324 Basic stats: 
> COMPLETE Column stats: NONE
>                   TableScan Vectorization:
>                       native: true
>                       projectedOutputColumns: [0, 1]
>                   Filter Operator
>                     Filter Vectorization:
>                         className: VectorFilterOperator
>                         native: true
> predicate: c2 is not null (type: boolean)
>                     Statistics: Num rows: 3 Data size: 324 Basic stats: 
> COMPLETE Column stats: NONE
>                     Select Operator
>                       expressions: c1 (type: int), c2 (type: char(20))
>                       outputColumnNames: _col0, _col1
>                       Select Vectorization:
>                           className: VectorSelectOperator
>                           native: true
>                           projectedOutputColumns: [0, 1]
>                       Statistics: Num rows: 3 Data size: 324 Basic stats: 
> COMPLETE Column stats: NONE
>                       Map Join Operator
>                         condition map:
>                              Inner Join 0 to 1
>                         keys:
>                           0 _col1 (type: char(20))
>                           1 _col1 (type: char(20))
>                         Map Join Vectorization:
>                             className: VectorMapJoinInnerStringOperator
>                             native: true
>                             nativeConditionsMet: 
> hive.vectorized.execution.mapjoin.native.enabled IS true, 
> hive.execution.engine tez IN [tez, spark] IS true, One MapJoin Condition IS 
> true, No nullsafe IS true, Supports Key Types IS true, Not empty key IS true, 
> When Fast Hash Table, then requires no Hybrid Hash Join IS true, Small table 
> vectorizes IS true
>                         outputColumnNames: _col0, _col1, _col2, _col3
>                         input vertices:
>                           0 Map 1
>                         Statistics: Num rows: 3 Data size: 323 Basic stats: 
> COMPLETE Column stats: NONE
>                         Reduce Output Operator
>                           key expressions: _col0 (type: int)
>                           sort order: +
>                           Reduce Sink Vectorization:
>                               className: VectorReduceSinkOperator
>                               native: false
>                               nativeConditionsMet: 
> hive.vectorized.execution.reducesink.new.enabled IS true, 
> hive.execution.engine tez IN [tez, spark] IS true, Not ACID UPDATE or DELETE 
> IS true, No buckets IS true, No TopN IS true, No DISTINCT columns IS true, 
> BinarySortableSerDe for keys IS true, LazyBinarySerDe for values IS true
>                               nativeConditionsNotMet: Uniform Hash IS false
>                           Statistics: Num rows: 3 Data size: 323 Basic stats: 
> COMPLETE Column stats: NONE
>                           value expressions: _col1 (type: char(10)), _col2 
> (type: int), _col3 (type: char(20))
>             Execution mode: vectorized, llap
>             LLAP IO: all inputs
>             Map Vectorization:
>                 enabled: true
>                 enabledConditionsMet: 
> hive.vectorized.use.vectorized.input.format IS true
>                 groupByVectorOutput: true
>                 inputFileFormats: 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
>                 allNative: false
>                 usesVectorUDFAdaptor: false
>                 vectorized: true
>         Reducer 3 
>             Execution mode: vectorized, llap
>             Reduce Vectorization:
>                 enabled: true
>                 enableConditionsMet: hive.vectorized.execution.reduce.enabled 
> IS true, hive.execution.engine tez IN [tez, spark] IS true
>                 groupByVectorOutput: true
>                 allNative: false
>                 usesVectorUDFAdaptor: false
>                 vectorized: true
>             Reduce Operator Tree:
>               Select Operator
>                 expressions: KEY.reducesinkkey0 (type: int), VALUE._col0 
> (type: char(10)), VALUE._col1 (type: int), VALUE._col2 (type: char(20))
>                 outputColumnNames: _col0, _col1, _col2, _col3
>                 Select Vectorization:
>                     className: VectorSelectOperator
>                     native: true
>                     projectedOutputColumns: [0, 1, 2, 3]
>                 Statistics: Num rows: 3 Data size: 323 Basic stats: COMPLETE 
> Column stats: NONE
>                 File Output Operator
>                   compressed: false
>                   File Sink Vectorization:
>                       className: VectorFileSinkOperator
>                       native: false
>                   Statistics: Num rows: 3 Data size: 323 Basic stats: 
> COMPLETE Column stats: NONE
>                   table:
>                       input format: 
> org.apache.hadoop.mapred.SequenceFileInputFormat
>                       output format: 
> org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
>                       serde: 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>   Stage: Stage-0
>     Fetch Operator
>       limit: -1
>       Processor Tree:
>         ListSink
>  {code}
> EXPLAIN VECTORIZATION EXPRESSION
> Notice the predicateExpression in this example.
> {code}
> PLAN VECTORIZATION:
>   enabled: true
>   enabledConditionsMet: [hive.vectorized.execution.enabled IS true]
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-1
>     Tez
> #### A masked pattern was here ####
>       Edges:
>         Reducer 2 <- Map 1 (SIMPLE_EDGE)
> #### A masked pattern was here ####
>       Vertices:
>         Map 1 
>             Map Operator Tree:
>                 TableScan
>                   alias: vector_interval_2
>                   Statistics: Num rows: 2 Data size: 788 Basic stats: 
> COMPLETE Column stats: NONE
>                   TableScan Vectorization:
>                       native: true
>                       projectedOutputColumns: [0, 1, 2, 3, 4, 5]
>                   Filter Operator
>                     Filter Vectorization:
>                         className: VectorFilterOperator
>                         native: true
>                         predicateExpression: FilterExprAndExpr(children: 
> FilterTimestampScalarEqualTimestampColumn(val 2001-01-01 01:02:03.0, col 
> 6)(children: DateColAddIntervalDayTimeScalar(col 1, val 0 01:02:03.000000000) 
> -> 6:timestamp) -> boolean, FilterTimestampScalarNotEqualTimestampColumn(val 
> 2001-01-01 01:02:03.0, col 6)(children: DateColAddIntervalDayTimeScalar(col 
> 1, val 0 01:02:04.000000000) -> 6:timestamp) -> boolean, 
> FilterTimestampScalarLessEqualTimestampColumn(val 2001-01-01 01:02:03.0, col 
> 6)(children: DateColAddIntervalDayTimeScalar(col 1, val 0 01:02:03.000000000) 
> -> 6:timestamp) -> boolean, FilterTimestampScalarLessTimestampColumn(val 
> 2001-01-01 01:02:03.0, col 6)(children: DateColAddIntervalDayTimeScalar(col 
> 1, val 0 01:02:04.000000000) -> 6:timestamp) -> boolean, 
> FilterTimestampScalarGreaterEqualTimestampColumn(val 2001-01-01 01:02:03.0, 
> col 6)(children: DateColSubtractIntervalDayTimeScalar(col 1, val 0 
> 01:02:03.000000000) -> 6:timestamp) -> boolean, 
> FilterTimestampScalarGreaterTimestampColumn(val 2001-01-01 01:02:03.0, col 
> 6)(children: DateColSubtractIntervalDayTimeScalar(col 1, val 0 
> 01:02:04.000000000) -> 6:timestamp) -> boolean, 
> FilterTimestampColEqualTimestampScalar(col 6, val 2001-01-01 
> 01:02:03.0)(children: DateColAddIntervalDayTimeScalar(col 1, val 0 
> 01:02:03.000000000) -> 6:timestamp) -> boolean, 
> FilterTimestampColNotEqualTimestampScalar(col 6, val 2001-01-01 
> 01:02:03.0)(children: DateColAddIntervalDayTimeScalar(col 1, val 0 
> 01:02:04.000000000) -> 6:timestamp) -> boolean, 
> FilterTimestampColGreaterEqualTimestampScalar(col 6, val 2001-01-01 
> 01:02:03.0)(children: DateColAddIntervalDayTimeScalar(col 1, val 0 
> 01:02:03.000000000) -> 6:timestamp) -> boolean, 
> FilterTimestampColGreaterTimestampScalar(col 6, val 2001-01-01 
> 01:02:03.0)(children: DateColAddIntervalDayTimeScalar(col 1, val 0 
> 01:02:04.000000000) -> 6:timestamp) -> boolean, 
> FilterTimestampColLessEqualTimestampScalar(col 6, val 2001-01-01 
> 01:02:03.0)(children: DateColSubtractIntervalDayTimeScalar(col 1, val 0 
> 01:02:03.000000000) -> 6:timestamp) -> boolean, 
> FilterTimestampColLessTimestampScalar(col 6, val 2001-01-01 
> 01:02:03.0)(children: DateColSubtractIntervalDayTimeScalar(col 1, val 0 
> 01:02:04.000000000) -> 6:timestamp) -> boolean, 
> FilterTimestampColEqualTimestampColumn(col 0, col 6)(children: 
> DateColAddIntervalDayTimeScalar(col 1, val 0 01:02:03.000000000) -> 
> 6:timestamp) -> boolean, FilterTimestampColNotEqualTimestampColumn(col 0, col 
> 6)(children: DateColAddIntervalDayTimeScalar(col 1, val 0 01:02:04.000000000) 
> -> 6:timestamp) -> boolean, FilterTimestampColLessEqualTimestampColumn(col 0, 
> col 6)(children: DateColAddIntervalDayTimeScalar(col 1, val 0 
> 01:02:03.000000000) -> 6:timestamp) -> boolean, 
> FilterTimestampColLessTimestampColumn(col 0, col 6)(children: 
> DateColAddIntervalDayTimeScalar(col 1, val 0 01:02:04.000000000) -> 
> 6:timestamp) -> boolean, FilterTimestampColGreaterEqualTimestampColumn(col 0, 
> col 6)(children: DateColSubtractIntervalDayTimeScalar(col 1, val 0 
> 01:02:03.000000000) -> 6:timestamp) -> boolean, 
> FilterTimestampColGreaterTimestampColumn(col 0, col 6)(children: 
> DateColSubtractIntervalDayTimeScalar(col 1, val 0 01:02:04.000000000) -> 
> 6:timestamp) -> boolean) -> boolean
>                     predicate: ((2001-01-01 01:02:03.0 = (dt + 0 
> 01:02:03.000000000)) and (2001-01-01 01:02:03.0 <> (dt + 0 
> 01:02:04.000000000)) and (2001-01-01 01:02:03.0 <= (dt + 0 
> 01:02:03.000000000)) and (2001-01-01 01:02:03.0 < (dt + 0 
> 01:02:04.000000000)) and (2001-01-01 01:02:03.0 >= (dt - 0 
> 01:02:03.000000000)) and (2001-01-01 01:02:03.0 > (dt - 0 
> 01:02:04.000000000)) and ((dt + 0 01:02:03.000000000) = 2001-01-01 
> 01:02:03.0) and ((dt + 0 01:02:04.000000000) <> 2001-01-01 01:02:03.0) and 
> ((dt + 0 01:02:03.000000000) >= 2001-01-01 01:02:03.0) and ((dt + 0 
> 01:02:04.000000000) > 2001-01-01 01:02:03.0) and ((dt - 0 01:02:03.000000000) 
> <= 2001-01-01 01:02:03.0) and ((dt - 0 01:02:04.000000000) < 2001-01-01 
> 01:02:03.0) and (ts = (dt + 0 01:02:03.000000000)) and (ts <> (dt + 0 
> 01:02:04.000000000)) and (ts <= (dt + 0 01:02:03.000000000)) and (ts < (dt + 
> 0 01:02:04.000000000)) and (ts >= (dt - 0 01:02:03.000000000)) and (ts > (dt 
> - 0 01:02:04.000000000))) (type: boolean)
>                     Statistics: Num rows: 1 Data size: 394 Basic stats: 
> COMPLETE Column stats: NONE
>                     Select Operator
>                       expressions: ts (type: timestamp)
>                       outputColumnNames: _col0
>                       Select Vectorization:
>                           className: VectorSelectOperator
>                           native: true
>                           projectedOutputColumns: [0]
>                       Statistics: Num rows: 1 Data size: 394 Basic stats: 
> COMPLETE Column stats: NONE
>                       Reduce Output Operator
>                         key expressions: _col0 (type: timestamp)
>                         sort order: +
>                         Reduce Sink Vectorization:
>                             className: VectorReduceSinkOperator
>                             native: false
>                             nativeConditionsMet: 
> hive.vectorized.execution.reducesink.new.enabled IS true, 
> hive.execution.engine tez IN [tez, spark] IS true, Not ACID UPDATE or DELETE 
> IS true, No buckets IS true, No TopN IS true, No DISTINCT columns IS true, 
> BinarySortableSerDe for keys IS true, LazyBinarySerDe for values IS true
>                             nativeConditionsNotMet: Uniform Hash IS false
>                         Statistics: Num rows: 1 Data size: 394 Basic stats: 
> COMPLETE Column stats: NONE
>             Execution mode: vectorized, llap
>             LLAP IO: all inputs
>             Map Vectorization:
>                 enabled: true
>                 enabledConditionsMet: 
> hive.vectorized.use.vectorized.input.format IS true
>                 groupByVectorOutput: true
>                 inputFileFormats: 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
>                 allNative: false
>                 usesVectorUDFAdaptor: false
>                 vectorized: true
>         Reducer 2 
> ... 
> {code}
> The standard @Explain Annotation Type is used.  A new 'vectorization' 
> annotation marks each new class and method.
> Works for FORMATTED, like other non-vectorization EXPLAIN variations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HIVE-11394) Enhance EXPLAIN display for vectorization

Reply via email to