Vihang Karajgaonkar created HIVE-17876:
------------------------------------------

Summary: row.serde.deserialize broken for non-vectorized file inputformats
Key: HIVE-17876
URL: https://issues.apache.org/jira/browse/HIVE-17876
Project: Hive
Issue Type: Bug
Affects Versions: 3.0.0, 2.4.0
Reporter: Vihang Karajgaonkar

Vectorization using {{hive.vectorized.use.row.serde.deserialize}} errors out for both Orc and Parquet input format.

Steps to reproduce:
{noformat}
set hive.fetch.task.conversion=none;
set hive.vectorized.use.row.serde.deserialize=true;
set hive.vectorized.input.format.excludes=org.apache.hadoop.hive.ql.io.orc.OrcInputFormat;
set hive.vectorized.execution.enabled=true;
explain vectorization select * from alltypesorc where cint = 528534767 limit 10;
+----------------------------------------------------+
| Explain |
+----------------------------------------------------+
| PLAN VECTORIZATION: |
|   enabled: true |
|   enabledConditionsMet: [hive.vectorized.execution.enabled IS true] |
| |
| STAGE DEPENDENCIES: |
|   Stage-1 is a root stage |
|   Stage-0 depends on stages: Stage-1 |
| |
| STAGE PLANS: |
|   Stage: Stage-1 |
|     Map Reduce |
|       Map Operator Tree: |
|           TableScan |
|             alias: alltypesorc |
|             Statistics: Num rows: 12288 Data size: 2641964 Basic stats: COMPLETE Column stats: NONE |
|             Filter Operator |
|               predicate: (cint = 528534767) (type: boolean) |
|               Statistics: Num rows: 6144 Data size: 1320982 Basic stats: COMPLETE Column stats: NONE |
|               Select Operator |
|                 expressions: ctinyint (type: tinyint), csmallint (type: smallint), 528534767 (type: int), cbigint (type: bigint), cfloat (type: float), cdouble (type: double), cstring1 (type: string), cstring2 (type: string), ctimestamp1 (type: timestamp), ctimestamp2 (type: timestamp), cboolean1 (type: boolean), cboolean2 (type: boolean) |
|                 outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11 |
|                 Statistics: Num rows: 6144 Data size: 1320982 Basic stats: COMPLETE Column stats: NONE |
|                 Limit |
|                   Number of rows: 10 |
|                   Statistics: Num rows: 10 Data size: 2150 Basic stats: COMPLETE Column stats: NONE |
|                   File Output Operator |
|                     compressed: false |
|                     Statistics: Num rows: 10 Data size: 2150 Basic stats: COMPLETE Column stats: NONE |
|                     table: |
|                         input format: org.apache.hadoop.mapred.SequenceFileInputFormat |
|                         output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat |
|                         serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe |
|       Execution mode: vectorized |
|       Map Vectorization: |
|           enabled: true |
|           enabledConditionsMet: hive.vectorized.use.row.serde.deserialize IS true |
|           groupByVectorOutput: true |
|           inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat |
|           allNative: false |
|           usesVectorUDFAdaptor: false |
|           vectorized: true |
| |
|   Stage: Stage-0 |
|     Fetch Operator |
|       limit: 10 |
|       Processor Tree: |
|         ListSink |
| |
+----------------------------------------------------+
48 rows selected (0.742 seconds)
0: jdbc:hive2://localhost:10000/default>
0: jdbc:hive2://localhost:10000/default> select * from alltypesorc where cint = 528534767 limit 10;
Error: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=2)
{noformat}

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)