[ https://issues.apache.org/jira/browse/HIVE-2111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Marc Harris updated HIVE-2111:
------------------------------

    Description:
When querying against a table that is partitioned and uses RegexSerDe, select with explicit columns works, but "select *" results in a NullPointerException.

To reproduce:

1) create a file containing the following text (notice the blank line):
====start====
fillerdatafillerdatafiller

fillerdata2fillerdata2filler
=====end=====

2) copy the file to hdfs:
    hadoop dfs -put foo.txt test/part1=x/foo.txt

3) run the following hive commands to create a table:
    add jar s3://elasticmapreduce/samples/hive/jars/hive_contrib.jar;
    drop table test;
    create external table test(col1 STRING, col2 STRING)
    partitioned by (part1 STRING)
    row format serde 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
    with serdeproperties ( "input.regex" = "^\(.*data\)\(.*data\).*$")
    stored as textfile
    location 'hdfs:///user/hadoop/test';
    alter table test add partition (part1='x');

(Note that the text processor seems to have mangled the regex a bit. Inside each pair of parentheses should be dot star data. After the second pair of parentheses should be dot star dollar.)

4) select from it with explicit columns:
    select part1, col1, col2 from test;
outputs:
    OK
    x    fillerdata    fillerdata
    x    NULL          NULL
    x    fillerdata    2fillerdata

5) select from it with * columns:
    select * from test;
outputs:
Failed with exception java.io.IOException:java.lang.NullPointerException
11/04/12 14:28:27 ERROR CliDriver: Failed with exception java.io.IOException:java.lang.NullPointerException
java.io.IOException: java.lang.NullPointerException
        at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:149)
        at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1039)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:172)
        at org.apache.hadoop.hive.cli.CliDriver.processLineInternal(CliDriver.java:228)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:209)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:398)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Caused by: java.lang.NullPointerException
        at java.util.ArrayList.addAll(ArrayList.java:472)
        at org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector.getStructFieldsDataAsList(UnionStructObjectInspector.java:144)
        at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:357)
        at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:141)
        ... 10 more


  was:
When querying against a table that is partitioned, and uses RegexSerde, select with explicit columns works, but "select *" results in a NullPointerException

To reproduce:

1) create a table containing the following text (notice the blank line):
====start====
fillerdatafillerdatafiller

fillerdata2fillerdata2filler
=====end=====

2) copy the file to hdfs:
    hadoop dfs -put foo.txt test/part1=x/foo.txt

3) run the following hive commands to create a table:
    add jar s3://elasticmapreduce/samples/hive/jars/hive_contrib.jar;
    drop table test;
    create external table test(col1 STRING, col2 STRING)
    partitioned by (part1 STRING)
    row format serde 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
    with serdeproperties ( "input.regex" = "^\(.*data\)(.*data\).*$")
    stored as textfile
    location 'hdfs:///user/hadoop/test';
    alter table test add partition (part1='x');

4) select from it with explicit columns:
    select part1, col1, col2 from test;
outputs:
    OK
    x    fillerdata    fillerdata
    x    NULL          NULL
    x    fillerdata    2fillerdata

5) select from it with * columns
    select * from test;
outputs:
Failed with exception java.io.IOException:java.lang.NullPointerException
11/04/12 14:28:27 ERROR CliDriver: Failed with exception java.io.IOException:java.lang.NullPointerException
java.io.IOException: java.lang.NullPointerException
        at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:149)
        at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1039)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:172)
        at org.apache.hadoop.hive.cli.CliDriver.processLineInternal(CliDriver.java:228)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:209)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:398)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Caused by: java.lang.NullPointerException
        at java.util.ArrayList.addAll(ArrayList.java:472)
        at org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector.getStructFieldsDataAsList(UnionStructObjectInspector.java:144)
        at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:357)
        at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:141)
        ... 10 more


> NullPointerException on select * with table using RegexSerDe and partitions
> ---------------------------------------------------------------------------
>
>                 Key: HIVE-2111
>                 URL: https://issues.apache.org/jira/browse/HIVE-2111
>             Project: Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>    Affects Versions: 0.7.0
>         Environment: Amazon Elastic Mapreduce
>            Reporter: Marc Harris
>
> When querying against a table that is partitioned and uses RegexSerDe,
> select with explicit columns works, but "select *" results in a
> NullPointerException.
>
> To reproduce:
>
> 1) create a file containing the following text (notice the blank line):
> ====start====
> fillerdatafillerdatafiller
>
> fillerdata2fillerdata2filler
> =====end=====
>
> 2) copy the file to hdfs:
>     hadoop dfs -put foo.txt test/part1=x/foo.txt
>
> 3) run the following hive commands to create a table:
>     add jar s3://elasticmapreduce/samples/hive/jars/hive_contrib.jar;
>     drop table test;
>     create external table test(col1 STRING, col2 STRING)
>     partitioned by (part1 STRING)
>     row format serde 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
>     with serdeproperties ( "input.regex" = "^\(.*data\)\(.*data\).*$")
>     stored as textfile
>     location 'hdfs:///user/hadoop/test';
>     alter table test add partition (part1='x');
>
> (Note that the text processor seems to have mangled the regex a bit. Inside
> each pair of parentheses should be dot star data. After the second pair of
> parentheses should be dot star dollar.)
>
> 4) select from it with explicit columns:
>     select part1, col1, col2 from test;
> outputs:
>     OK
>     x    fillerdata    fillerdata
>     x    NULL          NULL
>     x    fillerdata    2fillerdata
>
> 5) select from it with * columns:
>     select * from test;
> outputs:
> Failed with exception java.io.IOException:java.lang.NullPointerException
> 11/04/12 14:28:27 ERROR CliDriver: Failed with exception java.io.IOException:java.lang.NullPointerException
> java.io.IOException: java.lang.NullPointerException
>         at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:149)
>         at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1039)
>         at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:172)
>         at org.apache.hadoop.hive.cli.CliDriver.processLineInternal(CliDriver.java:228)
>         at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:209)
>         at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:398)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> Caused by: java.lang.NullPointerException
>         at java.util.ArrayList.addAll(ArrayList.java:472)
>         at org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector.getStructFieldsDataAsList(UnionStructObjectInspector.java:144)
>         at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:357)
>         at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:141)
>         ... 10 more

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
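
For readers who want to check the expected column values without a Hive cluster, the following is a minimal sketch of how the intended pattern (as described in the reporter's note, without the stray backslashes) splits the three sample lines. It uses Python's `re` module rather than the Java regex engine that RegexSerDe runs on, so it only approximates the SerDe; the names `PATTERN`, `LINES`, and `parse` are hypothetical, and mapping a non-matching line to a pair of nulls is an assumption taken from the `NULL NULL` row in the report. It illustrates the explicit-column output only and does not reproduce the NullPointerException.

```python
import re

# Intended pattern per the reporter's note: "(dot star data)(dot star data) dot star dollar".
# Assumption: the backslashes shown in the mail are a rendering artifact.
PATTERN = re.compile(r"^(.*data)(.*data).*$")

# The three lines of the sample file, including the blank middle line.
LINES = ["fillerdatafillerdatafiller", "", "fillerdata2fillerdata2filler"]


def parse(line):
    """Return (col1, col2), or (None, None) when the line does not match."""
    match = PATTERN.match(line)
    return match.groups() if match else (None, None)


rows = [parse(line) for line in LINES]

# Print in the same shape as the "select part1, col1, col2" output,
# with the partition value "x" prepended and None shown as NULL.
for col1, col2 in rows:
    print("x", col1 or "NULL", col2 or "NULL", sep="\t")
```

The blank line is the only input that fails to match, which lines up with the single all-NULL row in the explicit-column query.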