[jira] [Updated] (HIVE-2111) NullPointerException on select * with table using RegexSerDe and partitions

Marc Harris (JIRA) Tue, 12 Apr 2011 07:35:48 -0700

     [ https://issues.apache.org/jira/browse/HIVE-2111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Marc Harris updated HIVE-2111:
------------------------------

    Description: 
When querying a partitioned table that uses RegexSerDe, select with explicit 
columns works, but "select *" results in a NullPointerException.

To reproduce:

1) create a file named foo.txt containing the following text (notice the blank line):
====start====
fillerdatafillerdatafiller

fillerdata2fillerdata2filler
=====end=====

2) copy the file to hdfs:
hadoop dfs -put foo.txt test/part1=x/foo.txt

3) run the following hive commands to create a table:

add jar s3://elasticmapreduce/samples/hive/jars/hive_contrib.jar;

drop table test;

create external table test(col1 STRING, col2 STRING) 
partitioned by (part1 STRING) 
row format serde 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe' 
with serdeproperties ( "input.regex" = "^\(.*data\)\(.*data\).*$") 
stored as textfile 
location 'hdfs:///user/hadoop/test';

alter table test add partition (part1='x');

(Note that the text processor seems to have mangled the regex a bit. Inside 
each pair of parentheses should be dot star data. After the second pair of 
parentheses should be dot star dollar, i.e. ^(.*data)(.*data).*$ ).

4) select from it with explicit columns:
select part1, col1, col2 from test;
outputs:
OK
x       fillerdata      fillerdata
x       NULL    NULL
x       fillerdata      2fillerdata
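
For reference, the column values above can be checked outside Hive. The following is a minimal sketch, assuming RegexSerDe maps each capturing group to one column and yields NULLs for a line that does not match; the class and method names are hypothetical and this is not the SerDe's actual code. It applies the regex as described in the note to the three lines of foo.txt:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexSerDeSketch {
    // The regex as described in the note: two capturing groups, then ".*$".
    static final Pattern INPUT_REGEX = Pattern.compile("^(.*data)(.*data).*$");

    // Returns {col1, col2} for a matching line, or {null, null} otherwise
    // (assumption: this mirrors the NULL columns Hive prints for the blank line).
    static String[] parse(String line) {
        Matcher m = INPUT_REGEX.matcher(line);
        if (!m.matches()) {
            return new String[] {null, null};
        }
        return new String[] {m.group(1), m.group(2)};
    }

    public static void main(String[] args) {
        String[] lines = {
            "fillerdatafillerdatafiller",
            "",
            "fillerdata2fillerdata2filler"
        };
        for (String line : lines) {
            String[] cols = parse(line);
            // "x" stands in for the partition column part1.
            System.out.println("x\t" + cols[0] + "\t" + cols[1]);
        }
    }
}
```

The first group is greedy but has to backtrack to the first "data" so that the second group can still match, which is why the second line yields "fillerdata" and "2fillerdata".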

5) select from it with * columns
select * from test;
outputs:

Failed with exception java.io.IOException:java.lang.NullPointerException
11/04/12 14:28:27 ERROR CliDriver: Failed with exception java.io.IOException:java.lang.NullPointerException
java.io.IOException: java.lang.NullPointerException
        at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:149)
        at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1039)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:172)
        at org.apache.hadoop.hive.cli.CliDriver.processLineInternal(CliDriver.java:228)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:209)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:398)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Caused by: java.lang.NullPointerException
        at java.util.ArrayList.addAll(ArrayList.java:472)
        at org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector.getStructFieldsDataAsList(UnionStructObjectInspector.java:144)
        at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:357)
        at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:141)
        ... 10 more
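
The innermost frame is ArrayList.addAll, which throws a NullPointerException when passed a null collection. One plausible reading of the trace (an assumption, not confirmed in this report) is that the field list for the blank, non-matching line comes back as null and is then merged with the partition columns. The sketch below only illustrates that failure mode; unionFields and the class name are hypothetical stand-ins, not Hive's UnionStructObjectInspector code:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class UnionFieldsSketch {
    // Hypothetical stand-in: concatenates per-part field lists into one row.
    static List<Object> unionFields(List<List<Object>> parts) {
        List<Object> row = new ArrayList<>();
        for (List<Object> part : parts) {
            row.addAll(part); // throws NullPointerException when part is null
        }
        return row;
    }

    // Reports whether merging the given parts fails with a NullPointerException.
    static boolean throwsNpe(List<List<Object>> parts) {
        try {
            unionFields(parts);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<Object> partitionCols = Arrays.<Object>asList("x");
        List<Object> matchedRow = Arrays.<Object>asList("fillerdata", "fillerdata");

        // A matched row merges cleanly with the partition columns.
        System.out.println(unionFields(Arrays.asList(matchedRow, partitionCols)));

        // A null field list (assumed here for the blank line) triggers the NPE.
        System.out.println(throwsNpe(Arrays.asList(null, partitionCols)));
    }
}
```

If this reading is right, it would also be consistent with the explicit-column query succeeding, since that query prints NULLs for the blank line instead of merging field lists in the fetch path.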


  was:
When querying against a table that is partitioned, and uses RegexSerde, select 
with explicit columns works, but "select *" results in a NullPointerException

To reproduce:

1) create a table containing the following text (notice the blank line):
====start====
fillerdatafillerdatafiller

fillerdata2fillerdata2filler
=====end=====

2) copy the file to hdfs:
hadoop dfs -put foo.txt test/part1=x/foo.txt

3) run the following hive commands to create a table:

add jar s3://elasticmapreduce/samples/hive/jars/hive_contrib.jar;

drop table test;

create external table test(col1 STRING, col2 STRING) 
partitioned by (part1 STRING) 
row format serde 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe' 
with serdeproperties ( "input.regex" = "^\(.*data\)(.*data\).*$") 
stored as textfile 
location 'hdfs:///user/hadoop/test';

alter table test add partition (part1='x');

4) select from it with explicit columns:
select part1, col1, col2 from test;
outputs:
OK
x       fillerdata      fillerdata
x       NULL    NULL
x       fillerdata      2fillerdata

5) select from it with * columns
select * from test;
outputs:

Failed with exception java.io.IOException:java.lang.NullPointerException
11/04/12 14:28:27 ERROR CliDriver: Failed with exception java.io.IOException:java.lang.NullPointerException
java.io.IOException: java.lang.NullPointerException
        at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:149)
        at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1039)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:172)
        at org.apache.hadoop.hive.cli.CliDriver.processLineInternal(CliDriver.java:228)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:209)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:398)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Caused by: java.lang.NullPointerException
        at java.util.ArrayList.addAll(ArrayList.java:472)
        at org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector.getStructFieldsDataAsList(UnionStructObjectInspector.java:144)
        at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:357)
        at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:141)
        ... 10 more



> NullPointerException on select * with table using RegexSerDe and partitions
> ---------------------------------------------------------------------------
>
>                 Key: HIVE-2111
>                 URL: https://issues.apache.org/jira/browse/HIVE-2111
>             Project: Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>    Affects Versions: 0.7.0
>         Environment: Amazon Elastic Mapreduce
>            Reporter: Marc Harris
>
> When querying a partitioned table that uses RegexSerDe, select with 
> explicit columns works, but "select *" results in a 
> NullPointerException.
> To reproduce:
> 1) create a file named foo.txt containing the following text (notice the blank line):
> ====start====
> fillerdatafillerdatafiller
> fillerdata2fillerdata2filler
> =====end=====
> 2) copy the file to hdfs:
> hadoop dfs -put foo.txt test/part1=x/foo.txt
> 3) run the following hive commands to create a table:
> add jar s3://elasticmapreduce/samples/hive/jars/hive_contrib.jar;
> drop table test;
> create external table test(col1 STRING, col2 STRING) 
> partitioned by (part1 STRING) 
> row format serde 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe' 
> with serdeproperties ( "input.regex" = "^\(.*data\)\(.*data\).*$") 
> stored as textfile 
> location 'hdfs:///user/hadoop/test';
> alter table test add partition (part1='x');
> (Note that the text processor seems to have mangled the regex a bit. Inside 
> each pair of parentheses should be dot star data. After the second pair of 
> parentheses should be dot star dollar, i.e. ^(.*data)(.*data).*$ ).
> 4) select from it with explicit columns:
> select part1, col1, col2 from test;
> outputs:
> OK
> x     fillerdata      fillerdata
> x     NULL    NULL
> x     fillerdata      2fillerdata
> 5) select from it with * columns
> select * from test;
> outputs:
> Failed with exception java.io.IOException:java.lang.NullPointerException
> 11/04/12 14:28:27 ERROR CliDriver: Failed with exception java.io.IOException:java.lang.NullPointerException
> java.io.IOException: java.lang.NullPointerException
>       at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:149)
>       at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1039)
>       at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:172)
>       at org.apache.hadoop.hive.cli.CliDriver.processLineInternal(CliDriver.java:228)
>       at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:209)
>       at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:398)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>       at java.lang.reflect.Method.invoke(Method.java:597)
>       at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> Caused by: java.lang.NullPointerException
>       at java.util.ArrayList.addAll(ArrayList.java:472)
>       at org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector.getStructFieldsDataAsList(UnionStructObjectInspector.java:144)
>       at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:357)
>       at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:141)
>       ... 10 more

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
