I created a Parquet file and exposed it to Hive through an external table,
but selecting from the table always returns NULL.


To show the symptom, I created the following data set. Each record has
only two fields, __PRIMARY_KEY__ and nullableInt. The schema, represented
in Avro, is the following (I converted the data into Parquet with the
avro-parquet converter):

{"type":"record","name":"mytest","namespace":"yy.com","doc":"","fields":[{"name":"__PRIMARY_KEY__","type":"string","doc":""},{"name":"nullableInt","type":["int","null"],"doc":""}],"version":"1424373511441"}



The following is the Hive table definition over the Parquet data. I also
attached the sample Parquet file.

Thanks!
yang


drop table mytest;
CREATE EXTERNAL TABLE IF NOT EXISTS mytest
(
  PRIMARY_KEY String,
  nullableInt int
)
STORED AS PARQUET
LOCATION '/user/myusername/camus/topics/mytest/hourly/2015/02/19/11/'
;

select * from mytest limit 10;
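
For reference, here is a minimal sketch of a sanity check that compares the
field names in the Avro schema above with the column names in the table
definition above. It rests on an assumption I have not confirmed for this
setup: that Hive resolves Parquet columns by name (case-insensitively)
rather than by position. The helper name mismatched_columns is hypothetical,
and the column list is copied by hand from the CREATE TABLE statement.

```python
import json

# Avro schema from this post, as a single JSON string.
AVRO_SCHEMA = (
    '{"type":"record","name":"mytest","namespace":"yy.com","doc":"",'
    '"fields":[{"name":"__PRIMARY_KEY__","type":"string","doc":""},'
    '{"name":"nullableInt","type":["int","null"],"doc":""}],'
    '"version":"1424373511441"}'
)

# Column names copied by hand from the CREATE EXTERNAL TABLE statement.
HIVE_COLUMNS = ["PRIMARY_KEY", "nullableInt"]


def mismatched_columns(avro_schema_json, hive_columns):
    """Return Hive columns that have no same-named field in the Avro schema.

    Names are compared case-insensitively (assumption: this mirrors how
    Hive matches table columns to Parquet fields).
    """
    fields = {f["name"].lower() for f in json.loads(avro_schema_json)["fields"]}
    return [c for c in hive_columns if c.lower() not in fields]


print(mismatched_columns(AVRO_SCHEMA, HIVE_COLUMNS))
```

Any column it reports has no same-named field in the file schema, so under
the assumption above it would be a candidate for reading back as NULL. It
does not inspect the Parquet file itself.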

Attachment: mytest.1.0.4.8.1424372400000.parquet
Description: Binary data
