Hive Problem in Pig generated Parquet file schema in CREATE EXTERNAL TABLE (e.g. bag::col1)

Jianshi Huang Fri, 05 Dec 2014 16:43:49 -0800

Hi,

I had to use Pig for some preprocessing and to generate Parquet files for
Spark to consume.


However, due to a limitation in Pig, the column names in the generated
schema carry Pig's alias prefix, e.g.
sorted::id, sorted::cre_ts, ...

I tried to put the schema inside CREATE EXTERNAL TABLE, e.g.

  create external table pmt (
    sorted::id bigint
  )
  stored as parquet
  location '...'

Obviously that didn't work. I also tried removing the sorted:: prefix from
the column names, but then the resulting rows contain only nulls.
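
To make the renaming concrete, here is a minimal sketch of the prefix
stripping described above. It only computes the name mapping (everything up
to the last :: is dropped); the helper names strip_pig_prefix and
rename_columns are hypothetical, and nothing here touches Hive, Spark, or the
Parquet files themselves.

```python
def strip_pig_prefix(name):
    """Return the field name without any leading 'alias::' qualifier."""
    # rsplit keeps only the part after the last '::'; names without
    # a prefix are returned unchanged.
    return name.rsplit("::", 1)[-1]


def rename_columns(names):
    """Map each Pig-qualified name to its bare name, refusing collisions."""
    mapping = {}
    seen = {}
    for original in names:
        bare = strip_pig_prefix(original)
        if bare in seen:
            # Two qualified names (e.g. a::id and b::id) would collapse
            # into the same bare name, so stop instead of guessing.
            raise ValueError(
                "%r and %r both map to %r" % (seen[bare], original, bare)
            )
        seen[bare] = original
        mapping[original] = bare
    return mapping


if __name__ == "__main__":
    # Column names taken from the example schema in this message.
    print(rename_columns(["sorted::id", "sorted::cre_ts"]))
```

Using the bare names only in the table definition leaves the names stored in
the Parquet files unchanged. If the reader matches columns by name (an
assumption, not verified here), that mismatch would be consistent with the
all-null rows.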

Any idea how to create a table in HiveContext from these Parquet files?

Thanks,
Jianshi
-- 
Jianshi Huang

LinkedIn: jianshi
Twitter: @jshuang
Github & Blog: http://huangjs.github.com/
