Rekha Joshi created HADOOP-10084:
------------------------------------
Summary: Hcat alter table add partition: add skip header/row
feature
Key: HADOOP-10084
URL: https://issues.apache.org/jira/browse/HADOOP-10084
Project: Hadoop Common
Issue Type: Improvement
Components: conf
Affects Versions: 0.5.0
Reporter: Rekha Joshi
Priority: Minor
Creating an HCatalog table and then adding partitions with alter table add
partition is the most commonly used approach. However, incoming files sometimes
arrive with a header row of column names.
In such cases it would be a useful feature to be able to skip the header or a
number of leading rows.
Suggestions below:
hcat "alter table rawevents add partition (ds='20100819') location
'hdfs://data/rawevents/20100819/data' -skip header"
hcat "alter table rawevents add partition (ds='20100819') location
'hdfs://data/rawevents/20100819/data' -skip [n]"
hcat "alter table rawevents add partition (ds='20100819') location
'hdfs://data/rawevents/20100819/data'" -DskipRow=1
-- alternatively, a bounded range (rows) could select which rows go into the table
hcat "alter table rawevents add partition (ds='20100819') location
'hdfs://data/rawevents/20100819/data' -rows[2:]" // from row 2 (first data row) through the last row
hcat "alter table rawevents add partition (ds='20100819') location
'hdfs://data/rawevents/20100819/data' -rows[2:100]" // from row 2 (first data row) through row 100
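To make the intended semantics of the suggestions above concrete, here is a
minimal sketch in Python. It assumes row numbers are 1-based and that the upper
bound in rows[2:100] is inclusive; neither is settled by this proposal. The
names select_rows and skip_rows are hypothetical and are not part of hcat or
Hive.

```python
from itertools import islice
from typing import Iterable, Iterator, Optional


def select_rows(lines: Iterable[str], start: int = 1,
                end: Optional[int] = None) -> Iterator[str]:
    """Yield rows start..end (assumed 1-based and inclusive).

    end=None means "through the last row", mirroring the proposed rows[2:].
    """
    if start < 1:
        raise ValueError("start must be >= 1")
    if end is not None and end < start:
        raise ValueError("end must be >= start")
    # islice takes a 0-based start and an exclusive stop, so a 1-based
    # inclusive range [start, end] maps to islice(lines, start - 1, end).
    return islice(lines, start - 1, end)


def skip_rows(lines: Iterable[str], n: int = 1) -> Iterator[str]:
    """Drop the first n rows, mirroring the proposed '-skip [n]' / skipRow=n."""
    return select_rows(lines, start=n + 1)


if __name__ == "__main__":
    # Hypothetical sample data: one header row followed by three data rows.
    sample = ["id,name", "1,a", "2,b", "3,c"]
    print(list(skip_rows(sample, 1)))       # header dropped
    print(list(select_rows(sample, 2, 3)))  # rows 2 and 3 only
```

Under these assumptions, "-skip header" is the same as "-skip 1", and
"-rows[2:]" gives the same result as skipping one row.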
Is the correct place for this feature Hive or HCatalog? Or could it be handled
in hcat with a -D option?
Thanks
Rekha