[ https://issues.apache.org/jira/browse/HIVE-5871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Rui Li updated HIVE-5871:
-------------------------
Description:
By default, Hive allows only a single character as the field delimiter.
Although RegexSerDe can be used to specify a multiple-character delimiter, it
can be daunting to use, especially for new users.
The patch adds a new SerDe named MultiDelimitSerDe. With MultiDelimitSerDe,
users can specify a multiple-character field delimiter when creating tables,
using syntax that closely resembles an ordinary table creation. For example:
{code}
CREATE TABLE test (id STRING, hivearray ARRAY<BINARY>, hivemap MAP<STRING,INT>)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe'
WITH SERDEPROPERTIES ("field.delim"="[,]", "collection.delim"=":", "mapkey.delim"="@");
{code}
where {{field.delim}} is the field delimiter, and {{collection.delim}} and
{{mapkey.delim}} are the delimiters for collection items and key-value pairs,
respectively. Among these delimiters, {{field.delim}} is mandatory and can
consist of multiple characters, while {{collection.delim}} and
{{mapkey.delim}} are optional and support only a single character.
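To make the roles of the three delimiters concrete, here is a minimal
plain-Python sketch of how one text row for the example table might be split.
It is not the SerDe's implementation: the sample row and the {{parse_row}}
helper are hypothetical, the field delimiter is assumed to be matched
literally, and escaping, NULL handling and decoding of binary values are
ignored.

```python
# Illustrative only: a plain-Python sketch of how one text row could be split
# with the delimiters from the example table. This is not the SerDe's code.

FIELD_DELIM = "[,]"      # field.delim: may be several characters
COLLECTION_DELIM = ":"   # collection.delim: single character
MAPKEY_DELIM = "@"       # mapkey.delim: single character


def parse_row(line):
    """Split a row into (id, hivearray, hivemap) for the example schema."""
    # str.split treats the delimiter as a literal string, not a regex,
    # so "[,]" is matched character for character.
    id_field, array_field, map_field = line.split(FIELD_DELIM)

    # Array items are separated by the collection delimiter.
    # They are kept as raw strings here (no binary decoding).
    hivearray = array_field.split(COLLECTION_DELIM)

    # Map entries are also separated by the collection delimiter; within
    # each entry the map key delimiter separates the key from the value.
    hivemap = {}
    for entry in map_field.split(COLLECTION_DELIM):
        key, value = entry.split(MAPKEY_DELIM, 1)
        hivemap[key] = int(value)

    return id_field, hivearray, hivemap


row = "1[,]a:b:c[,]k1@10:k2@20"   # hypothetical sample row
print(parse_row(row))
```

In this sketch the three fields are first separated on the multi-character
{{[,]}}, and only then are the single-character delimiters applied inside the
array and map fields.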
To use MultiDelimitSerDe, you have to add the hive-contrib jar to the class
path, e.g. with the {{add jar}} command.
was:
By default, hive only allows user to use single character as field delimiter.
Although there's RegexSerDe to specify multiple-character delimiter, it can be
daunting to use, especially for amateurs.
The patch adds a new SerDe named MultiDelimitSerDe. With MultiDelimitSerDe,
users can specify a multiple-character field delimiter when creating tables, in
a way most similar to typical table creations. For example:
> Use multiple-characters as field delimiter
> ------------------------------------------
>
> Key: HIVE-5871
> URL: https://issues.apache.org/jira/browse/HIVE-5871
> Project: Hive
> Issue Type: Improvement
> Components: Contrib
> Affects Versions: 0.12.0
> Reporter: Rui Li
> Assignee: Rui Li
> Labels: TODOC14
> Fix For: 0.14.0
>
> Attachments: HIVE-5871.2.patch, HIVE-5871.3.patch, HIVE-5871.4.patch,
> HIVE-5871.5.patch, HIVE-5871.6.patch, HIVE-5871.patch