[ https://issues.apache.org/jira/browse/HIVE-5871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Rui Li updated HIVE-5871:
-------------------------
    Description:
By default, Hive allows only a single character as the field delimiter. Although RegexSerDe can handle a multiple-character delimiter, it can be daunting to use, especially for beginners.

The patch adds a new SerDe named MultiDelimitSerDe. With MultiDelimitSerDe, users can specify a multiple-character field delimiter when creating tables, in a way that closely resembles a typical table creation. For example:
{code}
create table test (id string,hivearray array<binary>,hivemap map<string,int>)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe'
WITH SERDEPROPERTIES ("field.delim"="[,]","collection.delim"=":","mapkey.delim"="@");
{code}
where {{field.delim}} is the field delimiter, and {{collection.delim}} and {{mapkey.delim}} are the delimiters for collection items and key-value pairs, respectively. Among these delimiters, {{field.delim}} is mandatory and can consist of multiple characters, while {{collection.delim}} and {{mapkey.delim}} are optional and support only a single character.

To use MultiDelimitSerDe, you have to add the hive-contrib jar to the class path, e.g. with the {{add jar}} command.

  was:
By default, Hive allows only a single character as the field delimiter. Although RegexSerDe can handle a multiple-character delimiter, it can be daunting to use, especially for beginners.

The patch adds a new SerDe named MultiDelimitSerDe. With MultiDelimitSerDe, users can specify a multiple-character field delimiter when creating tables, in a way that closely resembles a typical table creation.
For example:

> Use multiple-characters as field delimiter
> ------------------------------------------
>
>                 Key: HIVE-5871
>                 URL: https://issues.apache.org/jira/browse/HIVE-5871
>             Project: Hive
>          Issue Type: Improvement
>          Components: Contrib
>    Affects Versions: 0.12.0
>            Reporter: Rui Li
>            Assignee: Rui Li
>              Labels: TODOC14
>             Fix For: 0.14.0
>
>         Attachments: HIVE-5871.2.patch, HIVE-5871.3.patch, HIVE-5871.4.patch,
> HIVE-5871.5.patch, HIVE-5871.6.patch, HIVE-5871.patch
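
To make the delimiter roles in the description concrete, here is a minimal Python sketch of how one text row could be split using the three delimiters from the example DDL. It only illustrates the semantics described in the issue and is not the actual MultiDelimitSerDe implementation. The function name {{parse_row}} and the sample row are hypothetical, and array items are kept as plain strings rather than binary for simplicity.

```python
def parse_row(line, field_delim, collection_delim, mapkey_delim):
    """Split one text row into (id, array, map), mirroring table `test`.

    Illustrative only: the real SerDe is a Java class in hive-contrib.
    """
    # field.delim may be several characters long; str.split matches it literally.
    id_field, array_field, map_field = line.split(field_delim)

    # collection.delim separates array items (and also map entries).
    items = array_field.split(collection_delim) if array_field else []

    # mapkey.delim separates the key from the value inside each map entry.
    mapping = {}
    if map_field:
        for entry in map_field.split(collection_delim):
            key, value = entry.split(mapkey_delim, 1)
            mapping[key] = int(value)  # hivemap is map<string,int>

    return id_field, items, mapping


# Hypothetical data row using the delimiters from the example DDL.
row = "1[,]a:b:c[,]x@10:y@20"
print(parse_row(row, "[,]", ":", "@"))
# prints ('1', ['a', 'b', 'c'], {'x': 10, 'y': 20})
```

Note that the three-character field delimiter {{[,]}} is matched as a literal string here, which is the behaviour the issue asks for; the sketch does not attempt to model regular-expression metacharacters or escaping.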