[ https://issues.apache.org/jira/browse/HIVE-2917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kai Zhang updated HIVE-2917:
----------------------------

    Attachment: HIVE-2917.2.patch.txt

Fixed a mistake in the last patch.

> Add support for various charsets in LazySimpleSerDe
> ---------------------------------------------------
>
>                 Key: HIVE-2917
>                 URL: https://issues.apache.org/jira/browse/HIVE-2917
>             Project: Hive
>          Issue Type: New Feature
>          Components: CLI, Serializers/Deserializers
>    Affects Versions: 0.9.0
>            Reporter: Kai Zhang
>         Attachments: HIVE-2917.1.patch.txt, HIVE-2917.2.patch.txt
>
>
> Currently Hive can only serialize/deserialize data encoded in UTF-8.
> It would be useful to specify the data's charset when creating the table.
> The idea is to add a new keyword CHARSET to set the charset at table level.
> For example:
> CREATE TABLE tbl1 (col1 STRING) ROW FORMAT CHARSET "GBK" DELIMITED FIELDS
> TERMINATED BY '\t';
> Another place to use CHARSET is in the TRANSFORM clause.
> For example:
> SELECT TRANSFORM(col1, col2) ROW FORMAT CHARSET 'gbk'
> USING 'some_script'
> AS (col3, col4) ROW FORMAT CHARSET 'utf-8';
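
For illustration only, the charset handling the proposal describes can be sketched in plain Java: decode a field's raw bytes with the table's declared charset on read, and encode with it on write. This is not the code in the attached patches. The class name CharsetSketch and its helper methods are hypothetical, and only the standard java.nio.charset API is assumed. The example uses ISO-8859-1 because every Java runtime is required to support it, whereas a charset such as GBK depends on the JDK's extended charsets being installed.

```java
import java.nio.charset.Charset;

public class CharsetSketch {
    // Hypothetical helper: decode raw field bytes using the charset
    // declared for the table (what a deserializer would do on read).
    static String decode(byte[] raw, String charsetName) {
        return new String(raw, Charset.forName(charsetName));
    }

    // Hypothetical helper: encode a value into the declared charset
    // (what a serializer would do on write).
    static byte[] encode(String value, String charsetName) {
        return value.getBytes(Charset.forName(charsetName));
    }

    // Re-encode bytes from one charset to another, similar in spirit to a
    // TRANSFORM whose input and output declare different charsets.
    static byte[] transcode(byte[] raw, String from, String to) {
        return encode(decode(raw, from), to);
    }

    public static void main(String[] args) {
        // 0xE9 is U+00E9 (e with acute accent) in ISO-8859-1.
        byte[] latin1 = {(byte) 0xE9};
        byte[] utf8 = transcode(latin1, "ISO-8859-1", "UTF-8");
        // The same character takes two bytes in UTF-8 (0xC3 0xA9).
        System.out.println(utf8.length);
    }
}
```

Charset.forName throws UnsupportedCharsetException for a name the runtime does not know, so an implementation along these lines would probably want to validate the charset name when the table is created rather than at read time.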