Hi all,

I've been working with Hive for some time.

At my company, we use Hive to query large datasets, and we have found it
very easy to use.

However, we also found that Hive lacks support for character sets other
than UTF-8, so we have to manually convert data files to UTF-8 before
loading them into Hive.
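
For reference, the manual conversion step can be as small as the following
Python sketch. It assumes the source files are GBK-encoded; the function
name and the file names are made up for illustration and are not part of
Hive or of the patch.

```python
def transcode_to_utf8(src, dst, src_charset="gbk"):
    """Rewrite the text file at src as UTF-8 at dst, decoding it as src_charset."""
    # newline="" on both sides keeps the original line endings untouched.
    with open(src, "r", encoding=src_charset, newline="") as fin, \
         open(dst, "w", encoding="utf-8", newline="") as fout:
        # Stream line by line so large data files do not need to fit in memory.
        for line in fin:
            fout.write(line)
```

A call such as transcode_to_utf8("data_gbk.txt", "data_utf8.txt") (hypothetical
file names) would produce a file that can be loaded as usual.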

So I have written a patch that lets Hive accept a charset when a table is
created. The SerDe then uses the charset property when it serializes or
deserializes data.
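
To illustrate the idea only (this is not the patch itself, which lives in
Hive's Java code): a charset-aware SerDe conceptually decodes each raw row
with the table's charset on the way in and encodes with the same charset on
the way out. A rough Python sketch, with hypothetical function names and
assuming tab-delimited rows:

```python
def deserialize_row(raw, charset="gbk", delimiter="\t"):
    """Decode one raw line using the table's charset and split it into fields."""
    return raw.decode(charset).rstrip("\n").split(delimiter)


def serialize_row(fields, charset="gbk", delimiter="\t"):
    """Join the fields and encode them back into the table's charset."""
    return (delimiter.join(fields) + "\n").encode(charset)
```

With this shape, the same bytes come back out after a round trip as long as
both directions use the charset declared on the table.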

With the patch, the HQL looks like this:

CREATE TABLE tbl1 (col1 STRING) ROW FORMAT CHARSET "GBK" DELIMITED FIELDS
TERMINATED BY '\t';

I'd be very happy to contribute this to the community, and I look forward
to your feedback.

Thanks,
Kai Zhang
