[ 
https://issues.apache.org/jira/browse/HIVE-3245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

N Campbell updated HIVE-3245:
-----------------------------

    Description: 
Various foreign-language data (e.g. Japanese, Thai) is loaded into string 
columns via tab-delimited text files. A simple projection of the columns in the 
table does not display the correct data. Exporting the data from Hive and 
inspecting the files suggests the data was loaded properly. This appears to be 
an encoding issue in the driver, but I am unaware of any URL connection 
properties regarding encoding that Hive JDBC requires.


create table if not exists CERT.TLJA_JP_E ( RNUM int , C1 string, ORD int)
row format delimited
fields terminated by '\t'
stored as textfile;

create table if not exists CERT.TLJA_JP ( RNUM int , C1 string, ORD int)
stored as sequencefile;

load data local inpath '/home/hadoopadmin/jdbc-cert/CERT/CERT.TLJA_JP.txt'
overwrite into table CERT.TLJA_JP_E;
insert overwrite table CERT.TLJA_JP  select * from CERT.TLJA_JP_E;
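
The symptom described above (files look correct on export, client output does
not) would be consistent with UTF-8 bytes being decoded with a different
charset somewhere on the client side. The sketch below only illustrates that
general failure mode in Python. It does not use or reproduce the Hive JDBC
driver, and the sample row and the `decode_row` helper are hypothetical, not
taken from the attached file.

```python
# Hypothetical sample row in the layout of CERT.TLJA_JP_E: RNUM <tab> C1 <tab> ORD.
# The values are made up for illustration; they are not from the attachment.
row = "1\t日本語\t10\n"
raw = row.encode("utf-8")  # the bytes as they would sit in a UTF-8 text file


def decode_row(data, charset):
    """Decode one tab-delimited line with the given charset and split it into fields."""
    return data.decode(charset).rstrip("\n").split("\t")


good = decode_row(raw, "utf-8")      # charset matches the file
bad = decode_row(raw, "iso-8859-1")  # charset mismatch: one character per byte

print(repr(good[1]))
print(repr(bad[1]))

# The stored bytes are intact: re-encoding the garbled text with the wrong
# charset and decoding it as UTF-8 recovers the original string.
recovered = bad[1].encode("iso-8859-1").decode("utf-8")
print(repr(recovered))
```

If the values returned by the driver can be repaired the same way as
`recovered`, that would suggest the stored data is fine and only the decoding
step is wrong. If they cannot, the bytes were probably altered earlier in the
pipeline.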


  was:various foreign language data (i.e. japanese, thai etc) is loaded into 
string columns via tab delimited text files. A simple projection of the columns 
in the table is not displaying the correct data. Exporting the data from Hive 
and looking at the files implies the data is loaded properly. it appears to be 
an encoding issue at the driver but unaware of any required URL connection 
properties re encoding that Hive JDBC requires

    
> UTF encoded data not displayed correctly by Hive driver
> -------------------------------------------------------
>
>                 Key: HIVE-3245
>                 URL: https://issues.apache.org/jira/browse/HIVE-3245
>             Project: Hive
>          Issue Type: Bug
>          Components: JDBC
>    Affects Versions: 0.8.0
>            Reporter: N Campbell
>         Attachments: CERT.TLJA.txt
>
>
> Various foreign-language data (e.g. Japanese, Thai) is loaded into string 
> columns via tab-delimited text files. A simple projection of the columns in 
> the table does not display the correct data. Exporting the data from Hive 
> and inspecting the files suggests the data was loaded properly. This appears 
> to be an encoding issue in the driver, but I am unaware of any URL connection 
> properties regarding encoding that Hive JDBC requires.
> create table if not exists CERT.TLJA_JP_E ( RNUM int , C1 string, ORD int)
> row format delimited
> fields terminated by '\t'
> stored as textfile;
> create table if not exists CERT.TLJA_JP ( RNUM int , C1 string, ORD int)
> stored as sequencefile;
> load data local inpath '/home/hadoopadmin/jdbc-cert/CERT/CERT.TLJA_JP.txt'
> overwrite into table CERT.TLJA_JP_E;
> insert overwrite table CERT.TLJA_JP  select * from CERT.TLJA_JP_E;


        
