> On Nov. 23, 2014, 10:59 p.m., Mohit Sabharwal wrote:
> > data/files/parquet_types.txt, lines 1-3
> > <https://reviews.apache.org/r/28147/diff/3/?file=772138#file772138line1>
> >
> >     I think this is a bit confusing, since the 0b prefix gives the impression 
> > that data is read in binary format, whereas it is actually getting read as 
> > a string.
> >     
> >     I think we can either write (preferably non-ascii) binary data instead 
> > (for example, see: data/files/string.txt) OR alternatively, we could write 
> > it legibly in hex, like 68656c6c6f ("hello") and convert it to binary using 
> > unhex() in the INSERT OVERWRITE query. What do you think?

I encoded some Chinese (non-ASCII) words and used the hex function to convert 
them into a string such as B4F3CAFDBEDD.
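
For illustration only, the hex round trip discussed above can be sketched in 
plain Python. This is not part of the patch and does not call Hive; the helper 
names unhex and to_hex are hypothetical stand-ins that are assumed to mirror 
what Hive's unhex() and hex() do for this data.

```python
# Illustrative sketch in plain Python, not Hive code.
# unhex/to_hex are hypothetical helpers named after the Hive UDFs.

def unhex(text: str) -> bytes:
    """Decode a hex string into the raw bytes it represents."""
    return bytes.fromhex(text)


def to_hex(data: bytes) -> str:
    """Encode raw bytes as an uppercase hex string."""
    return data.hex().upper()


# The reviewer's legible ASCII example.
ascii_bytes = unhex("68656c6c6f")

# The hex string from the reply; every byte falls outside the ASCII range.
non_ascii_bytes = unhex("B4F3CAFDBEDD")

print(ascii_bytes)
print(len(non_ascii_bytes))
print(to_hex(non_ascii_bytes))
```

Storing the hex text in the data file keeps it readable in a diff, while the 
decoded bytes are what would end up in the binary column.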


> On Nov. 23, 2014, 10:59 p.m., Mohit Sabharwal wrote:
> > ql/src/test/queries/clientpositive/parquet_types.q, line 48
> > <https://reviews.apache.org/r/28147/diff/3/?file=772143#file772143line48>
> >
> >     No need to unhex here...
> >     
> >     Can just be:
> >     
> >      SELECT cchar, LENGTH(cchar), cvarchar, LENGTH(cvarchar), cbinary FROM 
> > parquet_types
> >      
> >     Or you can pass it through hex() if original data has unprintable 
> > characters:
> >     
> >      SELECT cchar, LENGTH(cchar), cvarchar, LENGTH(cvarchar), hex(cbinary) 
> > FROM parquet_types

I think the statement "SELECT cint, ctinyint, csmallint, cfloat, cdouble, 
cstring1, t, cchar, cvarchar, hex(cbinary), m1, l1, st1 FROM parquet_types;" 
already covers the binary case, so there is no need to check cbinary 
again in this query.


- cheng


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28147/#review62744
-----------------------------------------------------------


On Nov. 21, 2014, 8:53 a.m., cheng xu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/28147/
> -----------------------------------------------------------
> 
> (Updated Nov. 21, 2014, 8:53 a.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> This patch includes:
> 1. binary support for ParquetHiveSerde
> 2. related test cases both in unit and ql test
> 
> 
> Diffs
> -----
> 
>   data/files/parquet_types.txt d342062 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/HiveSchemaConverter.java
>  472de8f 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ArrayWritableObjectInspector.java
>  d5aae3b 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetHiveSerDe.java 
> 4effe73 
>   ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestParquetSerDe.java 
> 8ac7864 
>   ql/src/test/queries/clientpositive/parquet_types.q 22585c3 
>   ql/src/test/results/clientpositive/parquet_types.q.out 275897c 
> 
> Diff: https://reviews.apache.org/r/28147/diff/
> 
> 
> Testing
> -------
> 
> related UT and QL tests passed
> 
> 
> Thanks,
> 
> cheng xu
> 
>
