-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61522/#review186844
-----------------------------------------------------------



Hey Sandish,

Thanks for your contribution. I am pretty sure your work will solve a lot of 
headache for a lot of people.

I'd some findings so I added my comments to the code. Please read them and put 
your comments/ideas to it.

Cheers, Zoli


src/java/org/apache/sqoop/avro/AvroUtil.java
Lines 194 (patched)
<https://reviews.apache.org/r/61522/#comment263683>

    If it is only Parquet file related then what is this change in AvroUtils. I 
haven't checked why it is necessary so please explain why does this change 
needed.



src/java/org/apache/sqoop/avro/AvroUtil.java
Lines 195 (patched)
<https://reviews.apache.org/r/61522/#comment263684>

    If this part is needed then you can use Java Charset to load UTF-8 so it 
won't throw IOException.



src/java/org/apache/sqoop/mapreduce/JdbcExportJob.java
Lines 100 (patched)
<https://reviews.apache.org/r/61522/#comment263688>

    It is called here double time. Is it okay?



src/java/org/apache/sqoop/mapreduce/JdbcExportJob.java
Lines 107 (patched)
<https://reviews.apache.org/r/61522/#comment263690>

    Why do you use double log.warn here? You can merge it to one.



src/java/org/apache/sqoop/mapreduce/JdbcExportJob.java
Lines 111 (patched)
<https://reviews.apache.org/r/61522/#comment263689>

    Why don't use the catch block for this code snippet?



src/java/org/apache/sqoop/mapreduce/JdbcExportJob.java
Lines 120 (patched)
<https://reviews.apache.org/r/61522/#comment263687>

    I think this shouldn't be public. If you use it only inside of this class 
then make it private. If you would like to use it outside of this class then a 
getter would make more sense.
    
    It is also should be somewhere begining of the class because it is a static 
final variable. You can intialize it in a static block.



src/test/com/cloudera/sqoop/TestParquetExport.java
Lines 165 (patched)
<https://reviews.apache.org/r/61522/#comment263678>

    This method is almost copy of the previous one. Please avoid duplicates and 
put the common parts into one method.
    
    And please also change the comment of the method becuase that is the same 
but the two method doing different thing.
    
    fileNum parameter is not used in the method, please remove it.



src/test/com/cloudera/sqoop/TestParquetExport.java
Lines 226 (patched)
<https://reviews.apache.org/r/61522/#comment263680>

    Please do not copy and paste code parts which is already in the class. It 
increases duplication which makes harder to read and maintain later.
    
    testSupportedParquetTypes() method also contains this part of the code. Put 
them into a different method like createParquetTestFileContent()



src/test/com/cloudera/sqoop/TestParquetExport.java
Lines 232 (patched)
<https://reviews.apache.org/r/61522/#comment263679>

    We don't have an official formatter yet but I think a 120 columns length 
row is better than 80.



src/test/com/cloudera/sqoop/TestParquetExport.java
Lines 475 (patched)
<https://reviews.apache.org/r/61522/#comment263681>

    Another duplication



src/test/com/cloudera/sqoop/TestParquetExport.java
Lines 527 (patched)
<https://reviews.apache.org/r/61522/#comment263685>

    Please specify the expected exception here. Your test should have one good 
result only. Pleas avoid unnecessary usage of general exceptions.
    
    Use expected exception instead of try - catch. It makes the code more 
readable.



src/test/com/cloudera/sqoop/TestParquetExport.java
Lines 614 (patched)
<https://reviews.apache.org/r/61522/#comment263686>

    Same here. Please specify the exception and use expected exception instead 
of try - catch blocks.


- Zoltán Tóth


On Aug. 9, 2017, 10:53 a.m., Sandish Kumar HN wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/61522/
> -----------------------------------------------------------
> 
> (Updated Aug. 9, 2017, 10:53 a.m.)
> 
> 
> Review request for Sqoop and Anna Szonyi.
> 
> 
> Bugs: SQOOP-2907
>     https://issues.apache.org/jira/browse/SQOOP-2907
> 
> 
> Repository: sqoop-trunk
> 
> 
> Description
> -------
> 
> Kite currently requires .metadata.
> Parquet files have their own metadata stored along data files.
> It would be great for Export operation on parquet files to RDBMS not to 
> require .metadata.
> We have most of the files created by Spark and Hive, and they don't create 
> .metadata, it only Kite that does.
> It makes sqoop export of parquet files usability very limited.
> 
> 
> Diffs
> -----
> 
>   src/java/org/apache/sqoop/avro/AvroUtil.java ee29f140 
>   src/java/org/apache/sqoop/mapreduce/JdbcExportJob.java 6f9afaf9 
>   src/test/com/cloudera/sqoop/TestParquetExport.java 680fd73b 
> 
> 
> Diff: https://reviews.apache.org/r/61522/diff/1/
> 
> 
> Testing
> -------
> 
> testSupportedParquetTypesForWithoutParquetMeta - done
> testNullableFieldWithoutParquetMeta - done
> testParquetRecordsNotSupportedWithoutParquetMeta -done
> testMissingDatabaseFieldsWithoutParquetMeta - done
> testMissingParquetFieldsWithoutParquetMeta - done
> 
> 
> Thanks,
> 
> Sandish Kumar HN
> 
>

Reply via email to