-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62360/#review185534
-----------------------------------------------------------




ql/src/java/org/apache/hadoop/hive/ql/parse/repl/CopyUtils.java
Lines 73 (patched)
<https://reviews.apache.org/r/62360/#comment261851>

    The evaluation of whether to do a regular copy or distcp can be done in the 
innermost function call; that avoids passing an extra variable down from the 
top when it can be evaluated later.
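
    A minimal sketch of what I mean (names and the size-threshold heuristic 
are hypothetical, just to illustrate the shape — the real decision logic in 
CopyUtils may differ):

```java
import java.util.List;

// Sketch: instead of threading a "useDistCp" flag down from the caller,
// the innermost copy routine decides for itself from what it can see
// locally (a total-size threshold stands in for the real heuristic).
public class CopySketch {

    static final long DISTCP_THRESHOLD_BYTES = 32L * 1024 * 1024;

    // Innermost call: evaluates regular-copy vs distcp on its own.
    static String doCopy(List<Long> fileSizes) {
        long total = fileSizes.stream().mapToLong(Long::longValue).sum();
        boolean useDistCp = total > DISTCP_THRESHOLD_BYTES; // decided here, not passed in
        return useDistCp ? "distcp" : "regularCopy";
    }

    public static void main(String[] args) {
        System.out.println(doCopy(List.of(1024L)));
        System.out.println(doCopy(List.of(64L * 1024 * 1024)));
    }
}
```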



ql/src/java/org/apache/hadoop/hive/ql/parse/repl/CopyUtils.java
Lines 92 (patched)
<https://reviews.apache.org/r/62360/#comment261852>

    I think eventually we have to move to a model of comparing the checksum on 
sourceFS vs destinationFS as you have done here. However, certain FS 
configurations change the value of the checksum, and unless we can guarantee 
that we calculate the checksum by actually reading the data, this might lead 
to more failures.
    
    I thought the idea for now was:
    
    1. get the checksum of the file on sourceFS before the copy
    2. do the copy
    3. get the checksum of the file on sourceFS again
    4. compare the checksums from 1 and 3; if the value has not changed, then 
it would not have changed during our copy either.
    
    Until we can figure out the actual solution to this, falling back to doing 
the check on sourceFS might be the way to go.
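
    The four steps above could be factored roughly like this (a sketch only — 
`Supplier<String>` stands in for a real FileSystem checksum call, and the 
names are made up):

```java
import java.util.function.Supplier;

// Sketch of the source-side checksum-stability check: read the source
// checksum before and after the copy, and trust the copy only if the
// source did not change underneath it.
public class ChecksumGuard {

    static boolean copyWithSourceChecksumCheck(Supplier<String> sourceChecksum,
                                               Runnable copy) {
        String before = sourceChecksum.get();  // step 1: checksum on sourceFS
        copy.run();                            // step 2: the copy itself
        String after = sourceChecksum.get();   // step 3: checksum on sourceFS again
        return before.equals(after);           // step 4: unchanged => copy saw stable data
    }

    public static void main(String[] args) {
        // Source unchanged across the copy: the copy is accepted.
        boolean ok = copyWithSourceChecksumCheck(() -> "abc123", () -> { });
        System.out.println(ok);
    }
}
```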



ql/src/java/org/apache/hadoop/hive/ql/parse/repl/CopyUtils.java
Lines 95 (patched)
<https://reviews.apache.org/r/62360/#comment261853>

    The same problem exists here: if our cleaner thread on CM runs sooner than 
expected, we will miss data. It might be better to fail in case the file is 
not in CM.



ql/src/java/org/apache/hadoop/hive/ql/parse/repl/CopyUtils.java
Lines 116 (patched)
<https://reviews.apache.org/r/62360/#comment261850>

    As part of doing the copy, if the copy fails with a FileNotFoundException 
for a file's actual location on HDFS, then we should retry with the 
corresponding CM root path for that file, since it was moved while we were in 
the process of doing the copy.
    
    Also, if this happens for a CM root file, then there is an issue in our 
configuration: the CM root FS is being cleaned before the copy is done, and we 
should log this as an error, since the cleaner thread for the CM root is not 
configured for the right interval. I would rather fail the repl load than just 
log the error; otherwise we might not know how many such instances happen 
before we realize that replication is broken.
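
    Roughly, the retry policy I have in mind (a sketch with hypothetical 
names — `CopyFn` stands in for the real copy call in CopyUtils):

```java
import java.io.FileNotFoundException;

// Sketch: on a FileNotFoundException at the original location, retry once
// from the CM root path; if even the CM root copy fails, fail the repl
// load loudly instead of just logging.
public class CmRetrySketch {

    interface CopyFn {
        void copy(String path) throws FileNotFoundException;
    }

    static void copyWithCmFallback(CopyFn fn, String actualPath, String cmRootPath) {
        try {
            fn.copy(actualPath);
        } catch (FileNotFoundException e) {
            // File was moved to CM while we were copying: retry from CM root.
            try {
                fn.copy(cmRootPath);
            } catch (FileNotFoundException e2) {
                // CM root already cleaned: the cleaner interval is misconfigured.
                // Fail the load rather than silently logging and moving on.
                throw new IllegalStateException(
                        "file missing from CM root; cleaner ran too early: " + cmRootPath, e2);
            }
        }
    }
}
```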


- anishek


On Sept. 15, 2017, 6:10 p.m., Daniel Dai wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/62360/
> -----------------------------------------------------------
> 
> (Updated Sept. 15, 2017, 6:10 p.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> See HIVE-16898
> 
> 
> Diffs
> -----
> 
>   metastore/src/java/org/apache/hadoop/hive/metastore/ReplChangeManager.java 
> 88d6a7a 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/ReplCopyTask.java 54746d3 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/repl/CopyUtils.java 28e7bcb 
> 
> 
> Diff: https://reviews.apache.org/r/62360/diff/1/
> 
> 
> Testing
> -------
> 
> Manually test it with debugger: setup a breakpoint right before copy, and 
> drop table in another session.
> 
> 
> Thanks,
> 
> Daniel Dai
> 
>
