[ https://issues.apache.org/jira/browse/IMPALA-13894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yida Wu resolved IMPALA-13894.
------------------------------
    Fix Version/s: Impala 5.0.0
       Resolution: Fixed

> Tuple cache correctness verification should proceed past file size differences
> ------------------------------------------------------------------------------
>
>                 Key: IMPALA-13894
>                 URL: https://issues.apache.org/jira/browse/IMPALA-13894
>             Project: IMPALA
>          Issue Type: Task
>          Components: Backend
>    Affects Versions: Impala 5.0.0
>            Reporter: Joe McDonnell
>            Assignee: Yida Wu
>            Priority: Major
>             Fix For: Impala 5.0.0
>
>
> Tuple cache correctness verification first runs a fast check to see whether
> the two files are identical. If they are not identical, it can proceed to a
> slow check that corrects for ordering differences.
> The fast check compares the file sizes and, if they differ, returns a
> not-OK status:
> {noformat}
>   if (file1_length != file2_length || file1_length == TUPLE_TEXT_FILE_SIZE_ERROR) {
>     return Status(TErrorCode::TUPLE_CACHE_INCONSISTENCY,
>         Substitute("Size of file '$0' (size: $1) and '$2' (size: $3) are different",
>             path_a + DEBUG_TUPLE_CACHE_BAD_POSTFIX, file1_length,
>             path_b + DEBUG_TUPLE_CACHE_BAD_POSTFIX, file2_length));
>   }
> {noformat}
> Returning a not-OK status causes the calling code to skip the slow check,
> which could give more detail about what differs. We should instead set
> *passed = false and let the slow check run so that it produces a more
> informative error message. It is also unclear whether the same rows in a
> different order would always have the same size.
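
The control flow proposed in the description can be sketched roughly as follows. This is an illustration only, not Impala's actual code: SketchStatus, FastCompare, SlowCompare and Verify are hypothetical names, and the sketch compares in-memory strings rather than files on disk. The point is that a size mismatch is reported through *passed while the returned status stays OK, so the caller still reaches the order-insensitive slow check.

```cpp
#include <algorithm>
#include <cstddef>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a status type; not Impala's Status class.
struct SketchStatus {
  bool ok;
  std::string msg;
  static SketchStatus OK() { return {true, ""}; }
  static SketchStatus Error(std::string m) { return {false, std::move(m)}; }
};

// Fast check: exact comparison. A mismatch (including a size mismatch) only
// sets *passed = false; the returned status stays OK so the caller continues.
SketchStatus FastCompare(const std::string& a, const std::string& b, bool* passed) {
  *passed = (a.size() == b.size()) && (a == b);
  return SketchStatus::OK();
}

// Splits text into rows (one per line) and sorts them to ignore row order.
std::vector<std::string> SortedRows(const std::string& text) {
  std::vector<std::string> rows;
  std::istringstream in(text);
  std::string line;
  while (std::getline(in, line)) rows.push_back(line);
  std::sort(rows.begin(), rows.end());
  return rows;
}

// Slow check: order-insensitive comparison that reports what differs.
SketchStatus SlowCompare(const std::string& a, const std::string& b) {
  std::vector<std::string> rows_a = SortedRows(a);
  std::vector<std::string> rows_b = SortedRows(b);
  if (rows_a.size() != rows_b.size()) {
    return SketchStatus::Error("row count differs: " +
        std::to_string(rows_a.size()) + " vs " + std::to_string(rows_b.size()));
  }
  for (std::size_t i = 0; i < rows_a.size(); ++i) {
    if (rows_a[i] != rows_b[i]) {
      return SketchStatus::Error("first differing row after sorting: '" +
          rows_a[i] + "' vs '" + rows_b[i] + "'");
    }
  }
  return SketchStatus::OK();
}

// Caller: only a genuine error from the fast check short-circuits; a failed
// fast check falls through to the slow check for a more detailed message.
SketchStatus Verify(const std::string& a, const std::string& b) {
  bool passed = false;
  SketchStatus status = FastCompare(a, b, &passed);
  if (!status.ok) return status;
  if (passed) return SketchStatus::OK();
  return SlowCompare(a, b);
}
```

With this shape, two inputs of different sizes fail the fast check but still get a row-level message from the slow check, and inputs holding the same rows in a different order pass.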



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
