[ https://issues.apache.org/jira/browse/IMPALA-13894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yida Wu resolved IMPALA-13894.
------------------------------
    Fix Version/s: Impala 5.0.0
       Resolution: Fixed

> Tuple cache correctness verification should proceed past file size differences
> ------------------------------------------------------------------------------
>
>                 Key: IMPALA-13894
>                 URL: https://issues.apache.org/jira/browse/IMPALA-13894
>             Project: IMPALA
>          Issue Type: Task
>          Components: Backend
>    Affects Versions: Impala 5.0.0
>            Reporter: Joe McDonnell
>            Assignee: Yida Wu
>            Priority: Major
>             Fix For: Impala 5.0.0
>
>
> Tuple cache correctness verification does a fast check to see if the two
> files are identical. If it determines that they are not identical, then it
> can proceed to a slow check that corrects for order differences.
> This fast check looks at the file sizes and if they are not the same, it
> returns a not-OK status:
> {noformat}
> if (file1_length != file2_length || file1_length == TUPLE_TEXT_FILE_SIZE_ERROR) {
>   return Status(TErrorCode::TUPLE_CACHE_INCONSISTENCY,
>       Substitute("Size of file '$0' (size: $1) and '$2' (size: $3) are different",
>           path_a + DEBUG_TUPLE_CACHE_BAD_POSTFIX, file1_length,
>           path_b + DEBUG_TUPLE_CACHE_BAD_POSTFIX, file2_length));
> }{noformat}
> Returning not-OK status actually causes the calling code to skip the slow
> check that can give more detail about what is different. We should change
> this to set *passed = false and let the slower check go forward so that it
> produces a more interesting error message. It's also unclear whether the same
> rows in a different order would always have the same size.
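
The control flow proposed in the description can be sketched as below. This is a minimal, self-contained illustration and not the actual Impala change: SketchStatus, FastCheck, SlowCheck, SortedRows and Verify are hypothetical names, the "files" are in-memory strings with one row per line, and the slow check simply sorts rows before comparing. The point it shows is that the fast check reports a mismatch through *passed and still returns OK, so the caller reaches the slow check.

```cpp
#include <algorithm>
#include <cassert>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a status type; not Impala's Status class.
struct SketchStatus {
  bool ok;
  std::string msg;
  static SketchStatus OK() { return {true, ""}; }
  static SketchStatus Error(std::string m) { return {false, std::move(m)}; }
};

// Fast check: a size or content mismatch is reported through *passed,
// not through a not-OK status, so the caller can continue to the slow check.
SketchStatus FastCheck(const std::string& a, const std::string& b, bool* passed) {
  assert(passed != nullptr);
  *passed = (a.size() == b.size()) && (a == b);
  return SketchStatus::OK();
}

// Splits the text into rows (one per line) and sorts them, which removes
// any dependence on row order.
std::vector<std::string> SortedRows(const std::string& text) {
  std::vector<std::string> rows;
  std::istringstream in(text);
  for (std::string row; std::getline(in, row);) rows.push_back(row);
  std::sort(rows.begin(), rows.end());
  return rows;
}

// Slow check: order-insensitive comparison that produces a more detailed
// message than the size comparison alone.
SketchStatus SlowCheck(const std::string& a, const std::string& b) {
  std::vector<std::string> rows_a = SortedRows(a);
  std::vector<std::string> rows_b = SortedRows(b);
  if (rows_a == rows_b) return SketchStatus::OK();
  return SketchStatus::Error("row mismatch after sorting: " +
      std::to_string(rows_a.size()) + " vs " + std::to_string(rows_b.size()) +
      " rows");
}

// Caller: only a genuine failure of the fast check itself aborts early;
// a plain mismatch falls through to the slow check.
SketchStatus Verify(const std::string& a, const std::string& b) {
  bool passed = false;
  SketchStatus status = FastCheck(a, b, &passed);
  if (!status.ok) return status;
  if (passed) return SketchStatus::OK();
  return SlowCheck(a, b);
}
```

With this shape, Verify("a\nb\n", "b\na\n") succeeds because the slow check ignores row order, while Verify("a\nb\n", "a\n") fails with the slow check's message instead of a bare size complaint.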