Zoltan Borok-Nagy has uploaded this change for review.
( http://gerrit.cloudera.org:8080/20793 )


Change subject: IMPALA-12606: Sporadic failures around 
query_test.test_queries.TestQueries.test_intersect
......................................................................

IMPALA-12606: Sporadic failures around 
query_test.test_queries.TestQueries.test_intersect

test_intersect failed when ASYNC_CODEGEN was enabled. This happened
because the HASH JOIN operator could run a codegened 'ProcessProbeBatch'
together with non-codegened InsertBatch/ProcessBuildBatch on the Builder
side, or vice versa.

Only NULL StringValues were hit by the bug; it turned out that NULLs are
handled differently in the hash table. We have been using the
HashUtil::FNV_SEED number to represent NULL values. This number was
chosen arbitrarily; we just wanted to use some random value.

HashUtil::FNV_SEED was set as both the ptr and the length of the
StringValue that represents a NULL. The problem is that
HashUtil::FNV_SEED is a bit larger than INT_MAX, so it wrote '1' to the
small string indicator bit, and StringValue::len() returned a small
length instead of the value that had been set.

To fix the issue, I introduced NULL_NUMBER to replace FNV_SEED as the
NULL marker. Its value is FNV_SEED / 2. Most importantly, it is less
than INT_MAX.
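
The arithmetic behind the bug and the fix can be sketched as follows.
This is a toy model only, not the code in hash-table.cc: it assumes
FNV_SEED is the standard 32-bit FNV offset basis (0x811C9DC5), that the
top bit of a 32-bit length field acts as the small string indicator, and
that the low 7 bits of the top byte hold the small length. The names
kFnvSeed, kNullNumber, IsSmallEncoded, DecodedLen and CheckSentinels are
hypothetical and chosen for illustration.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// Assumption: FNV_SEED is the standard 32-bit FNV offset basis.
constexpr uint32_t kFnvSeed = 0x811C9DC5u;      // 2166136261
// The replacement marker described above: half of the seed.
constexpr uint32_t kNullNumber = kFnvSeed / 2;  // 1083068130

// Toy model (assumption): the most significant bit of the raw length
// field marks a "small" string.
constexpr uint32_t kSmallBit = 0x80000000u;

constexpr bool IsSmallEncoded(uint32_t raw_len) {
  return (raw_len & kSmallBit) != 0;
}

// Toy model (assumption): a small string keeps its length in the low
// 7 bits of the top byte; otherwise the raw field is the length.
constexpr uint32_t DecodedLen(uint32_t raw_len) {
  return IsSmallEncoded(raw_len) ? ((raw_len >> 24) & 0x7Fu) : raw_len;
}

// The old marker does not fit in a signed 32-bit int; the new one does.
static_assert(kFnvSeed > static_cast<uint32_t>(INT_MAX),
              "FNV_SEED exceeds INT_MAX, so its top bit is set");
static_assert(kNullNumber < static_cast<uint32_t>(INT_MAX),
              "FNV_SEED / 2 stays below INT_MAX, so its top bit is clear");

// Runtime check of the same facts within the toy model.
inline void CheckSentinels() {
  // The old marker is misread as a small string with a short length.
  assert(IsSmallEncoded(kFnvSeed));
  assert(DecodedLen(kFnvSeed) != kFnvSeed);
  // The new marker survives a round trip through the length field.
  assert(!IsSmallEncoded(kNullNumber));
  assert(DecodedLen(kNullNumber) == kNullNumber);
}
```

In this model, any marker below INT_MAX would round-trip unchanged;
halving FNV_SEED is simply one way to get such a value.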

Testing:
 * Executed TestQueries.test_intersect multiple times

Change-Id: I6b855c59808db80fd7ac596ce338fc4c3c9c7667
---
M be/src/exec/hash-table.cc
1 file changed, 23 insertions(+), 20 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/93/20793/1
--
To view, visit http://gerrit.cloudera.org:8080/20793
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I6b855c59808db80fd7ac596ce338fc4c3c9c7667
Gerrit-Change-Number: 20793
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy <[email protected]>
