Arnab Karmakar has uploaded this change for review. ( http://gerrit.cloudera.org:8080/24068 )

Change subject: IMPALA-13122 addendum: Fix host statistics logging for erasure coded files
......................................................................

IMPALA-13122 addendum: Fix host statistics logging for erasure coded files

When erasure coding is enabled, disk IDs are unavailable (set to -1) for
EC blocks. The previous implementation only tracked hosts via host:disk
pairs, requiring valid disk IDs. This caused host statistics to be
missing from logs in EC environments.

Fixed by tracking host indices separately from host:disk pairs:
- Added uniqueHostIndices set to FileMetadataStats
- Track all host indices regardless of disk ID availability
- Host:disk pairs still tracked only when disk IDs are valid (>= 0)
- Updated getNumUniqueHosts() to use uniqueHostIndices directly

With this fix:
- Traditional replication: Both hosts and host:disk pairs are logged
- Erasure coding: Hosts are logged, host:disk pairs may be 0 or omitted

Testing:
- Updated JUnit test assertion: hosts >= pairs (was hosts <= pairs)
- All tests pass with and without erasure coding

Change-Id: Ie6f5b70fa9c46dd3f34287f030553360da6b20c6
---
M fe/src/main/java/org/apache/impala/catalog/FeFsTable.java
M fe/src/test/java/org/apache/impala/catalog/FileMetadataLoaderTest.java
2 files changed, 14 insertions(+), 10 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/68/24068/1
--
To view, visit http://gerrit.cloudera.org:8080/24068
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ie6f5b70fa9c46dd3f34287f030553360da6b20c6
Gerrit-Change-Number: 24068
Gerrit-PatchSet: 1
Gerrit-Owner: Arnab Karmakar <[email protected]>
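
For reviewers, the tracking change described in the commit message can be sketched roughly as below. This is an illustrative sketch only, not the code in FeFsTable.java: the class name HostStatsSketch, the addReplica() method, getNumUniqueHostDiskPairs(), and the long-packed pair encoding are all assumptions made for the example. Only uniqueHostIndices, getNumUniqueHosts(), and the -1 sentinel for unavailable disk IDs come from the description above.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical stand-in for the stats holder; not Impala's actual class. */
class HostStatsSketch {
  // Index of every host that holds at least one block replica.
  private final Set<Integer> uniqueHostIndices = new HashSet<>();
  // (host index, disk ID) pairs, recorded only when the disk ID is known.
  private final Set<Long> uniqueHostDiskPairs = new HashSet<>();

  /** Records one replica; diskId is -1 when unavailable (e.g. EC blocks). */
  void addReplica(int hostIndex, int diskId) {
    // Always count the host, regardless of disk ID availability.
    uniqueHostIndices.add(hostIndex);
    if (diskId >= 0) {
      // Assumed encoding: host index in the high 32 bits, disk ID in the low.
      uniqueHostDiskPairs.add(((long) hostIndex << 32) | diskId);
    }
  }

  /** Reads the host count directly from the host index set. */
  int getNumUniqueHosts() {
    return uniqueHostIndices.size();
  }

  int getNumUniqueHostDiskPairs() {
    return uniqueHostDiskPairs.size();
  }
}
```

In this sketch, replicas with diskId == -1 still contribute to getNumUniqueHosts() while leaving the host:disk pair count untouched, which mirrors the erasure coding behavior described in the "With this fix" list.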
