[ 
https://issues.apache.org/jira/browse/IMPALA-14110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17956457#comment-17956457
 ] 

ASF subversion and git services commented on IMPALA-14110:
----------------------------------------------------------

Commit a056808bc2b6d22c88022321cc5e007dbea01036 in impala's branch 
refs/heads/master from Xuebin Su
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=a056808bc ]

IMPALA-14110: Avoid decoding values for counting columns

A counting column has a null slot descriptor and an uninitialized data
decoder. Trying to decode its values while skipping them therefore
triggers a check failure.

This patch fixes the issue by returning early when skipping values if
the current column is a counting column, so that no value is ever
decoded.
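
The early-return guard described above can be illustrated with a minimal,
self-contained sketch. This is not the actual Impala code: ToyDecoder,
ToySlotDesc, ToyColumnReader and their methods are hypothetical stand-ins
invented for illustration, and an exception is used where the real code
would hit a DCHECK. The only facts taken from the commit message are that
a counting column has a null slot descriptor, that its data decoder is
not initialized, and that skipping now returns before any decoding.

```cpp
#include <cstdint>
#include <stdexcept>

// Hypothetical stand-in for a value decoder (not Impala's RleBatchDecoder).
// It starts uninitialized, signalled by a negative bit width.
class ToyDecoder {
 public:
  void Init(int bit_width) { bit_width_ = bit_width; }

  // Pretends to skip n values. Throws if the decoder was never initialized,
  // which plays the role of the failed check in this sketch.
  int64_t SkipValues(int64_t n) {
    if (bit_width_ < 0) throw std::logic_error("decoder must be initialised");
    return n;
  }

 private:
  int bit_width_ = -1;
};

// Hypothetical slot descriptor; only its presence or absence matters here.
struct ToySlotDesc {};

// Hypothetical column reader. A null slot descriptor marks a counting
// column, whose decoder is deliberately left uninitialized.
class ToyColumnReader {
 public:
  explicit ToyColumnReader(const ToySlotDesc* slot_desc)
      : slot_desc_(slot_desc) {
    if (slot_desc_ != nullptr) decoder_.Init(/*bit_width=*/8);
  }

  bool IsCountingColumn() const { return slot_desc_ == nullptr; }

  // Returns true if n values were skipped (or did not need decoding).
  bool SkipEncodedValues(int64_t n) {
    // Early return: a counting column never touches its decoder.
    if (IsCountingColumn()) return true;
    return decoder_.SkipValues(n) == n;
  }

 private:
  const ToySlotDesc* slot_desc_;
  ToyDecoder decoder_;
};
```

In this sketch, ToyColumnReader(nullptr).SkipEncodedValues(100) succeeds
without touching the decoder. Without the IsCountingColumn() guard, the
same call would reach ToyDecoder::SkipValues() and throw.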

Testing:
- Passed TestZippingUnnest in exhaustive mode.
- Added test cases to make sure that page filtering works for counting
  columns.

Change-Id: Ia707335c50cc0653097f375aae3f10609e0eb091
Reviewed-on: http://gerrit.cloudera.org:8080/22974
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>


> Hit DCHECK: bit_width_ >= 0 (-1 vs. 0) RleBatchDecoder must be initialised
> --------------------------------------------------------------------------
>
>                 Key: IMPALA-14110
>                 URL: https://issues.apache.org/jira/browse/IMPALA-14110
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>    Affects Versions: Impala 5.0.0
>            Reporter: Joe McDonnell
>            Assignee: Xuebin Su
>            Priority: Blocker
>
> In a custom core job running end-to-end tests, I ran into a crash due to 
> hitting "Check failed: bit_width_ >= 0 (-1 vs. 0) RleBatchDecoder must be 
> initialised". Here is the stack trace:
>  
> {noformat}
> Thread 896 (crashed)
>  0  libc.so.6!__GI_raise + 0x10f
>  1  libc.so.6!__GI_abort + 0x127
>  2  impalad!google::DumpStackTraceAndExit [utilities.cc : 178 + 0x5]
>  3  impalad!google::LogMessage::Flush() [logging.cc : 1799 + 0x2]
>  4  libstdc++.so.6!std::basic_ostream<char, std::char_traits<char> >& 
> std::__ostream_insert<char, std::char_traits<char> >(std::basic_ostream<char, 
> std::char_traits<char> >&, char const*, long) [streambuf : 458 + 0xd]
>  5  impalad!impala::RleBatchDecoder<unsigned int>::NextCounts() 
> [rle-encoding.h : 665 + 0x8]
>  6  impalad!impala::BaseScalarColumnReader::StartPageFiltering() 
> [parquet-column-readers.cc : 1303 + 0x3]
>  7  impalad!impala::RleBatchDecoder<unsigned int>::SkipValues(int) 
> [rle-encoding.h : 512 + 0x8]
>  8  
> impalad!impala::BaseScalarColumnReader::InitDataPageDecoders(impala::ParquetColumnChunkReader::DataPageInfo
>  const&) [parquet-column-readers.cc : 1234 + 0x1a]
>  9  impalad!impala::ParquetLevelDecoder::CacheNextBatch(int) 
> [parquet-level-decoder.cc : 111 + 0x5]
> 10  impalad!impala::ScalarColumnReader<signed char, (parquet::Type::type)1, 
> false>::SkipEncodedValuesInPage(long) [dict-encoding.h : 589 + 0xc]
> 11  impalad!impala::BaseScalarColumnReader::ReadDataPage() 
> [parquet-column-readers.cc : 1174 + 0x12]
> 12  impalad!bool impala::BaseScalarColumnReader::SkipTopLevelRows<true>(long, 
> long*) [parquet-column-readers.cc : 1408 + 0xf]
> 13  libc.so.6!__clock_gettime_2 + 0x2a
> 14  impalad!impala::BaseScalarColumnReader::NextPage() [stopwatch.h : 162 + 
> 0xc]
> 15  impalad!bool impala::BaseScalarColumnReader::SkipRowsInternal<true>(long, 
> long) [parquet-column-readers.cc : 1647 + 0x12]
> 16  impalad!impala::BaseScalarColumnReader::SkipRows(long, long) 
> [parquet-column-readers.h : 613 + 0xe]
> 17  impalad!impala::CollectionColumnReader::SkipRows(long, long) 
> [parquet-collection-column-reader.cc : 179 + 0x9]
> 18  
> impalad!impala::HdfsParquetScanner::FillScratchMicroBatches(std::vector<impala::ParquetColumnReader*,
>  std::allocator<impala::ParquetColumnReader*> > const&, impala::RowBatch*, 
> bool*, impala::ScratchMicroBatch const*, int, int, int*) 
> [hdfs-parquet-scanner.cc : 2550 + 0x3]
> 19  impalad!impala::Status 
> impala::HdfsParquetScanner::AssembleRows<false>(impala::RowBatch*, bool*) 
> [hdfs-parquet-scanner.cc : 2463 + 0x21]
> 20  impalad!impala::HdfsParquetScanner::GetNextInternal(impala::RowBatch*) 
> [hdfs-parquet-scanner.cc : 564 + 0x19]
> 21  impalad!impala::HdfsParquetScanner::ProcessSplit() 
> [hdfs-parquet-scanner.cc : 451 + 0x17]
> 22  
> impalad!impala::HdfsScanNode::ProcessSplit(std::vector<impala::FilterContext, 
> std::allocator<impala::FilterContext> > const&, impala::MemPool*, 
> impala::io::ScanRange*, long*) [hdfs-scan-node.cc : 504 + 0x7]
> 23  impalad!impala::HdfsScanNode::ScannerThread(bool, long) 
> [hdfs-scan-node.cc : 422 + 0x19]
> {noformat}
>  



