[ https://issues.apache.org/jira/browse/IMPALA-14219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18004885#comment-18004885 ]
Joe McDonnell commented on IMPALA-14219:
----------------------------------------
Example crash stack:
{noformat}
Crash reason: SIGSEGV /SEGV_MAPERR
Crash address: 0x1df
Process uptime: not available

Thread 429 (crashed)
0 impalad!impala::SlotRef::GetBooleanValInterpreted(impala::ScalarExprEvaluator*,
impala::TupleRow const*) const [tuple.h : 288 + 0x0]
rax = 0x00000000000001c2 rdx = 0x000000000000001d
rcx = 0x00000000000001e0 rbx = 0x0000000030cd0400
rsi = 0x0000000031fad2c0 rdi = 0x0000000030cd0400
rbp = 0x00007fea7278eac0 rsp = 0x00007fea7278ea40
r8 = 0x00007feb6c149080 r9 = 0x000000000000327b
r10 = 0x00007feb6c149090 r11 = 0x000000000062054a
r12 = 0x000000004652e000 r13 = 0x0000000000000000
r14 = 0x000000004652e000 r15 = 0x0000000059268c18
rip = 0x000000000215dae2
Found by: given as instruction pointer in context
1 impalad!impala::ScalarExpr::GetBooleanVal(impala::ScalarExprEvaluator*,
impala::TupleRow const*) const [scalar-expr.inline.h : 54 + 0xc]
2 impalad!impala::ScalarExprEvaluator::GetBooleanVal(impala::TupleRow const*)
[scalar-expr-evaluator.cc : 399 + 0x5]
3 impalad!impala::HdfsColumnarScanner::ProcessScratchBatch(impala::RowBatch*)
[scalar-expr-evaluator.h : 173 + 0x8]
4 impalad!impala::HdfsColumnarScanner::ProcessScratchBatchCodegenOrInterpret(impala::RowBatch*)
[hdfs-scanner.h : 507 + 0x2]
5 impalad!impala::HdfsColumnarScanner::FilterScratchBatch(impala::RowBatch*)
[hdfs-columnar-scanner.cc : 156 + 0xb]
6 impalad!impala::HdfsColumnarScanner::TransferScratchTuples(impala::RowBatch*)
[hdfs-columnar-scanner.cc : 160 + 0x5]
7 impalad!impala::HdfsParquetScanner::GetNextInternal(impala::RowBatch*)
[hdfs-parquet-scanner.cc : 524 + 0xb]
8 impalad!impala::HdfsParquetScanner::ProcessSplit()
[hdfs-parquet-scanner.cc : 451 + 0x17]
9 impalad!impala::HdfsScanNode::ProcessSplit(std::vector<impala::FilterContext,
std::allocator<impala::FilterContext> > const&, impala::MemPool*,
impala::io::ScanRange*, long*) [hdfs-scan-node.cc : 504 + 0x7]
10 impalad!impala::HdfsScanNode::ScannerThread(bool, long)
[hdfs-scan-node.cc : 422 + 0x19]
...
{noformat}
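The crash address is consistent with a small offset added to a bogus base pointer. The check below is only a sketch using the register values from the dump; treating rax as the base and rdx as the offset is an assumption, since the dump does not show the faulting instruction.

```python
# Register values copied from the crash dump above.
rax = 0x00000000000001C2
rdx = 0x000000000000001D
crash_address = 0x1DF

# Assumption: the faulting access was [rax + rdx]; the dump does not say so.
computed = rax + rdx
print(hex(computed))
print(computed == crash_address)
```

If that reading is right, the base pointer itself (0x1c2) is far too small to be a valid heap address, which would suggest the scanner handed the evaluator a bad tuple pointer rather than a bad slot offset.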
There is also some code hitting this DCHECK:
{noformat}
F20250712 00:54:00.647553 390729 parquet-collection-column-reader.cc:156]
0049d0038e1732d3:d847317400000003] Check failed: children_[i]->def_level() ==
def_level_ (1 vs. 0)
F20250712 00:54:01.942261 390781 parquet-collection-column-reader.cc:156]
8e4c946a4f166355:95051b8600000005] Check failed: children_[i]->def_level() ==
def_level_ (1 vs. 0)
F20250712 00:54:01.942261 390781 parquet-collection-column-reader.cc:156]
8e4c946a4f166355:95051b8600000005] Check failed: children_[i]->def_level() ==
def_level_ (1 vs. 0)
F20250712 00:54:02.777822 390838 parquet-collection-column-reader.cc:156]
9d4ad6644de21b62:c5942c6300000003] Check failed: children_[i]->def_level() ==
def_level_ (1 vs. 0)
{noformat}
> test_scanners_fuzz.py runs some queries against the wrong database with
> uncorrupted tables
> ------------------------------------------------------------------------------------------
>
> Key: IMPALA-14219
> URL: https://issues.apache.org/jira/browse/IMPALA-14219
> Project: IMPALA
> Issue Type: Task
> Components: Test
> Affects Versions: Impala 5.0.0
> Reporter: Joe McDonnell
> Priority: Major
>
> When working on IMPALA-13898, I discovered that
> "ImpalaTestSuite::_get_table_location()" actually switches into a database
> based on the vector (i.e. parquet switches into functional_parquet). When I
> changed this behavior to not switch databases, this broke some tests in
> test_scanners_fuzz.py. It turns out the tests were not running against the
> corrupted version of the tables. When they do run against the corrupted
> version of the table, Impala frequently crashes.
> Basically, the test creates a corrupted table in a unique_database based on
> an original base table and then runs SQL statements against it. Some of the
> statements use relative names for the corrupted table (e.g. complextypestbl
> or alltypesagg_parquet_v2_uncompressed), which are the same as the base
> table names. "ImpalaTestSuite::_get_table_location()" was switching them
> into functional_parquet, which has tables with those names, and nothing else
> ever switched them to the unique_database with the corrupted versions of
> the tables.
> We need to fix this test behavior, but we need to fix the Impala crashes
> before we can make these tests do the right thing.
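The name-resolution problem described above can be sketched with a toy model. This is not Impala test-framework code: the helpers `resolve` and `qualify` and the database name `test_fuzz_db_1234` are hypothetical, and only `functional_parquet` and `complextypestbl` come from the report.

```python
# Toy model of how an unqualified table name resolves against the session's
# current database. Illustrative only; not the real test framework.

def resolve(table_name, current_db):
    """Return the fully qualified name a query would actually read."""
    if "." in table_name:
        return table_name  # already qualified; current database is ignored
    return f"{current_db}.{table_name}"


def qualify(table_name, db):
    """Prefix an unqualified table name with the given database."""
    return table_name if "." in table_name else f"{db}.{table_name}"


unique_db = "test_fuzz_db_1234"    # hypothetical unique_database name
current_db = "functional_parquet"  # session was switched here as a side effect

# The relative name silently reads the uncorrupted base table.
relative = resolve("complextypestbl", current_db)

# Qualifying the name makes the query independent of the current database.
qualified = resolve(qualify("complextypestbl", unique_db), current_db)

print(relative)
print(qualified)
```

Qualifying every table name is one possible way to make the tests hit the corrupted tables; explicitly switching into the unique_database before running the queries would be another.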