Hi Ying,

Hmm, yes, this may be related to the null bitmaps, or the offsets. Can you
try to inspect or pretty-print the offsets arrays for the two list arrays?

Regards

Antoine.


On 10/02/2021 at 03:26, Ying Zhou wrote:
> Hi,
>
> This is an extremely weird phenomenon. There are two 2*1 tables that are
> supposedly different when I got a confusing error message like this:
>
> [ RUN      ] TestAdapterWriteNested.writeList
> /Users/karlkatzen/Documents/code/arrow-dev/arrow/cpp/src/arrow/testing/gtest_util.cc:459:
> Failure
> Failed
> Unequal at absolute position 2
> Expected:
>   [
>     [
>       null,
>       1074834796,
>       null,
>       null
>     ],
>     null
>   ]
> Actual:
>   [
>     [
>       null,
>       1074834796,
>       null,
>       null
>     ],
>     null
>   ]
> [  FAILED  ] TestAdapterWriteNested.writeList (2 ms)
>
> Here is the code that causes the issue:
>
> TEST(TestAdapterWriteNested, writeList) {
>   std::shared_ptr<Schema> table_schema =
>       schema({field("list", list(int32()))});
>   int64_t num_rows = 2;
>   arrow::random::RandomArrayGenerator rand(kRandomSeed);
>   auto value_array = rand.ArrayOf(int32(), 2 * num_rows, 0.6);
>   std::shared_ptr<Array> array = rand.List(*value_array, num_rows + 1, 1);
>   std::shared_ptr<ChunkedArray> chunked_array =
>       std::make_shared<ChunkedArray>(array);
>   std::shared_ptr<Table> table = Table::Make(table_schema, {chunked_array});
>   AssertTableWriteReadEqual(table, table, kDefaultSmallMemStreamSize * 5);
> }
>
> Here AssertTableWriteReadEqual is a function I use to test that
> from_orc(to_orc(table_in)) == expected_table_out. The function did not have
> issues before.
>
> void AssertTableWriteReadEqual(
>     const std::shared_ptr<Table>& input_table,
>     const std::shared_ptr<Table>& expected_output_table,
>     const int64_t max_size = kDefaultSmallMemStreamSize) {
>   std::shared_ptr<io::BufferOutputStream> buffer_output_stream =
>       io::BufferOutputStream::Create(max_size).ValueOrDie();
>   std::unique_ptr<adapters::orc::ORCFileWriter> writer =
>       adapters::orc::ORCFileWriter::Open(*buffer_output_stream).ValueOrDie();
>   ARROW_EXPECT_OK(writer->Write(*input_table));
>   ARROW_EXPECT_OK(writer->Close());
>   std::shared_ptr<Buffer> buffer =
>       buffer_output_stream->Finish().ValueOrDie();
>   std::shared_ptr<io::RandomAccessFile> in_stream(
>       new io::BufferReader(buffer));
>   std::unique_ptr<adapters::orc::ORCFileReader> reader;
>   ARROW_EXPECT_OK(
>       adapters::orc::ORCFileReader::Open(in_stream, default_memory_pool(),
>                                          &reader));
>   std::shared_ptr<Table> actual_output_table;
>   ARROW_EXPECT_OK(reader->Read(&actual_output_table));
>   AssertTablesEqual(*actual_output_table, *expected_output_table, false,
>                     false);
> }
>
> I strongly suspect that this is related to the null bitmaps. What do you guys
> think?
>
> Ying
>
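
To make the point about offsets concrete, here is a minimal sketch of how
two list arrays can pretty-print identically while their offsets differ.
It is a toy model, not Arrow's own classes: ToyListArray, RenderLogical,
RenderOffsets, MakeCompact and MakePadded are all hypothetical names made up
for illustration. The only assumption taken from the columnar format is that
a null list slot may still span a non-empty range of child values. Whether
this is what actually happens in the ORC round trip above is not established
here.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Toy stand-in for a list<int32> array (hypothetical, NOT an Arrow class).
// It holds the same ingredients: validity flags, offsets, child values.
struct ToyListArray {
  std::vector<bool> list_valid;   // one flag per list slot
  std::vector<int32_t> offsets;   // list_valid.size() + 1 entries
  std::vector<int32_t> values;    // flattened child values
  std::vector<bool> value_valid;  // one flag per child value
};

// Logical view, similar in spirit to a pretty-printer: a null list slot is
// shown as "null" no matter which child range its offsets cover.
std::string RenderLogical(const ToyListArray& a) {
  std::string out = "[";
  for (std::size_t i = 0; i < a.list_valid.size(); ++i) {
    if (i > 0) out += ", ";
    if (!a.list_valid[i]) {
      out += "null";
      continue;
    }
    out += "[";
    for (int32_t j = a.offsets[i]; j < a.offsets[i + 1]; ++j) {
      if (j > a.offsets[i]) out += ", ";
      out += a.value_valid[j] ? std::to_string(a.values[j]) : "null";
    }
    out += "]";
  }
  return out + "]";
}

// Physical view: the raw offsets, which the logical view hides for nulls.
std::string RenderOffsets(const ToyListArray& a) {
  std::string out = "[";
  for (std::size_t i = 0; i < a.offsets.size(); ++i) {
    if (i > 0) out += ", ";
    out += std::to_string(a.offsets[i]);
  }
  return out + "]";
}

// Null second slot with an empty child range.
ToyListArray MakeCompact() {
  return {{true, false},
          {0, 4, 4},
          {0, 1074834796, 0, 0},
          {false, true, false, false}};
}

// Null second slot that still spans two (hidden) child values.
ToyListArray MakePadded() {
  return {{true, false},
          {0, 4, 6},
          {0, 1074834796, 0, 0, 7, 8},
          {false, true, false, false, true, true}};
}
```

Here RenderLogical gives the same text for MakeCompact() and MakePadded(),
while RenderOffsets gives [0, 4, 4] for one and [0, 4, 6] for the other. In
Arrow C++ itself, the equivalent check would be to print the offsets of the
expected and actual list arrays side by side (to the best of my knowledge
via ListArray::offsets()).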