Hi,

I am seeing a very strange failure. Two 2x1 tables are reported as unequal,
yet the error message prints identical contents for both:

[ RUN      ] TestAdapterWriteNested.writeList
/Users/karlkatzen/Documents/code/arrow-dev/arrow/cpp/src/arrow/testing/gtest_util.cc:459: Failure
Failed
Unequal at absolute position 2
Expected:
  [
    [
      null,
      1074834796,
      null,
      null
    ],
    null
  ]
Actual:
  [
    [
      null,
      1074834796,
      null,
      null
    ],
    null
  ]
[  FAILED  ] TestAdapterWriteNested.writeList (2 ms)

Here is the code that causes the issue:

TEST(TestAdapterWriteNested, writeList) {
  std::shared_ptr<Schema> table_schema = schema({field("list", list(int32()))});
  int64_t num_rows = 2;
  arrow::random::RandomArrayGenerator rand(kRandomSeed);
  auto value_array = rand.ArrayOf(int32(), 2 * num_rows, 0.6);
  std::shared_ptr<Array> array = rand.List(*value_array, num_rows + 1, 1);
  std::shared_ptr<ChunkedArray> chunked_array =
      std::make_shared<ChunkedArray>(array);
  std::shared_ptr<Table> table = Table::Make(table_schema, {chunked_array});
  AssertTableWriteReadEqual(table, table, kDefaultSmallMemStreamSize * 5);
}

AssertTableWriteReadEqual is a helper I use to check that
from_orc(to_orc(table_in)) == expected_table_out. It has not caused any
problems in my earlier tests.

void AssertTableWriteReadEqual(const std::shared_ptr<Table>& input_table,
                               const std::shared_ptr<Table>& expected_output_table,
                               const int64_t max_size = kDefaultSmallMemStreamSize) {
  std::shared_ptr<io::BufferOutputStream> buffer_output_stream =
      io::BufferOutputStream::Create(max_size).ValueOrDie();
  std::unique_ptr<adapters::orc::ORCFileWriter> writer =
      adapters::orc::ORCFileWriter::Open(*buffer_output_stream).ValueOrDie();
  ARROW_EXPECT_OK(writer->Write(*input_table));
  ARROW_EXPECT_OK(writer->Close());
  std::shared_ptr<Buffer> buffer = buffer_output_stream->Finish().ValueOrDie();
  std::shared_ptr<io::RandomAccessFile> in_stream(new io::BufferReader(buffer));
  std::unique_ptr<adapters::orc::ORCFileReader> reader;
  ARROW_EXPECT_OK(
      adapters::orc::ORCFileReader::Open(in_stream, default_memory_pool(), &reader));
  std::shared_ptr<Table> actual_output_table;
  ARROW_EXPECT_OK(reader->Read(&actual_output_table));
  AssertTablesEqual(*actual_output_table, *expected_output_table, false, false);
}

I strongly suspect that this is related to the null bitmaps. What do you guys 
think?
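
To make the suspicion concrete, here is a small self-contained sketch of how
two list columns can print identically and still differ underneath. It is a
toy model only: ToyListColumn, Print, PhysicallyEqual, LogicallyEqual,
MakeWritten and MakeReadBack are hypothetical names made up for this
illustration and are not Arrow APIs. I am also only assuming that data hidden
under a null slot is what differs here; I have not confirmed that this is what
the round trip actually does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Toy stand-in for a list<int32> column. NOT Arrow's ListArray.
struct ToyListColumn {
  std::vector<bool> list_valid;   // validity of each list slot
  std::vector<int32_t> offsets;   // list i spans [offsets[i], offsets[i + 1])
  std::vector<bool> value_valid;  // validity of each child value
  std::vector<int32_t> values;    // child values; content under a null is arbitrary
};

// Renders only what is logically visible: null slots hide whatever they span.
std::string Print(const ToyListColumn& c) {
  assert(c.offsets.size() == c.list_valid.size() + 1);
  std::string out = "[";
  for (std::size_t i = 0; i < c.list_valid.size(); ++i) {
    if (i > 0) out += ", ";
    if (!c.list_valid[i]) {
      out += "null";
      continue;
    }
    out += "[";
    for (int32_t j = c.offsets[i]; j < c.offsets[i + 1]; ++j) {
      if (j > c.offsets[i]) out += ", ";
      out += c.value_valid[j] ? std::to_string(c.values[j]) : "null";
    }
    out += "]";
  }
  return out + "]";
}

// Buffer-by-buffer comparison: also sees data hidden under null slots.
bool PhysicallyEqual(const ToyListColumn& a, const ToyListColumn& b) {
  return a.list_valid == b.list_valid && a.offsets == b.offsets &&
         a.value_valid == b.value_valid && a.values == b.values;
}

// Slot-by-slot comparison that skips anything covered by a null.
bool LogicallyEqual(const ToyListColumn& a, const ToyListColumn& b) {
  if (a.list_valid != b.list_valid) return false;
  for (std::size_t i = 0; i < a.list_valid.size(); ++i) {
    if (!a.list_valid[i]) continue;
    const int32_t len = a.offsets[i + 1] - a.offsets[i];
    if (len != b.offsets[i + 1] - b.offsets[i]) return false;
    for (int32_t k = 0; k < len; ++k) {
      const int32_t ia = a.offsets[i] + k;
      const int32_t ib = b.offsets[i] + k;
      if (a.value_valid[ia] != b.value_valid[ib]) return false;
      if (a.value_valid[ia] && a.values[ia] != b.values[ib]) return false;
    }
  }
  return true;
}

// Assumed "input" column: the null list still spans two child values.
ToyListColumn MakeWritten() {
  return {{true, false},
          {0, 4, 6},
          {false, true, false, false, true, true},
          {7, 1074834796, 8, 9, 10, 11}};
}

// Assumed "read back" column: the null list is empty, null values are zeroed.
ToyListColumn MakeReadBack() {
  return {{true, false},
          {0, 4, 4},
          {false, true, false, false},
          {0, 1074834796, 0, 0}};
}
```

In this toy model both columns print the same text and LogicallyEqual returns
true, while PhysicallyEqual returns false because the offsets and the hidden
child values differ.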

Ying
