A Jira ticket for this bug has been filed: https://issues.apache.org/jira/browse/ARROW-11548

> On Feb 7, 2021, at 3:29 PM, Ying Zhou <yzhou7...@gmail.com> wrote:
>
> Hi,
>
> Recently I found a weird bug in RandomArrayGenerator.
>
> RandomArrayGenerator::List consistently produces ListArrays whose length
> is 1 below what it should be according to the documentation. Moreover,
> the validity bitmaps it produces are weird.
>
> Here is a simple test:
>
> TEST(TestAdapterWriteNested, ListTest) {
>   int64_t num_rows = 2;
>   static constexpr random::SeedType kRandomSeed2 = 0x0ff1ce;
>   arrow::random::RandomArrayGenerator rand(kRandomSeed2);
>   std::shared_ptr<Array> value_array =
>       rand.ArrayOf(int32(), 2 * num_rows, 0.2);
>   std::shared_ptr<Array> array = rand.List(*value_array, num_rows, 1);
>   RecordProperty("bitmap", *(array->null_bitmap_data()));
>   RecordProperty("length", array->length());
>   RecordProperty("array", array->ToString());
> }
>
> Here are the results:
>
> <testcase name="ListTest" status="run" result="completed" time="0"
>     timestamp="2021-02-07T15:23:16" classname="TestAdapterWriteNested">
>   <properties>
>     <property name="bitmap" value="3"/>
>     <property name="length" value="1"/>
>     <property name="array" value="[
>   [
>     null,
>     1074834796,
>     551076274,
>     1184187771
>   ]
> ]"/>
>   </properties>
> </testcase>
>
> Here is what RandomArrayGenerator::List should do:
>
> /// \brief Generate a random ListArray
> ///
> /// \param[in] values The underlying values array
> /// \param[in] size The size of the generated list array
> /// \param[in] null_probability the probability of a list value being null
> /// \param[in] force_empty_nulls if true, null list entries must have 0 length
> ///
> /// \return a generated Array
> std::shared_ptr<Array> List(const Array& values, int64_t size,
>                             double null_probability,
>                             bool force_empty_nulls = false);
>
> Note that the generator failed in at least three aspects:
> 1. The length of the generated array is too low (1 instead of the
>    requested 2).
> 2. Even when null_probability is set to 1, there are still 1s in the
>    bitmap.
> 3. The size of the bitmap is larger than the size of the Array.
>
> I'd like to know where we can find tests for arrow/testing/random. If
> they are absent, I need to write them.
>
> Thanks,
> Ying
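
For anyone reading the reported "bitmap" value: the test records only the first byte of the validity bitmap, and Arrow numbers validity bits from the least significant bit. The sketch below is a minimal, standalone way to decode such a byte. The helper names GetBit and CountSetBits are hypothetical and written for this illustration; they are not a claim about Arrow's own utilities.

```cpp
#include <cstdint>

// Hypothetical helper: returns true if slot i is marked valid in an
// LSB-first validity bitmap (bit i lives in byte i / 8, position i % 8).
bool GetBit(const uint8_t* bitmap, int64_t i) {
  return ((bitmap[i / 8] >> (i % 8)) & 1) != 0;
}

// Hypothetical helper: counts the valid (set) bits among the first
// num_bits slots of the bitmap.
int64_t CountSetBits(const uint8_t* bitmap, int64_t num_bits) {
  int64_t count = 0;
  for (int64_t i = 0; i < num_bits; ++i) {
    if (GetBit(bitmap, i)) {
      ++count;
    }
  }
  return count;
}
```

Applied to the reported byte 0x03, bits 0 and 1 are set. With null_probability = 1 one would expect no set bits at all, and with a reported length of 1 only bit 0 falls inside the array, which is consistent with points 2 and 3 in the quoted message.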