A Jira ticket on this bug has been filed: 
https://issues.apache.org/jira/browse/ARROW-11548 

> On Feb 7, 2021, at 3:29 PM, Ying Zhou <yzhou7...@gmail.com> wrote:
> 
> Hi,
> 
> Recently I found a weird bug in RandomArrayGenerator.
> 
> RandomArrayGenerator::List consistently produces ListArrays whose length is 
> 1 less than the size its documentation says they should have. Moreover, the 
> validity bitmaps look wrong.
> 
> Here is a simple test:
> 
> TEST(TestAdapterWriteNested, ListTest) {
>   int64_t num_rows = 2;
>   static constexpr random::SeedType kRandomSeed2 = 0x0ff1ce;
>   arrow::random::RandomArrayGenerator rand(kRandomSeed2);
>   // Child values: 2 * num_rows int32 values with null probability 0.2.
>   std::shared_ptr<Array> value_array = rand.ArrayOf(int32(), 2 * num_rows, 0.2);
>   // Request num_rows lists with null_probability = 1 (every list should be null).
>   std::shared_ptr<Array> array = rand.List(*value_array, num_rows, 1);
>   // Record the first byte of the validity bitmap, the length, and the contents.
>   RecordProperty("bitmap", *(array->null_bitmap_data()));
>   RecordProperty("length", array->length());
>   RecordProperty("array", array->ToString());
> }
> 
> Here are the results:
> 
> <testcase name="ListTest" status="run" result="completed" time="0" timestamp="2021-02-07T15:23:16" classname="TestAdapterWriteNested">
>   <properties>
>     <property name="bitmap" value="3"/>
>     <property name="length" value="1"/>
>     <property name="array" value="[&#x0A;  [&#x0A;    null,&#x0A;    1074834796,&#x0A;    551076274,&#x0A;    1184187771&#x0A;  ]&#x0A;]"/>
>   </properties>
> </testcase>
> 
> Here is what RandomArrayGenerator::List should do:
> 
>   /// \brief Generate a random ListArray
>   ///
>   /// \param[in] values The underlying values array
>   /// \param[in] size The size of the generated list array
>   /// \param[in] null_probability the probability of a list value being null
>   /// \param[in] force_empty_nulls if true, null list entries must have 0 length
>   ///
>   /// \return a generated Array
>   std::shared_ptr<Array> List(const Array& values, int64_t size, double null_probability,
>                               bool force_empty_nulls = false);
> 
> Note that the generator fails in at least three respects:
> 1. The length of the generated array is too low.
> 2. Even when null_probability is set to 1, there are still 1s in the bitmap.
> 3. The size of the bitmap is larger than the size of the Array.
> 
> I’d like to know where we can find tests for arrow/testing/random. If there 
> are none, I will need to write them.
> 
> Thanks,
> Ying
> 
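
For reference, two facts about the Arrow columnar format line up with the numbers in the quoted report: a list array with n entries carries n + 1 offsets, and validity bitmaps are read least-significant bit first. The sketch below is a standalone illustration that does not use the Arrow library. ListLengthFromOffsets and BitIsSet are hypothetical helper names, and the idea that the generator builds only `size` offsets is an assumption, not something confirmed in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not an Arrow API): number of lists described by an
// offsets buffer. A list array with n entries carries n + 1 offsets, so
// N offsets describe N - 1 lists.
int64_t ListLengthFromOffsets(const std::vector<int32_t>& offsets) {
  return offsets.empty() ? 0 : static_cast<int64_t>(offsets.size()) - 1;
}

// Hypothetical helper (not an Arrow API): read bit i of a validity bitmap,
// least-significant bit first within each byte. A set bit means "valid".
bool BitIsSet(const std::vector<uint8_t>& bitmap, int64_t i) {
  assert(i >= 0 && static_cast<std::size_t>(i / 8) < bitmap.size());
  return ((bitmap[static_cast<std::size_t>(i / 8)] >> (i % 8)) & 1) != 0;
}
```

With the offsets {0, 4} (two entries, matching num_rows = 2 and four child values), ListLengthFromOffsets gives 1, which is the recorded length. A bitmap byte of 3 has bits 0 and 1 set and no others, so it marks two slots as valid for an array that holds one list.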
