Jefffrey commented on code in PR #17732:
URL: https://github.com/apache/datafusion/pull/17732#discussion_r2370662077


##########
datafusion/common/src/scalar/mod.rs:
##########
@@ -3305,17 +3306,29 @@ impl ScalarValue {
     /// assert_eq!(scalar_vec, expected);
     /// ```
     pub fn convert_array_to_scalar_vec(array: &dyn Array) -> Result<Vec<Vec<Self>>> {
-        let mut scalars = Vec::with_capacity(array.len());
-
-        for index in 0..array.len() {
-            let nested_array = array.as_list::<i32>().value(index);
-            let scalar_values = (0..nested_array.len())
-                .map(|i| ScalarValue::try_from_array(&nested_array, i))
-                .collect::<Result<Vec<_>>>()?;
-            scalars.push(scalar_values);
+        fn generic_collect<OffsetSize: OffsetSizeTrait>(
+            array: &dyn Array,
+        ) -> Result<Vec<Vec<ScalarValue>>> {
+            array
+                .as_list::<OffsetSize>()
+                .iter()
+                .map(|nested_array| match nested_array {
+                    Some(nested_array) => (0..nested_array.len())
+                        .map(|i| ScalarValue::try_from_array(&nested_array, i))
+                        .collect::<Result<Vec<_>>>(),
+                    // TODO: what can we put for null?
+                    None => Ok(vec![]),
+                })
+                .collect()

Review Comment:
   The old code called `array.as_list::<i32>().value(index)` directly, so it 
never checked for nulls.
   
   So in the test case I added below:
   
   ```rust
   // Funky (null slot has non-zero list offsets)
   // Offsets + Values looks like this: [[1, 2], [3, 4], [5]]
   // But with NullBuffer it's like this: [[1, 2], NULL, [5]]
   let funky = ListArray::new(
       Field::new_list_field(DataType::Int64, true).into(),
       OffsetBuffer::new(vec![0, 2, 4, 5].into()),
       Arc::new(Int64Array::from(vec![1, 2, 3, 4, 5, 6])),
       Some(NullBuffer::from(vec![true, false, true])),
   );
   let converted = ScalarValue::convert_array_to_scalar_vec(&funky).unwrap();
   assert_eq!(
       converted,
       vec![
           vec![ScalarValue::Int64(Some(1)), ScalarValue::Int64(Some(2))],
           vec![],
           vec![ScalarValue::Int64(Some(5))],
       ]
   );
   ```
   
   For the output, the old code would incorrectly have produced `vec![3, 4]` 
instead of the empty Vec (it accessed that element as if it were a valid list 
when in fact it was null in the parent list).
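   
   A minimal sketch of that difference, using a toy stand-in rather than the 
arrow-rs `ListArray` (the `ToyListArray` type and its methods are hypothetical 
and only model offsets, values and a validity mask):
   
   ```rust
   /// Toy stand-in for a list array of i64 (NOT the arrow-rs type).
   /// `offsets` slices `values`; `validity[i] == false` marks slot i as NULL.
   struct ToyListArray {
       offsets: Vec<usize>,
       values: Vec<i64>,
       validity: Vec<bool>,
   }
   
   impl ToyListArray {
       /// Models the old path: slice by offsets, never consult validity.
       fn value(&self, i: usize) -> &[i64] {
           &self.values[self.offsets[i]..self.offsets[i + 1]]
       }
   
       /// Models the null-aware path: a NULL slot yields `None`.
       fn iter(&self) -> impl Iterator<Item = Option<&[i64]>> + '_ {
           (0..self.validity.len()).map(move |i| {
               if self.validity[i] {
                   Some(self.value(i))
               } else {
                   None
               }
           })
       }
   }
   
   /// Same layout as the `funky` array in the test case.
   fn funky() -> ToyListArray {
       ToyListArray {
           offsets: vec![0, 2, 4, 5],
           values: vec![1, 2, 3, 4, 5, 6],
           validity: vec![true, false, true],
       }
   }
   ```
   
   In this model `funky().value(1)` still slices out `[3, 4]`, while 
`funky().iter()` reports the second slot as `None`.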
   
   For now I made nulls return an empty list so the method's signature doesn't 
change, though this seems undesirable 🤔 
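   
   To illustrate why it may be undesirable, here is a small sketch over plain 
`i64` rows instead of `ScalarValue` (the `nulls_as_empty` helper is 
hypothetical, not code from this PR): once a null row is collapsed to an empty 
Vec, it can no longer be told apart from a genuinely empty list.
   
   ```rust
   /// Hypothetical helper mirroring the `None => Ok(vec![])` arm:
   /// a NULL row becomes an empty Vec so the result stays `Vec<Vec<_>>`.
   fn nulls_as_empty(rows: &[Option<Vec<i64>>]) -> Vec<Vec<i64>> {
       rows.iter()
           .map(|row| row.clone().unwrap_or_default())
           .collect()
   }
   
   /// Two different inputs: one with a NULL row, one with an empty list.
   fn with_null() -> Vec<Option<Vec<i64>>> {
       vec![Some(vec![1, 2]), None, Some(vec![5])]
   }
   
   fn with_empty() -> Vec<Option<Vec<i64>>> {
       vec![Some(vec![1, 2]), Some(vec![]), Some(vec![5])]
   }
   ```
   
   Both `nulls_as_empty(&with_null())` and `nulls_as_empty(&with_empty())` 
produce the same rows. Keeping the distinction would presumably need a return 
type along the lines of `Vec<Option<Vec<_>>>`, which is exactly the signature 
change being avoided here.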


