efredine commented on code in PR #11319:
URL: https://github.com/apache/datafusion/pull/11319#discussion_r1667766837


##########
datafusion/core/src/datasource/physical_plan/parquet/statistics.rs:
##########
@@ -747,10 +770,10 @@ macro_rules! get_data_page_statistics {
                 Some(DataType::Boolean) => Ok(Arc::new(
                     BooleanArray::from_iter(

Review Comment:
   We could use BooleanBuilder instead? That would likely be more efficient.



##########
datafusion/core/src/datasource/physical_plan/parquet/statistics.rs:
##########
@@ -875,14 +914,14 @@ macro_rules! get_data_page_statistics {
                 Some(DataType::Date32) => Ok(Arc::new(Date32Array::from_iter([<$stat_type_prefix Int32DataPageStatsIterator>]::new($iterator).flatten()))),
                 Some(DataType::Date64) => Ok(
                     Arc::new(
-                        Date64Array::from([<$stat_type_prefix Int32DataPageStatsIterator>]::new($iterator)
+                        Date64Array::from_iter([<$stat_type_prefix Int32DataPageStatsIterator>]::new($iterator)
                             .map(|x| {
                                 x.into_iter()
-                                .filter_map(|x| {
+                                .map(|x| {

Review Comment:
   I removed the unnecessary collect, so it should be good to go once the tests have passed.
   
   I suspect the remaining cases where we are using collect could be made more efficient using the Builder pattern?
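
   To illustrate the `filter_map` to `map` change in the hunk above, here is a std-only sketch. The function names are hypothetical, and the day-to-millisecond conversion is an assumption about what the closure body does, since the diff does not show it. With `map`, a missing statistic stays as `None` (one slot per page); with `filter_map`, it is dropped and the output becomes shorter than the input. Both adapters are lazy, so the result can be passed straight to `from_iter` with no intermediate `collect`.

```rust
/// Milliseconds per day (Date64 stores milliseconds since the epoch).
const MILLIS_PER_DAY: i64 = 24 * 60 * 60 * 1000;

/// Hypothetical: keeps one output slot per input, preserving `None`.
fn keep_nulls(days: impl Iterator<Item = Option<i32>>) -> impl Iterator<Item = Option<i64>> {
    days.map(|d| d.map(|d| i64::from(d) * MILLIS_PER_DAY))
}

/// Hypothetical: drops missing inputs, so the output may be shorter.
fn drop_nulls(days: impl Iterator<Item = Option<i32>>) -> impl Iterator<Item = i64> {
    days.filter_map(|d| d.map(|d| i64::from(d) * MILLIS_PER_DAY))
}
```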



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.