alamb commented on code in PR #12032:
URL: https://github.com/apache/datafusion/pull/12032#discussion_r1719849131
##########
datafusion-cli/src/functions.rs:
##########
@@ -376,16 +397,11 @@ impl TableFunctionImpl for ParquetMetadataFunc {
let converted_type = column.column_descr().converted_type();
if let Some(s) = column.statistics() {
- let (min_val, max_val) = if s.has_min_max_set() {
- let (min_val, max_val) =
- convert_parquet_statistics(s, converted_type);
- (Some(min_val), Some(max_val))
- } else {
- (None, None)
- };
+ let (min_val, max_val) =
+ convert_parquet_statistics(s, converted_type);
Review Comment:
`min()` and `max()` are deprecated, so this updates the code to use `min_opt()`
and `max_opt()`, which also removes the need for the `has_min_max_set()` branch
and, I think, makes this much less convoluted
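
To illustrate why the `Option`-returning accessors simplify the call site, here is a minimal sketch. `ColumnStats` and `convert_stats` are hypothetical stand-ins written for this example; they are not the parquet crate's types nor the actual `convert_parquet_statistics` from this PR, and only the `min_opt()` / `max_opt()` names are taken from the comment above.

```rust
/// Hypothetical stand-in for a typed statistics value.
/// This is NOT the parquet crate's type; it only mimics the accessor shape.
struct ColumnStats {
    min: Option<i64>,
    max: Option<i64>,
}

impl ColumnStats {
    /// Returns the minimum if it was recorded, otherwise `None`.
    fn min_opt(&self) -> Option<&i64> {
        self.min.as_ref()
    }

    /// Returns the maximum if it was recorded, otherwise `None`.
    fn max_opt(&self) -> Option<&i64> {
        self.max.as_ref()
    }
}

/// Converts each bound independently. Because the accessors already
/// return `Option`, the caller needs no separate "is it set?" check.
fn convert_stats(s: &ColumnStats) -> (Option<String>, Option<String>) {
    (
        s.min_opt().map(|v| v.to_string()),
        s.max_opt().map(|v| v.to_string()),
    )
}
```

With this shape a missing bound simply flows through as `None`, and a column with only one bound set is handled without extra branching.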
##########
datafusion/sqllogictest/test_files/repartition_scan.slt:
##########
@@ -61,7 +61,7 @@ logical_plan
physical_plan
01)CoalesceBatchesExec: target_batch_size=8192
02)--FilterExec: column1@0 != 42
-03)----ParquetExec: file_groups={4 groups:
[[WORKSPACE_ROOT/datafusion/sqllogictest/test_files/scratch/repartition_scan/parquet_table/2.parquet:0..104],
[WORKSPACE_ROOT/datafusion/sqllogictest/test_files/scratch/repartition_scan/parquet_table/2.parquet:104..208],
[WORKSPACE_ROOT/datafusion/sqllogictest/test_files/scratch/repartition_scan/parquet_table/2.parquet:208..312],
[WORKSPACE_ROOT/datafusion/sqllogictest/test_files/scratch/repartition_scan/parquet_table/2.parquet:312..414]]},
projection=[column1], predicate=column1@0 != 42, pruning_predicate=CASE WHEN
column1_null_count@2 = column1_row_count@3 THEN false ELSE column1_min@0 != 42
OR 42 != column1_max@1 END, required_guarantees=[column1 not in (42)]
+03)----ParquetExec: file_groups={4 groups:
[[WORKSPACE_ROOT/datafusion/sqllogictest/test_files/scratch/repartition_scan/parquet_table/2.parquet:0..87],
[WORKSPACE_ROOT/datafusion/sqllogictest/test_files/scratch/repartition_scan/parquet_table/2.parquet:87..174],
[WORKSPACE_ROOT/datafusion/sqllogictest/test_files/scratch/repartition_scan/parquet_table/2.parquet:174..261],
[WORKSPACE_ROOT/datafusion/sqllogictest/test_files/scratch/repartition_scan/parquet_table/2.parquet:261..347]]},
projection=[column1], predicate=column1@0 != 42, pruning_predicate=CASE WHEN
column1_null_count@2 = column1_row_count@3 THEN false ELSE column1_min@0 != 42
OR 42 != column1_max@1 END, required_guarantees=[column1 not in (42)]
Review Comment:
The sizes of the parquet files changed slightly, and thus so did the data
split: the byte ranges for `2.parquet` in `file_groups` now end at 347
instead of 414
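
The byte ranges in both versions of the plan are consistent with splitting the scanned length into 4 chunks of `ceil(len / 4)` bytes, with the last chunk truncated. The sketch below reproduces that arithmetic; `split_ranges` is a hypothetical helper written for this example and is not DataFusion's actual repartitioning code.

```rust
use std::ops::Range;

/// Splits `len` bytes into `n` contiguous ranges of `ceil(len / n)` bytes each,
/// truncating the final range at `len`.
/// Hypothetical helper for illustration only.
fn split_ranges(len: u64, n: u64) -> Vec<Range<u64>> {
    assert!(n > 0, "need at least one partition");
    // Ceiling division without relying on newer integer helpers.
    let chunk = (len + n - 1) / n;
    (0..n)
        .map(|i| {
            let start = (i * chunk).min(len);
            let end = ((i + 1) * chunk).min(len);
            start..end
        })
        .collect()
}
```

For a length of 414 the chunk size is 104, and for 347 it is 87, which matches the old and new `file_groups` ranges shown in the diff.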
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]