kosiew commented on code in PR #19733:
URL: https://github.com/apache/datafusion/pull/19733#discussion_r2735715790
##########
datafusion/functions/src/datetime/date_part.rs:
##########
@@ -237,6 +243,68 @@ impl ScalarUDFImpl for DatePartFunc {
})
}
+ // Only casting the year is supported since pruning other IntervalUnit is
not possible
+ // date_part(col, YEAR) = 2024 => col >= '2024-01-01' and col <
'2025-01-01'
+ // But for anything less than YEAR simplifying is not possible without
specifying the bigger interval
+ // date_part(col, MONTH) = 1 => col = '2023-01-01' or col = '2024-01-01'
or ... or col = '3000-01-01'
+ fn preimage(
Review Comment:
Consider adding a section to
`docs/source/library-user-guide/functions/adding-udfs.md` explaining:
- What preimage is and when to implement it
- How it enables predicate pushdown
- Example implementation (perhaps referencing `date_part`)
##########
datafusion/functions/src/datetime/date_part.rs:
##########
@@ -237,6 +243,68 @@ impl ScalarUDFImpl for DatePartFunc {
})
}
+ // Only casting the year is supported since pruning other IntervalUnit is
not possible
+ // date_part(col, YEAR) = 2024 => col >= '2024-01-01' and col <
'2025-01-01'
+ // But for anything less than YEAR simplifying is not possible without
specifying the bigger interval
+ // date_part(col, MONTH) = 1 => col = '2023-01-01' or col = '2024-01-01'
or ... or col = '3000-01-01'
+ fn preimage(
+ &self,
+ args: &[Expr],
+ lit_expr: &Expr,
+ info: &SimplifyContext,
+ ) -> Result<PreimageResult> {
+ let [part, col_expr] = take_function_args(self.name(), args)?;
+
+ // Get the interval unit from the part argument
+ let interval_unit = part
+ .as_literal()
+ .and_then(|sv| sv.try_as_str().flatten())
+ .map(part_normalization)
+ .and_then(|s| IntervalUnit::from_str(s).ok());
+
+ // only support extracting year
+ match interval_unit {
+ Some(IntervalUnit::Year) => (),
+ _ => return Ok(PreimageResult::None),
+ }
+
+ // Check if the argument is a literal (e.g. date_part(YEAR, col) =
2024)
+ let Some(argument_literal) = lit_expr.as_literal() else {
+ return Ok(PreimageResult::None);
+ };
+
+ // Extract i32 year from Scalar value
+ let year = match argument_literal {
+ ScalarValue::Int32(Some(y)) => *y,
+ _ => return Ok(PreimageResult::None),
+ };
+
+ // Can only extract year from Date32/64 and Timestamp column
+ let target_type = match info.get_data_type(col_expr)? {
+ Date32 | Date64 | Timestamp(_, _) =>
&info.get_data_type(col_expr)?,
+ _ => return Ok(PreimageResult::None),
+ };
+
+ // Compute the Interval bounds
+ let start_time =
+ NaiveDate::from_ymd_opt(year, 1, 1).expect("Expect computed start
time");
+ let end_time = start_time
+ .with_year(year + 1)
+ .expect("Expect computed end time");
Review Comment:
Can we replace `expect` with proper error handling?
Year overflow (e.g., year 999999) should return `PreimageResult::None`
rather than panic. This maintains consistency with the "fail gracefully"
pattern where unsupported cases return `None`.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]