kosiew commented on code in PR #1387:
URL: 
https://github.com/apache/datafusion-python/pull/1387#discussion_r2853291032


##########
python/datafusion/functions.py:
##########
@@ -1010,17 +1013,69 @@ def now() -> Expr:
     return Expr(f.now())
 
 
+def to_char(arg: Expr, format: Expr) -> Expr:

Review Comment:
   `format` shadows Python’s built-in `format`.
   What do you think of renaming to `formatter` like in to_date?



##########
python/datafusion/functions.py:
##########
@@ -1010,17 +1013,69 @@ def now() -> Expr:
     return Expr(f.now())
 
 
+def to_char(arg: Expr, format: Expr) -> Expr:
+    """Returns a string representation of a date, time, timestamp or duration.
+
+    For usage of ``format`` see the rust chrono package ``strftime`` package.
+
+    [Documentation 
here.](https://docs.rs/chrono/latest/chrono/format/strftime/index.html)
+    """
+    return Expr(f.to_char(arg.expr, format.expr))
+
+
+def to_date(arg: Expr, *formatters: Expr) -> Expr:
+    """Converts a value to a date (YYYY-MM-DD).
+
+    Supports strings, numeric and timestamp types as input.
+    Integers and doubles are interpreted as days since the unix epoch.
+    Strings are parsed as YYYY-MM-DD (e.g. '2023-07-20')
+    if ``formatters`` are not provided.
+
+    For usage of ``formatters`` see the rust chrono package ``strftime`` 
package.
+
+    [Documentation 
here.](https://docs.rs/chrono/latest/chrono/format/strftime/index.html)
+    """
+    if not formatters:
+        return Expr(f.to_date(arg.expr))
+    formatters = [fmt.expr for fmt in formatters]
+    return Expr(f.to_date(arg.expr, *formatters))

Review Comment:
   `to_date`, `to_time`, and `to_timestamp*` each perform the same 
formatter-unwrapping (`[fmt.expr for fmt in formatters]`) and “no formatters” 
branching.
   What do you think of extracting a tiny private helper (e.g. 
`_unwrap_exprs(*args: Expr) -> list[PyExpr]`) and/or a shared optional-format 
dispatch helper to reduce duplication and keep behavior aligned across 
functions?



##########
python/datafusion/functions.py:
##########
@@ -1071,6 +1126,9 @@ def current_date() -> Expr:
     return Expr(f.current_date())
 
 
+today = current_date

Review Comment:
   This alias appears unrelated to the PR and is not added to `__all__` or 
covered by tests/docs.
   



##########
src/functions.rs:
##########
@@ -601,6 +601,9 @@ expr_fn!(
     "Converts the number to its equivalent hexadecimal representation."
 );
 expr_fn!(now);
+expr_fn_vec!(to_date);
+expr_fn_vec!(to_local_time);

Review Comment:
   `to_local_time` is exposed in Python as unary (`def to_local_time(arg: 
Expr)`). 
   Why do we need `expr_fn_vec!` here?
   
   If `expr_fn_vec!` is intentionally required by upstream API shape, a short 
comment here would prevent future confusion.



##########
python/tests/test_functions.py:
##########
@@ -1026,6 +1032,39 @@ def test_temporal_functions(df):
         [datetime(2023, 9, 7, 5, 6, 14, 523952, tzinfo=DEFAULT_TZ)] * 3,
         type=pa.timestamp("ns"),
     )
+    assert result.column(17) == pa.array(
+        [time(12, 30, 45)] * 3,
+        type=pa.time64("ns"),
+    )
+    assert result.column(18) == pa.array(
+        [time(12, 30, 45)] * 3,
+        type=pa.time64("ns"),
+    )
+    assert result.column(19) == pa.array(
+        [date(2017, 5, 31)] * 3,
+        type=pa.date32(),
+    )
+    assert result.column(20) == pa.array(
+        [date(2017, 5, 31)] * 3,
+        type=pa.date32(),
+    )
+    assert result.column(21) == pa.array(
+        [
+            datetime(2022, 12, 31, tzinfo=DEFAULT_TZ),
+            datetime(2027, 6, 26, tzinfo=DEFAULT_TZ),
+            datetime(2020, 7, 2, tzinfo=DEFAULT_TZ),
+        ],
+        type=pa.timestamp("us"),
+    )
+
+    assert result.column(22) == pa.array(
+        [
+            "31-12-2022",
+            "26-06-2027",
+            "02-07-2020",
+        ],
+        type=pa.string(),
+    )

Review Comment:
   Good positive-path coverage was added to test_temporal_functions
   
   Consider adding some focused negative-case assertion for invalid 
formatter/input for the new API (e.g. malformed `to_time` formatter), to ensure 
error surfaces are stable and user-facing diagnostics remain clear.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to