Michael-J-Ward commented on code in PR #808:
URL: https://github.com/apache/datafusion-python/pull/808#discussion_r1730026984


##########
docs/source/user-guide/common-operations/windows.rst:
##########
@@ -40,54 +43,86 @@ We'll use the pokemon dataset (from Ritchie Vink) in the 
following examples.
     ctx = SessionContext()
     df = ctx.read_csv("pokemon.csv")
 
-Here is an example that shows how to compare each pokemons’s attack power with 
the average attack power in its ``"Type 1"``
+Here is an example that shows how you can compare each pokemon's speed to the 
speed of the
+previous row in the DataFrame.
 
 .. ipython:: python
 
     df.select(
         col('"Name"'),
-        col('"Attack"'),
-        f.alias(
-            f.window("avg", [col('"Attack"')], partition_by=[col('"Type 1"')]),
-            "Average Attack",
-        )
+        col('"Speed"'),
+        f.lag(col('"Speed"')).alias("Previous Speed")
     )
 
-You can also control the order in which rows are processed by window functions 
by providing
+Setting Parameters
+------------------
+
+You can control the order in which rows are processed by window functions by 
providing
 a list of ``order_by`` functions for the ``order_by`` parameter.
 
 .. ipython:: python
 
     df.select(
         col('"Name"'),
         col('"Attack"'),
-        f.alias(
-            f.window(
-                "rank",
-                [],
-                partition_by=[col('"Type 1"')],
-                order_by=[f.order_by(col('"Attack"'))],
-            ),
-            "rank",
-        ),
+        col('"Type 1"'),
+        f.rank()
+            .partition_by(col('"Type 1"'))
+            .order_by(col('"Attack"').sort(ascending=True))
+            .build()
+            .alias("rank"),
+    ).sort(col('"Type 1"').sort(), col('"Attack"').sort())

Review Comment:
   Is there any requirement or constraint on the final `sort` and the 
`partition_by / order_by` args to the window function?
   
   Or is this just for a nicer, more interpretable output?



##########
src/functions.rs:
##########
@@ -348,32 +349,8 @@ pub fn first_value(
 }
 
 #[pyfunction]
-pub fn last_value(
-    expr: PyExpr,
-    distinct: bool,
-    filter: Option<PyExpr>,
-    order_by: Option<Vec<PyExpr>>,
-    null_treatment: Option<NullTreatment>,
-) -> PyResult<PyExpr> {
-    let agg_fn = functions_aggregate::expr_fn::last_value(vec![expr.expr]);
-
-    // luckily, I can guarantee initializing a builder with an `order_by` 
default of empty vec
-    let order_by = order_by
-        .map(|x| x.into_iter().map(|x| x.expr).collect::<Vec<_>>())
-        .unwrap_or_default();
-    let mut builder = agg_fn.order_by(order_by);
-
-    if distinct {
-        builder = builder.distinct();
-    }
-
-    if let Some(filter) = filter {
-        builder = builder.filter(filter.expr);
-    }
-
-    builder = 
builder.null_treatment(null_treatment.map(DFNullTreatment::from));
-
-    Ok(builder.build()?.into())
+pub fn last_value(expr: PyExpr) -> PyExpr {

Review Comment:
   Is there anything preventing also updating of `first_value`?



##########
python/datafusion/functions.py:
##########
@@ -1479,12 +1502,17 @@ def approx_percentile_cont(
     """Returns the value that is approximately at a given percentile of 
``expr``."""
     if num_centroids is None:
         return Expr(
-            f.approx_percentile_cont(expression.expr, percentile.expr, 
distinct=distinct, num_centroids=None)
+            f.approx_percentile_cont(
+                expression.expr, percentile.expr, distinct=distinct, 
num_centroids=None
+            )
         )
 
     return Expr(
         f.approx_percentile_cont(
-            expression.expr, percentile.expr, distinct=distinct, 
num_centroids=num_centroids.expr
+            expression.expr,

Review Comment:
   Was this a `black` formatting change?
   
   If so, is our CI not catching these?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to