Herman van Hovell created SPARK-18604:
-----------------------------------------
Summary: Collapse Window optimizer rule changes column order
Key: SPARK-18604
URL: https://issues.apache.org/jira/browse/SPARK-18604
Project: Spark
Issue Type: Improvement
Components: SQL
Reporter: Herman van Hovell
The recently added CollapseWindow optimizer rule changes the order of the
output attributes. This modifies the schema of the logical plan (which an
optimizer rule should never do), and breaks `collect()` in a subtle way: we
bind the row encoder to the output of the logical plan, not to the output of
the optimized plan. For example, the following code:
{noformat}
// Outside spark-shell, toDF also needs `import spark.implicits._`
// (assuming a SparkSession named `spark`).
val customers = Seq(
  ("Alice", "2016-05-01", 50.00),
  ("Alice", "2016-05-03", 45.00),
  ("Alice", "2016-05-04", 55.00),
  ("Bob", "2016-05-01", 25.00),
  ("Bob", "2016-05-04", 29.00),
  ("Bob", "2016-05-06", 27.00)).
  toDF("name", "date", "amountSpent")

// Import the window functions.
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions._

// Create a window spec covering the previous, current, and next row.
val wSpec1 = Window.partitionBy("name").orderBy("date").rowsBetween(-1, 1)

val df2 = customers
  .withColumn("total", sum(customers("amountSpent")).over(wSpec1))
  .withColumn("cnt", count(customers("amountSpent")).over(wSpec1))
{noformat}
...yields the following incorrect result, in which the values of `total` and
`cnt` appear to be read with each other's types:
{noformat}
+-----+----------+-----------+--------+-------------------+
| name|      date|amountSpent|   total|                cnt|
+-----+----------+-----------+--------+-------------------+
|  Bob|2016-05-01|       25.0|1.0E-323|4632796641680687104|
|  Bob|2016-05-04|       29.0|1.5E-323|4635400285215260672|
|  Bob|2016-05-06|       27.0|1.0E-323|4633078116657397760|
|Alice|2016-05-01|       50.0|1.0E-323|4636385447633747968|
|Alice|2016-05-03|       45.0|1.5E-323|4639481672377565184|
|Alice|2016-05-04|       55.0|1.0E-323|4636737291354636288|
+-----+----------+-----------+--------+-------------------+
{noformat}
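One way to read the result above is that the two window columns were swapped in the optimized plan. The encoder would then read the raw bits of the double sum as a long and the raw bits of the long count as a double. The following is a minimal sketch of that reinterpretation on the plain JVM, without Spark. The object and method names are hypothetical and exist only for illustration. The sums and counts are worked out by hand from the input rows and the `rowsBetween(-1, 1)` frame.

```scala
// Illustrative sketch only (plain JVM, no Spark): BitReinterpretation and its
// method names are hypothetical and do not come from Spark.
object BitReinterpretation {
  // A Double's raw 64 bits read back as a Long.
  def doubleBitsAsLong(d: Double): Long =
    java.lang.Double.doubleToLongBits(d)

  // A Long's raw 64 bits read back as a Double.
  def longBitsAsDouble(l: Long): Double =
    java.lang.Double.longBitsToDouble(l)

  def main(args: Array[String]): Unit = {
    // Bob, 2016-05-01: the frame covers 25.0 and 29.0, so sum = 54.0, count = 2.
    println(doubleBitsAsLong(25.0 + 29.0))
    println(longBitsAsDouble(2L))
    // Bob, 2016-05-04: the frame covers all three rows, so sum = 81.0, count = 3.
    println(doubleBitsAsLong(25.0 + 29.0 + 27.0))
    println(longBitsAsDouble(3L))
  }
}
```

The two long values printed here equal the `cnt` entries for Bob's first two rows, and the two doubles are tiny subnormal numbers (2 and 3 times the smallest positive double). That is consistent with the `total` column, and suggests the data itself is computed correctly and only the column binding is off.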