Github user marmbrus commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11995#discussion_r57938549
  
    --- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicOperators.scala
 ---
    @@ -514,12 +524,13 @@ private[sql] object Expand {
      * @param projections to apply
      * @param output of all projections.
      * @param child operator.
    + * @param groupByAttrs the attributes used in group by.
      */
     case class Expand(
         projections: Seq[Seq[Expression]],
         output: Seq[Attribute],
    -    child: LogicalPlan) extends UnaryNode {
    -
    +    child: LogicalPlan,
    +    groupByAttrs: Seq[Attribute]) extends UnaryNode {
    --- End diff --
    
    This still feels kind of hacky to me and I think it comes down to the fact 
that this operator is poorly designed.  Since we use this operator for more 
than just grouping, it seems kind of odd to add a new parameter called 
`groupByAttrs` to it.  Also, we still have the property that its not always 
going to be correct by construction.  If a user of this class incorrectly sets 
the `groupByAttrs` everything will still work, but it will just lie about its 
constraints.
    
    I think the root problem is that `projections` and `output` are defined 
separately in the constructor.  Everywhere else in logical plans, you either 
have an `AttributeReference` or, if you are producing a new value, you have an 
`Alias`.  When you follow this pattern, the constraint system just works.
    
    However, in `Expand` we have logic that decides to replace a column with 
`null` (which should be a new `AttributeReference`), but instead we impersonate 
the original value.
    
    Until we come up with a principled solution, maybe we should just set 
`validConstraints` to be empty?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to