szehon-ho commented on code in PR #52443:
URL: https://github.com/apache/spark/pull/52443#discussion_r2393515866


##########
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/ShuffleSpecSuite.scala:
##########
@@ -479,4 +480,72 @@ class ShuffleSpecSuite extends SparkFunSuite with SQLHelper {
         "methodName" -> "createPartitioning$",
         "className" -> "org.apache.spark.sql.catalyst.plans.physical.ShuffleSpec"))
   }
+
+  test("compatibility: ShufflePartitionIdPassThroughSpec on both sides") {
+    val dist = ClusteredDistribution(Seq($"a", $"b"))
+    val p1 = ShufflePartitionIdPassThrough(DirectShufflePartitionID($"a"), 10)
+    val p2 = ShufflePartitionIdPassThrough(DirectShufflePartitionID($"c"), 10)
+
+    // Identical specs should be compatible
+    checkCompatible(
+      p1.createShuffleSpec(dist),
+      p2.createShuffleSpec(ClusteredDistribution(Seq($"c", $"d"))),
+      expected = true
+    )
+
+    // Different number of partitions should be incompatible
+    val p3 = ShufflePartitionIdPassThrough(DirectShufflePartitionID($"c"), 5)
+    checkCompatible(
+      p1.createShuffleSpec(dist),
+      p3.createShuffleSpec(ClusteredDistribution(Seq($"c", $"d"))),
+      expected = false
+    )
+
+    // Mismatched key positions should be incompatible
+    val dist1 = ClusteredDistribution(Seq($"a", $"b"))

Review Comment:
   I was going to wait until @cloud-fan's comments were addressed :) but I agree
this test is a bit hard to read.  The variable declarations are inconsistent:
some are at the very top, others sit right above the check that uses them.
Also, as Wenchen pointed out, some values like
ClusteredDistribution(Seq($"a", $"b")) are put in a variable that is used in
some places but not everywhere, while ClusteredDistribution(Seq($"c", $"d")) is
repeated but never assigned to a variable at all.
   
   How about we also pick better names, like ab, cd, a, b?  (Or slightly longer
ones if necessary.)
   
   checkCompatible(
      ShufflePartitionIdPassThrough(b, 10).createShuffleSpec(ab),
      ShufflePartitionIdPassThrough(c, 10).createShuffleSpec(cd),
      expected = false
   )
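
   To illustrate the naming idea without pulling in Spark, here is a minimal,
self-contained sketch. `Dist`, `PassThrough`, `Spec`, and this `checkCompatible`
are hypothetical stand-ins, not the Spark classes, and the compatibility rule
(same key position and same partition count) is only an assumption inferred from
the comments in the test above. The point is the layout: the shared
distributions are declared once at the top as `ab` and `cd`, and each check
reads on its own.

```scala
// Hypothetical stand-ins, NOT the Spark classes. They model only the two
// conditions named in the test comments: key position and partition count.
final case class Dist(clustering: Seq[String])

final case class Spec(keyPosition: Int, numPartitions: Int) {
  // Assumed rule: the key must be found, sit at the same position on both
  // sides, and both sides must use the same number of partitions.
  def isCompatibleWith(other: Spec): Boolean =
    keyPosition >= 0 &&
      keyPosition == other.keyPosition &&
      numPartitions == other.numPartitions
}

final case class PassThrough(key: String, numPartitions: Int) {
  // indexOf returns -1 when the key is not part of the clustering.
  def createSpec(dist: Dist): Spec =
    Spec(dist.clustering.indexOf(key), numPartitions)
}

object NamingSketch {
  // Shared fixtures, declared once and named after their contents.
  val ab: Dist = Dist(Seq("a", "b"))
  val cd: Dist = Dist(Seq("c", "d"))

  def checkCompatible(left: Spec, right: Spec, expected: Boolean): Unit =
    assert(left.isCompatibleWith(right) == expected)

  def main(args: Array[String]): Unit = {
    // Same key position, same partition count
    checkCompatible(
      PassThrough("a", 10).createSpec(ab),
      PassThrough("c", 10).createSpec(cd),
      expected = true)

    // Different number of partitions
    checkCompatible(
      PassThrough("a", 10).createSpec(ab),
      PassThrough("c", 5).createSpec(cd),
      expected = false)

    // Mismatched key positions
    checkCompatible(
      PassThrough("b", 10).createSpec(ab),
      PassThrough("c", 10).createSpec(cd),
      expected = false)
  }
}
```

   In the real suite the same layout would apply to the existing Spark types;
only the fixture names and where they are declared would change.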
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

