erenavsarogullari commented on code in PR #54242:
URL: https://github.com/apache/spark/pull/54242#discussion_r2835588974
##########
sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/CoalesceShufflePartitions.scala:
##########
@@ -109,7 +109,8 @@ case class CoalesceShufflePartitions(session: SparkSession)
extends AQEShuffleRe
coalesceGroup.shuffleStages.map(_.partitionSpecs),
advisoryTargetSize = advisoryTargetSize,
minNumPartitions = minNumPartitions,
- minPartitionSize = minPartitionSize)
+ minPartitionSize = minPartitionSize,
+ coalesceGroup.shuffleStages.map(_.shuffleStage.id))
Review Comment:
Is the eventual goal as follow?
Legacy API Signature:
```
def coalescePartitions(
mapOutputStatistics: Seq[Option[MapOutputStatistics]],
inputPartitionSpecs: Seq[Option[Seq[ShufflePartitionSpec]]],
advisoryTargetSize: Long,
minNumPartitions: Int,
minPartitionSize: Long,
shuffleStageIds: Seq[Int] = Seq.empty)
```
New API Signature:
```
def coalescePartitions(
shuffleStages: Seq[ShuffleStageInfo],
advisoryTargetSize: Long,
minNumPartitions: Int,
minPartitionSize: Long)
```
If so, i think it makes sense because there are 3 parameters
(mapOutputStatistics, inputPartitionSpecs and shuffleStageIds) need to be
passed to ShufflePartitionsUtil.coalescePartitions() which are using the same
reference `coalesceGroup.shuffleStages`. However, there is one down-side:
Currently, `ShufflePartitionsUtil.coalescePartitions()` func is also used by
`ShufflePartitionsUtilSuite` and they will also need to be refactored to pass
`Seq[ShuffleStageInfo]` instead of `Seq[Option[MapOutputStatistics]]`. New way
seems to introduce more verbosity when being compared with legacy signature
change.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]