[ https://issues.apache.org/jira/browse/SPARK-51460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17935034#comment-17935034 ]
imarch1 zhang commented on SPARK-51460:
---------------------------------------

[~mridul] Could you take a look at this?

> Shuffle read and write are inconsistent when push-based shuffle is enabled
> --------------------------------------------------------------------------
>
>                 Key: SPARK-51460
>                 URL: https://issues.apache.org/jira/browse/SPARK-51460
>             Project: Spark
>          Issue Type: Bug
>          Components: Shuffle
>    Affects Versions: 3.3.0
>            Reporter: imarch1 zhang
>            Priority: Major
>         Attachments: image-2025-03-11-09-11-16-656.png
>
>
> When push-based shuffle is enabled, some Spark applications in our cluster
> experience inconsistent shuffle data. The metrics of the Exchange are as follows:
> !image-2025-03-11-09-11-16-656.png!
> As the picture shows, the reduce tasks read more data than the map tasks
> wrote.
> The only clue we have found is that the number of records read by all *successful*
> reduce tasks matches the number of records written, which is
> 1,529,614,111. We have not been able to determine how the additional, incorrect
> 360,453 records (1,529,974,564 - 1,529,614,111) appear in the Exchange.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)