[ 
https://issues.apache.org/jira/browse/SPARK-51460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17935034#comment-17935034
 ] 

imarch1 zhang commented on SPARK-51460:
---------------------------------------

[~mridul] Could you take a look at this?

> Shuffle read and write are inconsistent when push-based shuffle is enabled
> --------------------------------------------------------------------------
>
>                 Key: SPARK-51460
>                 URL: https://issues.apache.org/jira/browse/SPARK-51460
>             Project: Spark
>          Issue Type: Bug
>          Components: Shuffle
>    Affects Versions: 3.3.0
>            Reporter: imarch1 zhang
>            Priority: Major
>         Attachments: image-2025-03-11-09-11-16-656.png
>
>
> When push-based shuffle is enabled, some Spark applications in our cluster 
> experienced inconsistent shuffle data. The Exchange metrics are as follows:
> !image-2025-03-11-09-11-16-656.png!
> As the picture shows, reduce tasks read more data than map tasks wrote.
> The only clue we have found is that the number of records read by all 
> *successful* reduce tasks matches the number of records written, which is 
> 1,529,614,111. We have not been able to determine where the additional 
> 360,453 wrong records (1,529,974,564 - 1,529,614,111) in Exchange come from.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
