I have an application that uses GraphX, and some stages are very slow.

Stage Id  Description                                      Submitted            Duration  Tasks: Succeeded/Total  Shuffle Read  Shuffle Write
282       reduce at VertexRDD.scala:91                     2014/04/28 14:07:13  5.20 h    100/100                 3.8 MB
419       zipPartitions at ReplicatedVertexView.scala:101  2014/04/28 22:18:37  5.14 h    100/100                 71.3 KB       4.5 MB

Inside the slow stage, the task info looks like this:

94  5758  SUCCESS  PROCESS_LOCAL  BP-YZH-2-5971.360buy.com  2014/04/28 14:07:13  54 ms            37.7 KB
71  5759  SUCCESS  PROCESS_LOCAL  BP-YZH-2-5978.360buy.com  2014/04/28 14:07:13  15 ms            38.7 KB
14  5760  SUCCESS  PROCESS_LOCAL  BP-YZH-2-5977.360buy.com  2014/04/28 14:07:16  585 ms           38.6 KB
91  5761  SUCCESS  PROCESS_LOCAL  BP-YZH-2-5977.360buy.com  2014/04/28 14:07:16  209 ms           38.3 KB
53  5762  SUCCESS  NODE_LOCAL     BP-YZH-2-5977.360buy.com  2014/04/28 14:07:19  5.20 h   40.8 s  39.6 KB

And in the log of the slow task I can see:

14/04/29 09:30:10 INFO storage.BlockFetcherIterator$BasicBlockFetcherIterator: maxBytesInFlight: 50331648, minRequest: 10066329
14/04/29 09:30:10 INFO storage.BlockFetcherIterator$BasicBlockFetcherIterator: maxBytesInFlight: 50331648, minRequest: 10066329
14/04/29 09:30:10 INFO storage.BlockFetcherIterator$BasicBlockFetcherIterator: maxBytesInFlight: 50331648, minRequest: 10066329
14/04/29 09:30:10 INFO storage.BlockFetcherIterator$BasicBlockFetcherIterator: Getting 100 non-zero-bytes blocks out of 100 blocks
14/04/29 09:30:10 INFO storage.BlockFetcherIterator$BasicBlockFetcherIterator: Started 45 remote gets in 2 ms

Why does this happen, and how can I solve it? Many thanks!