So the problem is that 99 tasks are fast (< 1 second) but 1 task is really slow (5+ hours), is that right? And your operation is graph.vertices.count? That is odd, but since Spark evaluates transformations lazily, this job may also be running the earlier transformations that built the graph. How did you construct the graph?
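If it helps to narrow this down, here is a rough sketch of how you might separate the cost of building the graph from the cost of the count itself. It assumes you already have a GraphX `Graph` in hand; `GraphTiming` and `timedVertexCount` are names made up for this example, not part of Spark.

```scala
import org.apache.spark.graphx.Graph

// Hypothetical helper, not part of GraphX.
object GraphTiming {
  /** Materializes the graph first, then times only vertices.count().
    * Returns (vertex count, seconds spent in the timed count). */
  def timedVertexCount[VD, ED](graph: Graph[VD, ED]): (Long, Double) = {
    // Cache the graph and force evaluation, so that the cost of any
    // upstream transformations is paid here rather than in the timed call.
    graph.cache()
    graph.vertices.count()
    graph.edges.count()

    // With the graph cached, this should mostly measure the count itself.
    val start = System.nanoTime()
    val n = graph.vertices.count()
    val seconds = (System.nanoTime() - start) / 1e9
    (n, seconds)
  }
}
```

If the warm-up calls are slow but the timed count is fast, the time is probably going into constructing the graph rather than into the count. This assumes the cached graph fits in memory; otherwise partitions may be recomputed and the timing will be less clear-cut.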
On Tue, Apr 29, 2014 at 3:41 AM, gogototo <wangbi...@gmail.com> wrote:

> I have an application using GraphX, and some phase is very slow.
>
> Stage Id  Description  Submitted  Duration  Tasks: Succeeded/Total  Shuffle Read  Shuffle Write
> 282  reduce at VertexRDD.scala:91  2014/04/28 14:07:13  5.20 h  100/100  3.8 MB
> 419  zipPartitions at ReplicatedVertexView.scala:101  2014/04/28 22:18:37  5.14 h  100/100  71.3 KB  4.5 MB
>
> In it, you can see task info as below:
>
> 94  5758  SUCCESS  PROCESS_LOCAL  BP-YZH-2-5971.360buy.com  2014/04/28 14:07:13  54 ms  37.7 KB
> 71  5759  SUCCESS  PROCESS_LOCAL  BP-YZH-2-5978.360buy.com  2014/04/28 14:07:13  15 ms  38.7 KB
> 14  5760  SUCCESS  PROCESS_LOCAL  BP-YZH-2-5977.360buy.com  2014/04/28 14:07:16  585 ms  38.6 KB
> 91  5761  SUCCESS  PROCESS_LOCAL  BP-YZH-2-5977.360buy.com  2014/04/28 14:07:16  209 ms  38.3 KB
> 53  5762  SUCCESS  NODE_LOCAL  BP-YZH-2-5977.360buy.com  2014/04/28 14:07:19  5.20 h  40.8 s  39.6 KB
>
> And in the slow task, I can see this log:
>
> 14/04/29 09:30:10 INFO storage.BlockFetcherIterator$BasicBlockFetcherIterator: maxBytesInFlight: 50331648, minRequest: 10066329
> 14/04/29 09:30:10 INFO storage.BlockFetcherIterator$BasicBlockFetcherIterator: maxBytesInFlight: 50331648, minRequest: 10066329
> 14/04/29 09:30:10 INFO storage.BlockFetcherIterator$BasicBlockFetcherIterator: maxBytesInFlight: 50331648, minRequest: 10066329
> 14/04/29 09:30:10 INFO storage.BlockFetcherIterator$BasicBlockFetcherIterator: Getting 100 non-zero-bytes blocks out of 100 blocks
> 14/04/29 09:30:10 INFO storage.BlockFetcherIterator$BasicBlockFetcherIterator: Started 45 remote gets in 2 ms
>
> Why is this? How can I solve it? Many thanks!
>
> --
> View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Shuffle-phase-is-very-slow-any-help-thx-tp5004.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.