Re: Improving system design logging in spark

2016-04-21 Thread Ali Tootoonchian
Hi, My point for #2 is to distinguish how long each task takes to read data from disk from how long it takes to transfer that data over the network to the target node. As far as I know (correct me if I'm wrong), the block time to fetch data includes both the remote node reading the data and transferring it to requested nod

Re: Improving system design logging in spark

2016-04-20 Thread Takeshi Yamamuro
Hi, As for #1 and #2, it seems hard to measure remote and local fetching time separately because they overlap with each other; see `ShuffleBlockFetcherIterator`. IMO the current metric there (the time spent blocked while fetching data from a queue) is good enough for most users because remote fetching could
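
To illustrate the overlap point above, here is a minimal sketch in plain Python. It is not Spark code and does not reproduce `ShuffleBlockFetcherIterator`; the `Block` class, the `simulate_fetch` function, and the assumption that all fetch requests start at time 0 and proceed in parallel are hypothetical, chosen only to show why the time a task spends blocked on the result queue cannot be split into a disk-read part and a network part.

```python
from dataclasses import dataclass


@dataclass
class Block:
    disk_read_ms: int  # time the remote node spends reading the block from disk
    network_ms: int    # time spent transferring the block over the network


def simulate_fetch(blocks, process_ms):
    """Return (blocked_ms, total_disk_ms, total_network_ms) for one consumer task.

    Assumption: every request is issued at t=0 and the fetches run in
    parallel, so a block lands in the queue at disk_read_ms + network_ms.
    """
    arrivals = sorted(b.disk_read_ms + b.network_ms for b in blocks)
    now = 0
    blocked = 0
    for arrival in arrivals:
        if arrival > now:
            # The consumer has nothing to process and waits on the queue.
            blocked += arrival - now
            now = arrival
        # The consumer processes the block while other fetches continue.
        now += process_ms
    total_disk = sum(b.disk_read_ms for b in blocks)
    total_network = sum(b.network_ms for b in blocks)
    return blocked, total_disk, total_network


blocks = [Block(30, 20), Block(10, 15), Block(40, 40)]
blocked, disk, net = simulate_fetch(blocks, process_ms=20)
print(blocked, disk, net)  # prints: 40 80 75
```

In this toy model the task is blocked for 40 ms even though the remote side spent 80 ms on disk reads and 75 ms on network transfer in total. The blocked time alone therefore says nothing about which of the two dominated, which is the distinction being asked for in #2.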

Re: Improving system design logging in spark

2016-04-20 Thread Ted Yu
Interesting. For #3:
bq. reading data from
I guess you meant reading from disk.
On Wed, Apr 20, 2016 at 10:45 AM, atootoonchian wrote:
> Current spark logging mechanism can be improved by adding the following
> parameters. It will help in understanding system bottlenecks and provide
> useful