Hi,

My point for #2 is to distinguish between how long it takes each task to read data from disk and how long it takes to transfer that data over the network to the target node. As far as I know (correct me if I'm wrong), the block time to fetch data includes both the remote node reading the data and the transfer of that data to the requesting node. If the block time is larger than we expect, then from a system design perspective we cannot identify which component is the weakest link: storage or network.
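To make the distinction concrete, here is a minimal sketch in Python of the breakdown I have in mind. It assumes the remote disk read and the network transfer could be timed and logged separately. The names `FetchTiming` and `summarize` are hypothetical and are not existing Spark metrics or APIs.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Union


@dataclass
class FetchTiming:
    """Hypothetical per-block timing record (not an existing Spark metric)."""
    disk_read_ms: float  # time the remote node spent reading the block from disk
    network_ms: float    # time spent transferring the block to the requesting node

    @property
    def total_ms(self) -> float:
        # This sum is what a single combined "block time" would report.
        return self.disk_read_ms + self.network_ms

    def bottleneck(self) -> str:
        # With the two parts logged separately, the slower one can be named.
        return "storage" if self.disk_read_ms >= self.network_ms else "network"


def summarize(timings: Iterable[FetchTiming]) -> Dict[str, Union[float, str]]:
    """Aggregate per-block timings for a task and report the dominant component."""
    timings = list(timings)
    disk = sum(t.disk_read_ms for t in timings)
    net = sum(t.network_ms for t in timings)
    return {
        "disk_read_ms": disk,
        "network_ms": net,
        "total_ms": disk + net,
        "bottleneck": "storage" if disk >= net else "network",
    }
```

With only `total_ms` in the log, a block that took 120 ms on disk and 30 ms on the network looks the same as one that took 30 ms on disk and 120 ms on the network. Logging both fields is what lets us tell them apart.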