Hi, I'm interested in adding read time (from HDFS) to Task Metrics. The motivation is to help debug performance issues. After some digging, I found it's briefly mentioned in SPARK-1683 that this feature didn't make it in because the metric collection caused a performance regression [1].
I'd like to try tackling this, but I would be very grateful if those with experience could give some more information on what was attempted previously and why it didn't work, or whether there are philosophical objections to these metrics. If you feel this is a dead end, please save me from myself.

Thank you,
Brian

[1] https://github.com/apache/spark/pull/962