Just realized I had been replying only to Takeshi.
Thanks for the tip, it got me on the right track. I'm running into an issue with
private[spark] methods though. It looks like the input metrics start out as
None and are never initialized (verified by throwing a new Exception in the
pattern match cases).
How about using `SparkListener`?
You can collect IO statistics through TaskMetrics#inputMetrics yourself.
// maropu
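The SparkListener approach above could be sketched roughly like this. This is a minimal, hedged example assuming the Spark 2.x API (where `taskMetrics.inputMetrics` is no longer an `Option`); the listener and variable names are my own, not from the thread:

```scala
import org.apache.spark.scheduler.{SparkListener, SparkListenerTaskEnd}

// Sketch: accumulate bytes/records read across all finished tasks
// via the public SparkListener callback. Not thread-safe beyond
// simple monitoring use; adjust for production needs.
class InputMetricsListener extends SparkListener {
  @volatile var totalBytesRead: Long = 0L
  @volatile var totalRecordsRead: Long = 0L

  override def onTaskEnd(taskEnd: SparkListenerTaskEnd): Unit = {
    // taskMetrics can be null if the task failed before reporting
    val metrics = taskEnd.taskMetrics
    if (metrics != null) {
      totalBytesRead += metrics.inputMetrics.bytesRead
      totalRecordsRead += metrics.inputMetrics.recordsRead
    }
  }
}

// Registration on an existing SparkContext (sc assumed):
// sc.addSparkListener(new InputMetricsListener())
```

Note this only reads metrics Spark already tracks; getting a custom RDD to *populate* inputMetrics is a separate problem, since those setters are private[spark].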
On Mon, Jul 4, 2016 at 11:46 AM, Pedro Rodriguez wrote:
> Hi All,
>
> I noticed on some Spark jobs it shows you input/output read size. I am
> implementing a custom RDD which read
Hi All,
I noticed that some Spark jobs show input/output read sizes. I am
implementing a custom RDD which reads files, and I would like to report these
metrics to Spark since they are available to me.
I looked through the RDD source code and a couple of different implementations, and
the best I