Reynold, you're totally right, as discussed offline -- I didn't think about
the limit use case when I wrote this. Sandy, is it easy to fix this as
part of your patch to use StatisticsData? If not, I can fix it in a
separate patch.
On Sat, Jul 26, 2014 at 12:12 PM, Reynold Xin wrote:
That makes sense, Sandy.
When you add the patch, can you make sure you comment inline on why the
fallback is needed?
On Sat, Jul 26, 2014 at 11:46 AM, Sandy Ryza wrote:
I'm working on a patch that switches this stuff out with the Hadoop
FileSystem StatisticsData, which will both give an accurate count and allow
us to get metrics while the task is in progress. A hitch is that it relies
on https://issues.apache.org/jira/browse/HADOOP-10688, so we still might
want a
There is one piece of information that'd be useful to know, which is the
source of the input. Even in the presence of an IOException, the input
metrics still specify that the task is reading from Hadoop.
However, I'm slightly confused by this -- I think usually we'd want to
report the number of bytes