> On 27 Apr 2016, at 04:59, Takeshi Yamamuro <linguin....@gmail.com> wrote:
> 
> Hi, all
> 
> See SPARK-1529 for related discussion.
> 
> // maropu


I'd not seen that discussion.

I'm actually curious about why there's a 15% difference in performance between
Java NIO and the Hadoop FS APIs and, if it is the case that Hadoop still uses
the pre-NIO libraries, *has anyone thought of just fixing the Hadoop Local FS
codepath?*

It's not as if nobody has filed JIRAs on that ... it's just that nothing has
ever got to a state where it was considered ready to adopt, where "ready"
means: passes all unit and load tests against Linux, Unix, Windows filesystems.
There have been some attempts, but they never got much engagement or
support, especially as NIO wasn't there properly until Java 7, and Hadoop was
stuck on Java 6 support until 2015. That's no longer a constraint: someone
could do the work, using the existing JIRAs as starting points.


If someone did do this in RawLocalFS, it'd be nice if the patch also allowed 
you to turn off CRC creation and checking. 

That's not only part of the overhead: it also means that flush() doesn't
actually flush, not until you reach the end of a CRC32 block ... which breaks
what few durability guarantees POSIX offers.
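
To make the flush() point concrete, here's a toy model of a stream that only
hands complete checksum chunks to the underlying file. It is a sketch of the
behaviour described above, not Hadoop's code: the class name
ChunkBufferedStream and the 512-byte chunk size are assumptions for
illustration, and the CRC is computed but not stored anywhere.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.zip.CRC32;

// Hypothetical illustration: data is held back until a full checksum chunk
// is available, so flush() cannot push a partial chunk downstream.
final class ChunkBufferedStream extends OutputStream {
    private final OutputStream sink;
    private final byte[] chunk;
    private final CRC32 crc = new CRC32();
    private int filled = 0;

    ChunkBufferedStream(OutputStream sink, int bytesPerChecksum) {
        this.sink = sink;
        this.chunk = new byte[bytesPerChecksum];
    }

    @Override
    public void write(int b) throws IOException {
        chunk[filled++] = (byte) b;
        if (filled == chunk.length) {
            emitChunk();
        }
    }

    // Checksum the buffered bytes and pass them on. A real implementation
    // would also persist the CRC value somewhere; this sketch discards it.
    private void emitChunk() throws IOException {
        crc.reset();
        crc.update(chunk, 0, filled);
        sink.write(chunk, 0, filled);
        filled = 0;
    }

    @Override
    public void flush() throws IOException {
        // Only complete chunks have reached the sink; the partial chunk
        // stays in memory, so it is not durable after this call.
        sink.flush();
    }

    @Override
    public void close() throws IOException {
        if (filled > 0) {
            emitChunk(); // the trailing partial chunk is written only here
        }
        sink.close();
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        ChunkBufferedStream out = new ChunkBufferedStream(sink, 512);
        out.write(new byte[700]);
        out.flush();
        System.out.println(sink.size()); // prints 512: 188 bytes still buffered
        out.close();
        System.out.println(sink.size()); // prints 700
    }
}
```

With checksumming switched off there'd be no chunk boundary to wait for, and
flush() could hand everything written so far to the OS.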


