Hi Radim,

The HDFS write path is currently CPU-intensive for several reasons:
1) It doesn't yet use the native CRC code
2) It makes several unnecessary copies and byte buffer allocations, both in
the client and in the DataNode
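
As a rough illustration of the two points above (this is not the actual
HDFS code), the sketch below checksums a packet chunk by chunk directly from
the caller's buffer, using offset/length arguments instead of copying each
chunk into a fresh array. The class name ChunkedChecksum and the method
checksumChunks are hypothetical, java.util.zip.CRC32 is only a stand-in for
whatever CRC implementation the write path uses, and the 512-byte chunk size
is assumed to mirror the HDFS bytes-per-checksum default.

```java
import java.util.zip.CRC32;

public class ChunkedChecksum {
    // Assumed chunk size, mirroring the HDFS bytes-per-checksum default.
    static final int BYTES_PER_CHECKSUM = 512;

    // Hypothetical helper: one CRC32 per chunk of data[off, off + len),
    // read in place so no per-chunk byte[] is allocated or copied.
    static int[] checksumChunks(byte[] data, int off, int len) {
        int chunks = (len + BYTES_PER_CHECKSUM - 1) / BYTES_PER_CHECKSUM;
        int[] sums = new int[chunks];
        CRC32 crc = new CRC32(); // reused across chunks
        for (int i = 0; i < chunks; i++) {
            int start = off + i * BYTES_PER_CHECKSUM;
            int count = Math.min(BYTES_PER_CHECKSUM, off + len - start);
            crc.reset();
            crc.update(data, start, count);
            sums[i] = (int) crc.getValue(); // low 32 bits hold the CRC
        }
        return sums;
    }

    public static void main(String[] args) {
        byte[] packet = new byte[64 * 1024]; // illustrative packet size
        int[] sums = checksumChunks(packet, 0, packet.length);
        System.out.println(sums.length + " checksums computed");
    }
}
```

Timing checksumChunks in a loop against a variant that copies each chunk
first (for example with Arrays.copyOfRange) is one simple way to see how
much the extra copies and allocations cost on a given machine.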

There are open JIRAs for both issues, and I have a preliminary patch that
helped a lot, but it hasn't been a high priority: on most clusters, writes
become network-bound before they become CPU-bound. On the other hand, as
10GbE becomes fairly common, this will probably matter more soon. I'm hoping
to find time to finish the patches in the next few months.

-Todd

On Sun, Nov 25, 2012 at 1:41 PM, Radim Kolar <h...@filez.com> wrote:

> anybody tried to profile why HDFS write path is so much CPU intensive?
>



-- 
Todd Lipcon
Software Engineer, Cloudera
