Re: libhdfs portability

2013-10-28 Thread Colin McCabe
On Mon, Oct 28, 2013 at 4:24 PM, Kyle Sletmoe wrote:
> I have written a WebHDFSClient and I do not believe that reusing
> connections is enough to noticeably speed up transfers in my case. I did
> some tests and on average it took roughly 14 minutes to transfer a 3.6 GB
> file to an HDFS on my loc

Re: libhdfs portability

2013-10-28 Thread Kyle Sletmoe
I have written a WebHDFSClient and I do not believe that reusing connections is enough to noticeably speed up transfers in my case. I did some tests and on average it took roughly 14 minutes to transfer a 3.6 GB file to an HDFS on my local network (I tried the same operation using cURL, with simila
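
For scale, the figures quoted above (3.6 GB in roughly 14 minutes) can be turned into an average throughput. This is only a back-of-the-envelope sketch: the function name is hypothetical, and it assumes 1 GB = 1000 MB, which the message does not specify.

```python
def throughput_mb_per_s(size_gb, minutes):
    """Average throughput in MB/s, assuming 1 GB = 1000 MB (an assumption)."""
    return size_gb * 1000 / (minutes * 60)


# Figures taken from the message: 3.6 GB transferred in about 14 minutes.
rate = throughput_mb_per_s(3.6, 14)
print(f"{rate:.1f} MB/s")  # prints "4.3 MB/s"
```

That is about 4.3 MB/s, or roughly 34 Mbit/s on average.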

Re: libhdfs portability

2013-10-28 Thread Haohui Mai
I believe that the WebHDFS API is your best bet for now. The current implementation of WebHDFSClient does not reuse HTTP connections, which accounts for a large part of the performance penalty. You might want to implement your own version that reuses HTTP connections to see whether it meets your pe
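
The connection reuse suggested above might look roughly like the following. This is a minimal sketch using only Python's standard library, not the WebHDFSClient discussed in the thread: the class and function names are hypothetical, and the host and port in the usage note are placeholders. It keeps one HTTP connection to the NameNode open across several WebHDFS REST calls instead of opening a new one per request.

```python
import http.client
from urllib.parse import quote, urlencode


def webhdfs_path(hdfs_path, op, **params):
    """Build a WebHDFS REST request path such as /webhdfs/v1/tmp?op=LISTSTATUS."""
    query = urlencode({"op": op, **params})
    return "/webhdfs/v1" + quote(hdfs_path) + "?" + query


class ReusingClient:
    """Hypothetical helper that keeps one HTTP connection open across calls."""

    def __init__(self, host, port):
        # The socket is opened lazily on the first request and then kept
        # alive, so later requests skip the TCP connection setup.
        self.conn = http.client.HTTPConnection(host, port)

    def get(self, hdfs_path, op, **params):
        """Issue a GET and return (status, body) without following redirects."""
        self.conn.request("GET", webhdfs_path(hdfs_path, op, **params))
        resp = self.conn.getresponse()
        # The body must be read completely before the connection is reused.
        return resp.status, resp.read()

    def close(self):
        self.conn.close()
```

A caller could create `ReusingClient("namenode.example.com", 50070)` once and issue many `get("/tmp", "LISTSTATUS")` calls over it. Note that WebHDFS answers data operations such as `OPEN` and `CREATE` with a redirect to a DataNode, so a single reused NameNode connection mainly helps metadata calls; bulk transfers would need reuse on the DataNode side as well.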