[jira] [Created] (HDFS-5365) Fix native library compile error on FreeBSD9

2013-10-15 Thread Radim Kolar (JIRA)
Radim Kolar created HDFS-5365: Summary: Fix native library compile error on FreeBSD9; Key: HDFS-5365; URL: https://issues.apache.org/jira/browse/HDFS-5365; Project: Hadoop HDFS; Issue Type: Bug

Re: profiling hdfs write path

2012-12-07 Thread Radim Kolar
I'm not saying that the Hadoop process is perfect, far from it, but from where I sit (like you, I'm a contributor but not yet a committer) it seems to be working OK so far for both you and me. It does not work OK for me. It's way too slow. I got just 2k LOC committed and still have patches floating around…

Re: profiling hdfs write path

2012-12-05 Thread Radim Kolar
YARN-223 YARN-211 YARN-210 MAPREDUCE-4839 MAPREDUCE-4827

Re: profiling hdfs write path

2012-12-04 Thread Radim Kolar
Agree. Want to write some? It's not about writing patches, it's about getting them committed. In my experience, getting something committed takes months, even for a simple patch. I have about 10 patches floating around; none of them was committed in the last 4 weeks. They are really simple stuff. I…

Re: profiling hdfs write path

2012-12-04 Thread Radim Kolar
If you're just going to insult us, please stay away. We don't need your help unless you're going to be constructive. Good unit tests will catch code modifications like, from: long getLastByteOffsetBlock() { return lastByteOffsetInBlock; } to: long getLastByteOffsetBlo…
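A minimal sketch of the kind of getter check meant here, using plain assertions. The enclosing class name `Packet` and the 4096 value are made up for illustration; only the getter and field names come from the snippet above.

```java
// Hypothetical container for the field from the snippet; not the real HDFS class.
class Packet {
    private final long lastByteOffsetInBlock;

    Packet(long lastByteOffsetInBlock) {
        this.lastByteOffsetInBlock = lastByteOffsetInBlock;
    }

    long getLastByteOffsetBlock() {
        return lastByteOffsetInBlock;
    }
}

public class LastByteOffsetTest {
    public static void main(String[] args) {
        Packet p = new Packet(4096L);
        // A mutated getter (returning 0, or a different field) fails this check.
        if (p.getLastByteOffsetBlock() != 4096L) {
            throw new AssertionError("getter no longer returns lastByteOffsetInBlock");
        }
        System.out.println("ok");
    }
}
```

Even a test this trivial pins the getter to its backing field, which is exactly the class of silent modification the post is worried about.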

Re: profiling hdfs write path

2012-12-04 Thread Radim Kolar
It is definitely buggy, it might not actually be faster, and it probably isn't well commented. But feel free to have a go at it. Thank you for your code; I got it merged with trunk. HDFS is crap code: private methods are not documented at all, and the unit tests are a joke. I did some random code chang…

Re: profiling hdfs write path

2012-11-29 Thread Radim Kolar
> Hoping to find time to get back to finishing the patches in the next few months. Todd, just attach these patches to JIRA; they do not even need to apply cleanly to trunk. I will get them finished within a day. I do not have months to spare waiting for the work to be done by you. If you…

[jira] [Created] (HDFS-4225) Improve HDFS write performance

2012-11-26 Thread Radim Kolar (JIRA)
Radim Kolar created HDFS-4225: Summary: Improve HDFS write performance; Key: HDFS-4225; URL: https://issues.apache.org/jira/browse/HDFS-4225; Project: Hadoop HDFS; Issue Type: Task

Re: profiling hdfs write path

2012-11-25 Thread Radim Kolar
Currently it's CPU-intensive for several reasons: 1) It doesn't yet use the native CRC code. 2) It makes several unnecessary copies and byte buffer allocations, both in the client and in the DataNode. There are open JIRAs for these, and I have a preliminary patch which helped a lot, but it hasn't…
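The second point (unnecessary copies and per-packet allocations) can be illustrated with a small sketch. The class and method names below are invented for illustration and this is not the actual DFSOutputStream code; it only shows the buffer-reuse pattern that removes per-packet allocation and GC cost.

```java
import java.nio.ByteBuffer;

// Sketch of the allocation pattern at issue: reuse one buffer per writer
// instead of allocating a fresh byte[] for every packet sent to the pipeline.
public class PacketBufferReuse {
    static final int PACKET_SIZE = 64 * 1024;

    // One reusable buffer per writer, instead of `new byte[PACKET_SIZE]` per packet.
    private final ByteBuffer packetBuf = ByteBuffer.allocate(PACKET_SIZE);

    long writePackets(byte[] data) {
        long packets = 0;
        int off = 0;
        while (off < data.length) {
            packetBuf.clear();
            int n = Math.min(PACKET_SIZE, data.length - off);
            packetBuf.put(data, off, n);   // single copy into the reused buffer
            packetBuf.flip();
            // ... a real client would checksum packetBuf here and hand it to the sender ...
            off += n;
            packets++;
        }
        return packets;
    }

    public static void main(String[] args) {
        long p = new PacketBufferReuse().writePackets(new byte[200 * 1024]);
        System.out.println(p); // 200 KiB split into 64 KiB packets -> 4
    }
}
```

The same idea applies on the DataNode side: every avoided `new byte[...]` per packet is CPU the profiler stops showing in allocation and copy paths.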

Re: murmur3 instead of crc32

2012-11-25 Thread Radim Kolar
It's not that big a speed difference in this test: http://www.strchr.com/hash_functions#results. The asm version of CRC32 on an i5 is fastest, but Java 8 switched to murmur3 for hashing strings; I didn't get why they use it instead of java.util.zip.CRC32. The collisions seem to be about the same.
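The two functions being compared can be exercised side by side. `java.util.zip.CRC32` is stdlib; the murmur3 x86_32 below is a reference-style reimplementation written for this sketch (the class and method names are made up; this is not a Hadoop helper), so verify it against a known implementation before relying on it.

```java
import java.util.zip.CRC32;

public class HashCompare {
    // Reference-style MurmurHash3 x86_32 (reimplemented here for illustration).
    static int murmur3_32(byte[] data, int seed) {
        final int c1 = 0xcc9e2d51, c2 = 0x1b873593;
        int h = seed, i = 0;
        for (; i + 4 <= data.length; i += 4) {
            int k = (data[i] & 0xff) | ((data[i + 1] & 0xff) << 8)
                  | ((data[i + 2] & 0xff) << 16) | ((data[i + 3] & 0xff) << 24);
            k *= c1; k = Integer.rotateLeft(k, 15); k *= c2;
            h ^= k; h = Integer.rotateLeft(h, 13); h = h * 5 + 0xe6546b64;
        }
        int k = 0;
        switch (data.length & 3) {          // tail bytes; fall-through is deliberate
            case 3: k ^= (data[i + 2] & 0xff) << 16;
            case 2: k ^= (data[i + 1] & 0xff) << 8;
            case 1: k ^= data[i] & 0xff;
                    k *= c1; k = Integer.rotateLeft(k, 15); k *= c2; h ^= k;
        }
        h ^= data.length;                   // finalization mix
        h ^= h >>> 16; h *= 0x85ebca6b; h ^= h >>> 13; h *= 0xc2b2ae35; h ^= h >>> 16;
        return h;
    }

    static long crc32(byte[] data) {
        CRC32 c = new CRC32();
        c.update(data, 0, data.length);
        return c.getValue();
    }

    public static void main(String[] args) {
        byte[] msg = "hello".getBytes();
        System.out.printf("crc32=%08x murmur3=%08x%n", crc32(msg), murmur3_32(msg, 0));
    }
}
```

Note that CRC32 is an error-detecting code with guaranteed burst-error detection, while murmur3 only aims for good hash distribution; "about the same collisions" on random data does not make them interchangeable for data-integrity checksums.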

murmur3 instead of crc32

2012-11-25 Thread Radim Kolar
I just tested the C version, and murmur3 32_le is about 4 times faster than CRC32. I submitted murmur3 hash support yesterday. What does it take to change the checksum method? Does the .metadata record which hash type is used inside?
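On the metadata question: the DataNode's block .meta file does begin with a small header recording the checksum type. A minimal parser sketch follows, assuming a simplified layout of a 2-byte version, a 1-byte checksum-type id, and a 4-byte bytesPerChecksum (the class and field names are my own; check the actual BlockMetadataHeader/DataChecksum code before relying on the exact layout).

```java
import java.io.*;

// Hedged sketch of the .meta header layout; field order is an assumption.
public class MetaHeader {
    final short version;
    final byte checksumType;      // in Hadoop's enum: 0 = NULL, 1 = CRC32, 2 = CRC32C
    final int bytesPerChecksum;

    MetaHeader(short v, byte t, int b) { version = v; checksumType = t; bytesPerChecksum = b; }

    static MetaHeader read(InputStream in) throws IOException {
        DataInputStream d = new DataInputStream(in);
        return new MetaHeader(d.readShort(), d.readByte(), d.readInt());
    }

    public static void main(String[] args) throws IOException {
        // Synthetic header: version 1, type 2 (CRC32C), 512 bytes per checksum.
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bos);
        out.writeShort(1); out.writeByte(2); out.writeInt(512);
        MetaHeader h = read(new ByteArrayInputStream(bos.toByteArray()));
        System.out.println(h.version + " " + h.checksumType + " " + h.bytesPerChecksum);
    }
}
```

Because the type id is stored per block, adding a new checksum algorithm means allocating a new id and teaching both writer and reader about it; old blocks keep their recorded type.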

profiling hdfs write path

2012-11-25 Thread Radim Kolar
Has anybody tried to profile why the HDFS write path is so CPU-intensive?

Re: Is it possible to read a corrupted Sequence File?

2012-11-24 Thread Radim Kolar
Also, if you are going to seek expert help, disconnect the drives with the disappearing data; it will still be there in good shape as long as it is not overwritten. And of course, do not delete any fsimage from the namenode.

Re: Is it possible to read a corrupted Sequence File?

2012-11-24 Thread Radim Kolar
Could you please provide a little more detail? It's a low-level task; repairing seq files with a missing header is not so easy. But to date we have been able to repair pretty much everything people needed, including corrupted HBase metadata. But it's a few days of work, which is far beyond what you can get fo…

Re: Is it possible to read a corrupted Sequence File?

2012-11-23 Thread Radim Kolar
I wonder if I can read these corrupted SequenceFiles with the missing blocks skipped? It's possible to recover the existing blocks and repair the seq file structure.
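The repair approach relies on SequenceFile's periodic sync markers: after hitting a corrupt or missing region, scan forward for the next marker and resume parsing there. A simplified sketch of that scan, using a 4-byte stand-in marker instead of the real 16-byte one and no real SequenceFile framing:

```java
public class SyncScan {
    /** Index of the first occurrence of `sync` in `data` at or after `from`, or -1. */
    static int nextSync(byte[] data, byte[] sync, int from) {
        outer:
        for (int i = Math.max(from, 0); i <= data.length - sync.length; i++) {
            for (int j = 0; j < sync.length; j++) {
                if (data[i + j] != sync[j]) continue outer;
            }
            return i;
        }
        return -1;
    }

    public static void main(String[] args) {
        byte[] sync = {9, 9, 9, 9};              // stand-in for the 16-byte marker
        byte[] file = {1, 2, 9, 9, 9, 9, 5, 6};  // corrupt bytes, then sync, then data
        int at = nextSync(file, sync, 0);
        System.out.println(at);                   // 2: resume reading at file[at + 4]
    }
}
```

Records between the corruption and the next sync point are lost, which is why recovery can skip damaged regions but not reconstruct them.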

Re: [DISCUSS] Increasing the default block size

2012-10-16 Thread Radim Kolar
On 16.10.2012 8:23, Uma Maheswara Rao G wrote: +1 for increasing it to a higher value by default (128/256MB). We use 128 MB.
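For reference, a 128 MB block size is set in hdfs-site.xml via the `dfs.blocksize` property (the Hadoop 2.x name; older releases used `dfs.block.size`):

```xml
<property>
  <name>dfs.blocksize</name>
  <!-- 128 MB = 134217728 bytes -->
  <value>134217728</value>
</property>
```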