Re: Is Hadoop SequenceFile binary safe?

2013-05-02 Thread Hs
Chris Douglas
> You're not missing anything, but the probability of a 16 (thought it was 20?) byte collision with random bytes is vanishingly small. -C
>
> On Sat, Apr 27, 2013 at 4:30 AM, Hs wrote:
> > Hi,
> >
> > I am learning hadoop. I read the SequenceFile.j
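
To put "vanishingly small" in rough numbers, here is a back-of-the-envelope sketch. It assumes the payload bytes are uniformly random (real data usually is not) and a 16-byte marker; the function name is made up for illustration and is not part of Hadoop.

```python
from fractions import Fraction

MARKER_BYTES = 16  # sync marker length discussed in the thread


def expected_false_matches(data_bytes: int, marker_bytes: int = MARKER_BYTES) -> Fraction:
    """Expected number of offsets at which uniformly random data equals a fixed marker.

    Assumption: every byte is independent and uniformly distributed.
    """
    # Number of positions where a marker-sized window fits in the data.
    windows = max(data_bytes - marker_bytes + 1, 0)
    # Each window matches with probability 1 / 256**marker_bytes.
    return Fraction(windows, 256 ** marker_bytes)


one_pib = 2 ** 50  # one pebibyte of random payload
estimate = expected_false_matches(one_pib)
print(float(estimate))
```

Even for a pebibyte of random data the expected number of accidental matches stays far below one, which is the point of the reply. Data that is not random, for example a file that embeds another SequenceFile's marker, is outside this estimate.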

Is Hadoop SequenceFile binary safe?

2013-04-27 Thread Hs
Hi, I am learning hadoop. I read SequenceFile.java in the hadoop-1.0.4 source code, and I found the sync(long position) method, which is used to find a "sync marker" (a 16-byte MD5 hash generated at file creation time) in a SequenceFile when splitting it into splits in MapReduce. /** See
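
The idea of seeking to the next sync marker can be sketched as follows. This is only an illustration of the concept, not Hadoop's implementation: `make_sync_marker` and `find_sync` are hypothetical names, the marker is derived here from random bytes as an assumption, and the whole file is treated as an in-memory byte string.

```python
import hashlib
import os

SYNC_SIZE = 16  # length of an MD5 digest in bytes


def make_sync_marker() -> bytes:
    # Assumption for illustration: hash some random bytes to get a 16-byte marker.
    return hashlib.md5(os.urandom(32)).digest()


def find_sync(data: bytes, marker: bytes, position: int) -> int:
    """Return the offset just past the first marker found at or after `position`.

    If no marker is found, return len(data), i.e. the end of the data.
    """
    idx = data.find(marker, position)
    if idx < 0:
        return len(data)
    return idx + len(marker)


marker = make_sync_marker()
data = b"record-1" + marker + b"record-2"
print(find_sync(data, marker, 0))
```

A reader starting at an arbitrary split offset would call something like `find_sync` to skip to the next record boundary. The binary-safety question in this thread is what happens if the record payload itself contains the same 16 bytes.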

Re: Is it possible to read a corrupted Sequence File?

2012-11-23 Thread Hs
Could you please provide a little more detail? Should I run "hadoop fsck / -move" first to move broken files into /lost+found and then repair them? Or can I repair them directly in the current path? Thanks! 2012/11/24 Radim Kolar > > I wonder if I can read these corrupted SequenceFiles with mi

Is it possible to read a corrupted Sequence File?

2012-11-23 Thread Hs
Hi, I am running hadoop 1.0.3 and hbase-0.94.0 on a 12-node cluster. Due to unknown operational faults, 6 datanodes have suffered a complete data loss (hdfs data directory gone). When I restart hadoop, it reports "*The ratio of reported blocks 0.8252*". I have a folder in hdfs containing many import