Hi Stu Hood,

Yes, it may not result in extra repair, since the excess bytes of the buffer may be the same on different machines, e.g. all zero bytes. But that depends on how the JDK (ByteArrayOutputStream) allocates memory, so it is a risk across different JDK versions.
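To make the risk concrete, here is a minimal, self-contained sketch; the ExposedBuffer and JunkBytesDemo names are hypothetical, and ExposedBuffer just mimics how Cassandra's DataOutputBuffer exposes the backing array of a ByteArrayOutputStream:

    import java.io.ByteArrayOutputStream;

    // buf and count are protected fields of ByteArrayOutputStream,
    // so a subclass can expose them the way DataOutputBuffer does.
    class ExposedBuffer extends ByteArrayOutputStream
    {
        byte[] getData() { return buf; }    // backing array, usually oversized
        int getLength() { return count; }   // number of bytes actually written
    }

    public class JunkBytesDemo
    {
        public static void main(String[] args)
        {
            ExposedBuffer b = new ExposedBuffer();
            b.write(0xAB);
            b.write(0xCD);
            // The default ByteArrayOutputStream starts with a 32-byte array,
            // so getData() returns 32 bytes although only 2 are valid; the
            // trailing 30 bytes are simply whatever the allocation left there.
            System.out.println(b.getData().length + " bytes allocated, "
                               + b.getLength() + " bytes valid");
        }
    }

Hashing getData() therefore feeds those trailing bytes into the digest as well.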
In fact, we have added a compression feature to cassandra-0.6 in our product, and more than one buffer object is used there; it resulted in a real problem.

Schubert

On Fri, Nov 12, 2010 at 1:31 AM, Stu Hood <stu.h...@rackspace.com> wrote:
> At first glance, this appeared to be a very egregious bug, but the effect
> is actually minimal: since the size of the buffer is deterministic based on
> the size of the data, you will have equal amounts of excess/junk data for
> equal rows. Combined with the fact that 0.6 doesn't reuse these buffers, I
> don't think we're actually doing any extra repair.
>
> The problem is fixed in 0.7, but I've opened CASSANDRA-1729 to fix it in
> 0.6, in case we start reusing row buffers.
>
> Thanks for the report!
> Stu
>
> -----Original Message-----
> From: "Schubert Zhang" <zson...@gmail.com>
> Sent: Thursday, November 11, 2010 2:19am
> To: dev@cassandra.apache.org, u...@cassandra.apache.org
> Subject: MerkleTree.RowHash maybe a bug.
>
> Hi JE,
>
> 0.6.6:
> org.apache.cassandra.service.AntiEntropyService
>
> I found that the rowHash method uses "row.buffer.getData()" directly.
> Since row.buffer.getData() is a byte[], and there may be some junk bytes
> at the end of the buffer, I think we should use the exact length.
>
>     private MerkleTree.RowHash rowHash(CompactedRow row)
>     {
>         validated++;
>         // MerkleTree uses XOR internally, so we want lots of output bits here
>         byte[] rowhash = FBUtilities.hash("SHA-256",
>                                           row.key.key.getBytes(), row.buffer.getData());
>         return new MerkleTree.RowHash(row.key.token, rowhash);
>     }
>
> schubert.zh...@gmail.com
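For what it's worth, a minimal sketch of the fix in the direction suggested above, hashing only the first getLength() bytes (this is an illustration, not the actual CASSANDRA-1729 patch):

    private MerkleTree.RowHash rowHash(CompactedRow row)
    {
        validated++;
        // MerkleTree uses XOR internally, so we want lots of output bits here.
        // Copy only the valid prefix of the backing array so that trailing
        // junk bytes never reach the digest.
        byte[] exact = java.util.Arrays.copyOf(row.buffer.getData(),
                                               row.buffer.getLength());
        byte[] rowhash = FBUtilities.hash("SHA-256",
                                          row.key.key.getBytes(), exact);
        return new MerkleTree.RowHash(row.key.token, rowhash);
    }

With this change the hash depends only on the bytes actually written, so two replicas holding the same row agree regardless of how their buffers were sized or reused.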