At first glance this looked like an egregious bug, but the effect is
actually minimal: since the size of the buffer is a deterministic function
of the size of the data, equal rows end up with equal amounts of
excess/junk data. Combined with the fact that 0.6 doesn't reuse these
buffers (so the junk bytes are freshly allocated, zero-filled memory, and
thus identical for equal rows), I don't think we're actually doing any
extra repair.
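
To make that concrete, here's a toy double-on-overflow buffer (a sketch
only: the initial capacity and growth policy are made up, not necessarily
what our Buffer does). The backing array's length depends only on how many
bytes were written, and Java zero-fills fresh arrays, so identical rows
yield identical backing arrays, junk tail included:

        // Toy model of a growable buffer: capacity is a pure function of
        // the number of bytes written, and the unused tail is zero-filled.
        static byte[] backingArrayAfterWriting(byte[] data)
        {
            byte[] buf = new byte[16]; // made-up initial capacity
            int len = 0;
            for (byte b : data)
            {
                if (len == buf.length)
                    buf = java.util.Arrays.copyOf(buf, buf.length * 2); // zero-fills the extension
                buf[len++] = b;
            }
            return buf; // same data.length => same buf.length, same zero tail
        }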

The problem is fixed in 0.7, but I've opened CASSANDRA-1729 to fix it in 0.6, 
in case we start reusing row buffers.

Thanks for the report!
Stu

-----Original Message-----
From: "Schubert Zhang" <zson...@gmail.com>
Sent: Thursday, November 11, 2010 2:19am
To: dev@cassandra.apache.org, u...@cassandra.apache.org
Subject: MerkleTree.RowHash maybe a bug.

Hi JE,

0.6.6:
org.apache.cassandra.service.AntiEntropyService

I found that the rowHash method uses "row.buffer.getData()" directly.
Since row.buffer.getData() returns the buffer's whole backing byte[],
which may contain some junk bytes at the end, I think we should use the
exact length instead.

        private MerkleTree.RowHash rowHash(CompactedRow row)
        {
            validated++;
            // MerkleTree uses XOR internally, so we want lots of output bits here
            byte[] rowhash = FBUtilities.hash("SHA-256", row.key.key.getBytes(), row.buffer.getData());
            return new MerkleTree.RowHash(row.key.token, rowhash);
        }
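
Something like this is what I have in mind (just a sketch; I'm assuming
row.buffer exposes getLength() for the count of valid bytes, as the
Hadoop-style DataOutputBuffer does):

        private MerkleTree.RowHash rowHash(CompactedRow row)
        {
            validated++;
            // MerkleTree uses XOR internally, so we want lots of output bits here.
            // Copy only the valid prefix of the backing array so that any
            // junk bytes past getLength() never reach the digest.
            byte[] rowData = java.util.Arrays.copyOf(row.buffer.getData(),
                                                     row.buffer.getLength());
            byte[] rowhash = FBUtilities.hash("SHA-256", row.key.key.getBytes(), rowData);
            return new MerkleTree.RowHash(row.key.token, rowhash);
        }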


schubert.zh...@gmail.com

