Cassandra 1.2 beta in production
Hi guys,

Are there any known critical bugs that would prevent using 1.2 beta 1 in production? We don't use CQL or secondary indexes.

--
Best regards,
Zotov Alexey
Grid Dynamics
Skype: azotcsit
unnecessary tombstone transmission during repair process
Hi guys,

I have a question about Merkle tree construction and the repair process. When the Merkle tree is built, a hash is calculated for each column. For a DeletedColumn the hash is calculated over its value, and the value of a DeletedColumn is the serialized local deletion time. We know that the local deletion time can differ between nodes for the same tombstone, so the hashes of the same tombstone on different nodes will differ. Is that right? I think the local deletion time should not be included in the hash calculation.

We ran a couple of tests on a 3-node cluster with RF=2 and CL=QUORUM, so we have strong consistency:

1. Populate data to all nodes, then run repair. No streams were transmitted. That is the expected behaviour.
2. Remove some columns from some rows. No nodes were down and all writes completed successfully. We ran repair and there were some streams, which is strange to me because all the data should be consistent.

We then wrote a small patch and applied it:

1. The result of the first test is the same.
2. In the second test there were no unnecessary streams, as I expected.

My question is: is transmission of identical tombstones during the repair process a feature :) or a bug? If it's a bug, I'll create a ticket and attach the patch to it.
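To make the effect concrete, here is a minimal standalone sketch (plain Java, not Cassandra code; the column name and the timestamps are invented) of why hashing the tombstone's value, i.e. its serialized local deletion time, makes two replicas produce different digests even though they hold the same tombstone:

import java.nio.ByteBuffer;
import java.security.MessageDigest;

public class TombstoneDigestSketch
{
    // Hash a tombstone roughly the way a column is hashed for the Merkle tree:
    // column name, column value, then the timestamp. For a DeletedColumn the
    // value is the serialized local deletion time (4 bytes).
    static byte[] digest(ByteBuffer name, long timestamp, int localDeletionTime) throws Exception
    {
        MessageDigest md = MessageDigest.getInstance("MD5");
        md.update(name.duplicate());
        md.update(ByteBuffer.allocate(4).putInt(0, localDeletionTime)); // the tombstone's "value"
        md.update(ByteBuffer.allocate(8).putLong(0, timestamp));
        return md.digest();
    }

    public static void main(String[] args) throws Exception
    {
        ByteBuffer name = ByteBuffer.wrap("col1".getBytes());
        long ts = 1349000000000L;                 // same client-supplied timestamp on both replicas
        byte[] a = digest(name, ts, 1349000000);  // replica A applied the delete at second X
        byte[] b = digest(name, ts, 1349000002);  // replica B applied it two seconds later
        // prints false: the same logical tombstone hashes differently, so the Merkle trees disagree
        System.out.println(java.util.Arrays.equals(a, b));
    }
}

If the digest skipped the value and covered only the name, timestamp and serialization flags, both calls would return the same hash for this tombstone.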
Re: unnecessary tombstone transmission during repair process
Sylvain,

I've looked at the code. Yes, you're right about the local deletion time. But that contradicts the test results. Do you have any thoughts on how to explain the result of the second test after applying the patch?

Our patch:

diff --git a/src/java/org/apache/cassandra/db/DeletedColumn.java b/src/java/org/apache/cassandra/db/DeletedColumn.java
index 18faeef..31744f6 100644
--- a/src/java/org/apache/cassandra/db/DeletedColumn.java
+++ b/src/java/org/apache/cassandra/db/DeletedColumn.java
@@ -17,10 +17,13 @@
  */
 package org.apache.cassandra.db;
 
+import java.io.IOException;
 import java.nio.ByteBuffer;
+import java.security.MessageDigest;
 
 import org.apache.cassandra.config.CFMetaData;
 import org.apache.cassandra.db.marshal.MarshalException;
+import org.apache.cassandra.io.util.DataOutputBuffer;
 import org.apache.cassandra.utils.Allocator;
 import org.apache.cassandra.utils.ByteBufferUtil;
 import org.apache.cassandra.utils.HeapAllocator;
@@ -46,6 +49,25 @@ public class DeletedColumn extends Column
     }
 
     @Override
+    public void updateDigest(MessageDigest digest) {
+        digest.update(name.duplicate());
+        // commented out to prevent consideration of the localDeletionTime in Merkle tree construction
+        //digest.update(value.duplicate());
+
+        DataOutputBuffer buffer = new DataOutputBuffer();
+        try
+        {
+            buffer.writeLong(timestamp);
+            buffer.writeByte(serializationFlags());
+        }
+        catch (IOException e)
+        {
+            throw new RuntimeException(e);
+        }
+        digest.update(buffer.getData(), 0, buffer.getLength());
+    }
+
+    @Override
     public long getMarkedForDeleteAt()
     {
         return timestamp;

--
Best regards,
Zotov Alexey
Grid Dynamics
Skype: azotcsit
Re: Cassandra nodes loaded unequally
Hi Ben,

I suggest comparing the number of queries each node receives; maybe the problem is on the client side. You can check that over JMX using the per-column-family MBean attributes:

"org.apache.cassandra.db:type=ColumnFamilies,keyspace=<keyspace>,columnfamily=<column_family>", attribute "ReadCount"
"org.apache.cassandra.db:type=ColumnFamilies,keyspace=<keyspace>,columnfamily=<column_family>", attribute "WriteCount"

Also I suggest checking the output of "nodetool compactionstats".

--
Alexey
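P.S. A rough sketch of pulling those two counters from a node over JMX in Java; the host and port (localhost:7199, Cassandra's default JMX port) and the "Keyspace1"/"Standard1" names are placeholders you would replace with your own:

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class CfRequestCounts
{
    public static void main(String[] args) throws Exception
    {
        // connect to the node's JMX port (7199 by default)
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://localhost:7199/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url);
        try
        {
            MBeanServerConnection mbs = connector.getMBeanServerConnection();
            // substitute your own keyspace and column family names here
            ObjectName cf = new ObjectName(
                    "org.apache.cassandra.db:type=ColumnFamilies,keyspace=Keyspace1,columnfamily=Standard1");
            System.out.println("ReadCount:  " + mbs.getAttribute(cf, "ReadCount"));
            System.out.println("WriteCount: " + mbs.getAttribute(cf, "WriteCount"));
        }
        finally
        {
            connector.close();
        }
    }
}

Run it against each node and compare the deltas over a fixed interval; if one node receives noticeably more requests than the others, the imbalance is coming from the client side rather than from Cassandra itself.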
Re: unnecessary tombstone transmission during repair process
Gus, we've found the cause. It was a problem in Cassandra, but it has already been fixed in Cassandra 1.1.6.

Commit with the problem: 2c69e2ea757be9492a095aa22b5d51234c4b4102
You can see it at https://issues.apache.org/jira/secure/attachment/12544204/CASSANDRA-4561-CS.patch
Commit with the fix: 988ea81d409968614d84dacb3a022dcb156172c3
There is no ticket in JIRA for that commit (at least I couldn't find one).

Also, our client node's clock was not synchronized with the Cassandra nodes: the client lived "in the future" by a few minutes. That is what caused the streams we saw during the repair process.

Thanks all for the discussion!