Cassandra 1.2 beta in production

2012-10-10 Thread Alexey Zotov
Hi Guys,

Are there any known critical bugs that would prevent using 1.2 beta 1 in
production?
We don't use CQL or secondary indexes.


-- 

Best regards,

Zotov Alexey
Grid Dynamics
Skype: azotcsit


unnecessary tombstone's transmission during repair process

2012-10-11 Thread Alexey Zotov
Hi Guys,

I have a question about Merkle tree construction and the repair process.
When the Merkle tree is constructed, it calculates hashes. For a
DeletedColumn it calculates the hash using the value. The value of a
DeletedColumn is the serialized local deletion time. We know that the local
deletion time can differ between nodes for the same tombstone, so the
hashes of the same tombstone on different nodes will differ. Is that true?
I think the local deletion time shouldn't be included in the hash
calculation.
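
To make the concern concrete, here is a minimal, self-contained sketch of the hypothesis. It is not Cassandra code: the class `TombstoneDigestDemo`, its method, the field encodings and the choice of SHA-256 are all assumptions made for illustration. It only shows that if two replicas held different local deletion times for the same tombstone, a digest covering that value would differ, while a digest over name and timestamp alone would match.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical stand-in for a tombstone digest; not Cassandra's DeletedColumn.
class TombstoneDigestDemo
{
    // Hashes the column name and timestamp, and optionally the local deletion time.
    static byte[] digest(String name, int localDeletionTime, long timestamp, boolean includeLocalDeletionTime)
    {
        MessageDigest md;
        try
        {
            // SHA-256 is an arbitrary choice for this sketch.
            md = MessageDigest.getInstance("SHA-256");
        }
        catch (NoSuchAlgorithmException e)
        {
            throw new IllegalStateException(e);
        }
        md.update(name.getBytes(StandardCharsets.UTF_8));
        if (includeLocalDeletionTime)
            md.update(ByteBuffer.allocate(4).putInt(localDeletionTime).array());
        md.update(ByteBuffer.allocate(8).putLong(timestamp).array());
        return md.digest();
    }

    public static void main(String[] args)
    {
        long timestamp = 1349900000000000L; // same write timestamp on both replicas
        int nodeA = 1349900000;             // assumed local deletion time on replica A
        int nodeB = 1349900003;             // assumed local deletion time on replica B

        boolean sameWithValue = Arrays.equals(digest("col", nodeA, timestamp, true),
                                              digest("col", nodeB, timestamp, true));
        boolean sameWithoutValue = Arrays.equals(digest("col", nodeA, timestamp, false),
                                                 digest("col", nodeB, timestamp, false));

        System.out.println("digests equal when value is hashed: " + sameWithValue);
        System.out.println("digests equal when value is skipped: " + sameWithoutValue);
    }
}
```

Whether replicas really do hold different local deletion times for the same tombstone is exactly what the question above asks; the sketch only shows what would follow if they did.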

We ran several tests:
// We have 3 nodes, RF=2, CL=QUORUM, so we have strong consistency.
1. Populated data to all nodes and ran the repair process. No streams were
transmitted. That's the expected behaviour.
2. Then we removed some columns from some rows. No nodes were down, and all
writes completed successfully. We ran repair, and there were some streams.
That seems strange to me, because all the data should be consistent.
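
The strong-consistency claim in the test setup follows from the usual quorum arithmetic: a quorum is RF / 2 + 1 replicas, and reads overlap the latest write whenever read replicas plus write replicas exceed RF. A small sketch of that arithmetic follows; the class and method names are hypothetical and nothing here calls Cassandra.

```java
// Hypothetical helper illustrating quorum arithmetic; not part of Cassandra.
class QuorumDemo
{
    // Number of replicas that must respond at CL=QUORUM.
    static int quorum(int replicationFactor)
    {
        return replicationFactor / 2 + 1;
    }

    // True when every read set must intersect every write set.
    static boolean readsOverlapWrites(int replicationFactor, int readReplicas, int writeReplicas)
    {
        return readReplicas + writeReplicas > replicationFactor;
    }

    public static void main(String[] args)
    {
        int rf = 2; // replication factor used in the tests above
        int q = quorum(rf);
        System.out.println("quorum for RF=" + rf + " is " + q);
        System.out.println("QUORUM reads overlap QUORUM writes: " + readsOverlapWrites(rf, q, q));
    }
}
```

With RF=2 the quorum is 2, so every successful QUORUM write reaches both replicas. That is why a repair right after fully successful writes would be expected to stream nothing.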

We've created a patch and applied it.
1. The result of the first test is the same.
2. The result of the second test: there were no unnecessary streams, as I
expected.


My question is:
Is the transmission of equal tombstones during the repair process a
feature? :) Or is it a bug?
If it's a bug, I'll create a ticket and attach the patch to it.


Re: unnecessary tombstone's transmission during repair process

2012-10-12 Thread Alexey Zotov
Sylvain,

I've looked at the code. Yes, you are right about the local deletion time.
But that contradicts the test results.

Do you have any thoughts on how to explain the result of the second test
after the patch was applied?


Our patch:

diff --git a/src/java/org/apache/cassandra/db/DeletedColumn.java b/src/java/org/apache/cassandra/db/DeletedColumn.java
index 18faeef..31744f6 100644
--- a/src/java/org/apache/cassandra/db/DeletedColumn.java
+++ b/src/java/org/apache/cassandra/db/DeletedColumn.java
@@ -17,10 +17,13 @@
  */
 package org.apache.cassandra.db;

+import java.io.IOException;
 import java.nio.ByteBuffer;
+import java.security.MessageDigest;

 import org.apache.cassandra.config.CFMetaData;
 import org.apache.cassandra.db.marshal.MarshalException;
+import org.apache.cassandra.io.util.DataOutputBuffer;
 import org.apache.cassandra.utils.Allocator;
 import org.apache.cassandra.utils.ByteBufferUtil;
 import org.apache.cassandra.utils.HeapAllocator;
@@ -46,6 +49,25 @@ public class DeletedColumn extends Column
 }

 @Override
+public void updateDigest(MessageDigest digest) {
+digest.update(name.duplicate());
+// it's commented to prevent consideration of the localDeletionTime in Merkle Tree construction
+//digest.update(value.duplicate());
+
+DataOutputBuffer buffer = new DataOutputBuffer();
+try
+{
+buffer.writeLong(timestamp);
+buffer.writeByte(serializationFlags());
+}
+catch (IOException e)
+{
+throw new RuntimeException(e);
+}
+digest.update(buffer.getData(), 0, buffer.getLength());
+}
+
+@Override
 public long getMarkedForDeleteAt()
 {
 return timestamp;




-- 

Best regards,

Zotov Alexey
Grid Dynamics
Skype: azotcsit


Re: Cassandra nodes loaded unequally

2012-10-12 Thread Alexey Zotov
Hi Ben,

I suggest comparing the number of queries handled by each node. Maybe the
problem is on the client side.
You can do that using JMX:
"org.apache.cassandra.db:type=ColumnFamilies,keyspace=,columnfamily=","ReadCount"
"org.apache.cassandra.db:type=ColumnFamilies,keyspace=,columnfamily=","WriteCount"

I also suggest checking the output of "nodetool compactionstats".
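
The JMX comparison suggested above could be scripted roughly as follows. This is only a sketch under stated assumptions: the class `ColumnFamilyCounters` and its methods are hypothetical, the host, keyspace and column family are supplied by you as arguments, and port 7199 is assumed to be the node's JMX port (adjust it if your cluster is configured differently). Only standard `javax.management` calls are used.

```java
import javax.management.MBeanServerConnection;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

// Hypothetical helper: reads ReadCount and WriteCount for one column family on one node.
class ColumnFamilyCounters
{
    // Builds the MBean name from the pattern quoted above.
    static ObjectName columnFamilyMBean(String keyspace, String columnFamily)
    {
        try
        {
            return new ObjectName("org.apache.cassandra.db:type=ColumnFamilies,keyspace="
                                  + keyspace + ",columnfamily=" + columnFamily);
        }
        catch (MalformedObjectNameException e)
        {
            throw new IllegalArgumentException(e);
        }
    }

    // Reads a numeric attribute such as "ReadCount" or "WriteCount".
    static long readCounter(MBeanServerConnection connection, ObjectName bean, String attribute) throws Exception
    {
        return ((Number) connection.getAttribute(bean, attribute)).longValue();
    }

    // Usage: java ColumnFamilyCounters <host> <keyspace> <columnfamily>
    public static void main(String[] args) throws Exception
    {
        String host = args[0];
        ObjectName bean = columnFamilyMBean(args[1], args[2]);
        // 7199 is an assumed JMX port.
        JMXServiceURL url = new JMXServiceURL("service:jmx:rmi:///jndi/rmi://" + host + ":7199/jmxrmi");
        try (JMXConnector connector = JMXConnectorFactory.connect(url))
        {
            MBeanServerConnection connection = connector.getMBeanServerConnection();
            System.out.println(host
                               + " ReadCount=" + readCounter(connection, bean, "ReadCount")
                               + " WriteCount=" + readCounter(connection, bean, "WriteCount"));
        }
    }
}
```

Running it once per node and comparing the numbers should show whether the client spreads requests evenly.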

-- 
Alexey


Re: unnecessary tombstone's transmission during repair process

2012-10-19 Thread Alexey Zotov
Guys, we've found the cause.

It was a problem in Cassandra, but it has already been fixed in Cassandra
1.1.6.

Commit with the problem:
2c69e2ea757be9492a095aa22b5d51234c4b4102
You can see it at
https://issues.apache.org/jira/secure/attachment/12544204/CASSANDRA-4561-CS.patch

Commit with the fix:
988ea81d409968614d84dacb3a022dcb156172c3
There is no ticket in JIRA about that commit (at least I couldn't find the
ticket).

Also, our client node's clock was not synchronized with the Cassandra
nodes. The client node lived "in the future" (by just a few minutes).

So that's the cause of the streams described during the repair process.

Thanks all for the discussion!