Re:Re: Data export with consistency problem

xutom Fri, 25 Mar 2016 20:42:47 -0700

Thanks for your reply!
I am sorry for my poor English.
My keyspace replication factor is 3, and the client read and write CL are both
QUORUM.
If we remove the network cable of one node, import 30 million rows of data into
that table, and then reconnect the network cable, we export the data immediately
and we cannot get all the 30 million rows of data.
But if we manually run 'kill -9 pid' on one node, import 30 million rows of
data into that table, and then restart Cassandra on that node, we export
the data immediately and we can get all the 30 million rows of data.


By the way, we did another test: we installed a C* cluster with 3 nodes, turned
off hinted handoff, set the keyspace replication factor to 3, and set the client
write and read CL to ALL. Then we manually ran 'kill -9 pid' on one node, leaving
just two normal nodes, and we could still import data into the C* cluster. Why
can we write data into the C* cluster when there are just two normal nodes and
our write CL is ALL?
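
For reference, a minimal sketch of the arithmetic behind this question, assuming
the usual definitions (ONE needs 1 replica, QUORUM needs floor(RF/2) + 1, ALL
needs every replica). The function names are made up for illustration and are
not Cassandra APIs. It only shows why a write at ALL would be expected to be
rejected when one of three replicas is down; it does not explain why the write
was accepted in the test above.

```python
def required_acks(consistency_level: str, rf: int) -> int:
    """Replicas that must acknowledge a request (hypothetical helper)."""
    levels = {"ONE": 1, "QUORUM": rf // 2 + 1, "ALL": rf}
    return levels[consistency_level]


def can_satisfy(consistency_level: str, rf: int, live_replicas: int) -> bool:
    """True if enough replicas are alive to meet the consistency level."""
    return live_replicas >= required_acks(consistency_level, rf)


# RF = 3 with one replica killed leaves 2 live replicas.
print(can_satisfy("ALL", rf=3, live_replicas=2))     # ALL needs 3 replicas
print(can_satisfy("QUORUM", rf=3, live_replicas=2))  # QUORUM needs 2 replicas
```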


At 2016-03-25 18:26:55, "Alain RODRIGUEZ" <arodr...@gmail.com> wrote:

Hi Jerry,

It is all a matter of replication on the server side and consistency level on
the client side.




The minimal setup to ensure both availability and strong consistency is RF = 3
and CL = (LOCAL_)QUORUM.


This way, one node can go down and you can still reach the 2 nodes needed to
validate your reads & writes --> Availability
And as there are 3 replicas and an operation succeeds only if it succeeds on
at least 2 replicas, at least one replica will be both written to and read from,
ensuring strong and immediate consistency (multiple reads will always return
the same value, no matter where you read).
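
The overlap argument above can be checked with a few lines of Python. This is
only a sketch of the arithmetic, not driver code: the function names are
hypothetical, and it assumes the usual rule that QUORUM means floor(RF/2) + 1
replicas and that reads are guaranteed to overlap writes when R + W > RF.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must acknowledge a QUORUM request: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def read_overlaps_write(rf: int, write_replicas: int, read_replicas: int) -> bool:
    """True when every read set must share a replica with every write set."""
    return read_replicas + write_replicas > rf


rf = 3
q = quorum(rf)               # replicas needed per QUORUM operation
tolerated_failures = rf - q  # replicas that may be down while QUORUM still works

print(q, tolerated_failures, read_overlaps_write(rf, q, q))
# Reading and writing at ONE on the same keyspace gives no overlap guarantee.
print(read_overlaps_write(rf, 1, 1))
```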


Were you using those settings?


reconnect the network cable, we export the data immediately and we cannot all 
the 30 million rows of data


I am not sure what 'export' and 'we cannot all the 30 million rows' mean, but I
imagine you were expecting to read the 30 million rows and did not.


Hinted Handoff is an optimisation (anything you can disable is an
optimisation), and you can't rely on an optimisation like hinted handoff.


Let me know if this answer works for you before digging any further.


Also, I removed "d...@cassandra.apache.org", as that mailing list is used by the
developers to discuss possible issues. There is no issue spotted so far, just us
trying to understand things, so let's not bother those guys unless we find an
issue :-).


C*heers,
-----------------------
Alain Rodriguez - al...@thelastpickle.com
France


The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com


2016-03-25 2:35 GMT+01:00 xutom <xutom2...@126.com>:

Hi all,
    I have a C* cluster with five nodes, my Cassandra version is 2.1.1, and
we have also enabled "Hinted Handoff". Everything is fine while we use the C*
cluster to store up to 10 billion rows of data. But now we have a problem.
During our test, after we imported up to 40 billion rows of data into the C*
cluster, we manually removed the network cable of one node (e.g. there are 5
nodes, and we removed the network cable of just one node to simulate a minor
network problem with the C* cluster), then we created another table and
imported 30 million rows into this table.
Before we reconnected the network cable of that node, we exported the data of
the new table, and we could export all 30 million rows many times. But after we
reconnect the network cable, we export the data immediately and we cannot all
the 30 million rows of data. Maybe a few minutes later, after the C* cluster has
balanced all the data (my guess), when we do the export again, we can export all
the 30 million rows of data.
    Is there something wrong with "Hinted Handoff"? While data is being copied
from the coordinator node to the node that has just rejoined, can that node
respond to the client's requests? Thanks in advance!

jerry
