Overhead of data types in cassandra

2016-09-08 Thread Alexandr Porunov
Hello, Where can I find information about the overhead of data types in Cassandra? I am interested in the blob, text, uuid, and timeuuid data types. Does the blob type store the length of the blob data along with the value? If so, which type does it use for the length (int, bigint)? If I want to store 80 bits ho

Re: Finding records that exist on Cassandra but not externally

2016-09-08 Thread Jens Rantil
Hi again Chris, Another option would be to have a look at using a Merkle Tree to quickly drill down to the differences. This is actually what Cassandra uses internally when running a repair between different nodes. Cheers, Jens On Wed, Sep 7, 2016 at 9:47 AM wrote: > First off I hope this appr
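
The Merkle tree idea can be sketched as follows. This is only an illustration of the drill-down technique, not Cassandra's repair code: the bucketing scheme, the SHA-256 hash, and the names `bucket_of`, `build_tree`, and `differing_buckets` are all assumptions made for the example. Both sides hash their records into a fixed number of buckets, build a hash tree over the buckets, and then compare trees from the root down, visiting only subtrees whose hashes differ.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def bucket_of(key: str, num_buckets: int = 8) -> int:
    """Assign a key to a bucket by hashing it (hypothetical scheme)."""
    return int.from_bytes(_h(key.encode())[:4], "big") % num_buckets


def build_tree(records: dict, num_buckets: int = 8) -> list:
    """Return the tree as a list of levels: leaves first, root last.

    num_buckets must be a power of two so every level pairs up evenly.
    """
    buckets = [[] for _ in range(num_buckets)]
    for key in sorted(records):
        buckets[bucket_of(key, num_buckets)].append(f"{key}={records[key]}")
    level = [_h("\n".join(b).encode()) for b in buckets]
    levels = [level]
    while len(level) > 1:
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def differing_buckets(a: list, b: list) -> list:
    """Walk from the root down, following only subtrees whose hashes differ."""
    suspects = [0]
    for depth in range(len(a) - 1, 0, -1):
        suspects = [i for i in suspects if a[depth][i] != b[depth][i]]
        suspects = [child for i in suspects for child in (2 * i, 2 * i + 1)]
    return [i for i in suspects if a[0][i] != b[0][i]]
```

With trees built from the Cassandra side and the external side, `differing_buckets` returns the bucket indexes that need a record-by-record comparison; matching subtrees are skipped without looking at their records.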

Re: Guidelines for configuring Thresholds for Cassandra metrics

2016-09-08 Thread Benedict Elliott Smith
The thing is, this isn't about opinions. I don't really want to get into an argument either, but characterising my statements as opinion does invite me to respond with their factual basis... The only legitimate opinions here would be around the prevalence of cluster characteristics, of which we b

Re: Guidelines for configuring Thresholds for Cassandra metrics

2016-09-08 Thread Ariel Weisberg
Hi, Apologies for a dev related hijack. We can continue the dev related discussion in JIRA or the dev list. Seeing this discussion makes me think that Benedict and Ryan you will both have opinions about https://issues.apache.org/jira/browse/CASSANDRA-12372 (remove memtable_cleanup_threshold, not

Re: Overhead of data types in cassandra

2016-09-08 Thread Oleksandr Petrov
You can find the information about that in Cassandra source code, for example. Search for serializers, like BytesSerializer: https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/serializers/BytesSerializer.java to get an idea how the data is serialized. But I'd also check o
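
As one concrete data point on the length question: in the CQL native protocol (the wire format, which is not necessarily the on-disk format), a value is sent as a 4-byte signed big-endian length followed by the raw bytes, with a negative length meaning null. The sketch below models only that framing; `frame_value` and `unframe_value` are hypothetical helper names, not driver or server functions.

```python
import struct


def frame_value(raw):
    """Frame a value as [int length][bytes]; None is encoded as length -1."""
    if raw is None:
        return struct.pack(">i", -1)
    return struct.pack(">i", len(raw)) + raw


def unframe_value(buf):
    """Inverse of frame_value for a buffer holding exactly one framed value."""
    (n,) = struct.unpack_from(">i", buf, 0)
    if n < 0:
        return None
    return buf[4:4 + n]
```

Under this framing an 80-bit (10-byte) blob occupies 14 bytes on the wire. The on-disk overhead depends on the storage format version and should be checked against the source code mentioned above.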

Is it safe to change RF in this situation?

2016-09-08 Thread Benyi Wang
* I have a keyspace with RF=2; * The client reads the table using LOCAL_ONE; * There is a batch job loading data into the tables using ALL. I want to change RF to 3 and have both the client and the batch job use LOCAL_QUORUM. My question is "Will the client still read the correct data when the repair i

Re: Is it safe to change RF in this situation?

2016-09-08 Thread Hannu Kröger
Hi, If you change RF=2 -> 3 first, the LOCAL_ONE reads might hit the new replica, which does not have the data yet. So I would change LOCAL_ONE -> LOCAL_QUORUM first and then change the RF and then run the repair. LOCAL_QUORUM is effectively ALL in your case (RF=2) if you have just one DC, so you can chan
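
The arithmetic behind "LOCAL_QUORUM is effectively ALL at RF=2" can be sketched as below. It assumes a single datacenter, so the local replication factor equals RF; the function names are made up for the example.

```python
def quorum(rf: int) -> int:
    """A quorum is a majority of replicas: floor(RF / 2) + 1."""
    return rf // 2 + 1


def replicas_required(consistency_level: str, rf: int) -> int:
    """Replicas that must respond, assuming a single datacenter."""
    required = {
        "ONE": 1,
        "LOCAL_ONE": 1,
        "QUORUM": quorum(rf),
        "LOCAL_QUORUM": quorum(rf),
        "ALL": rf,
    }
    return required[consistency_level]


def overlap_guaranteed(write_cl: str, read_cl: str, rf: int) -> bool:
    """True when every read set must intersect every write set (R + W > RF)."""
    return replicas_required(write_cl, rf) + replicas_required(read_cl, rf) > rf
```

At RF=2 a quorum is 2 replicas, the same as ALL; at RF=3 it is still 2. The overlap check describes the steady state only: right after RF is raised, the new replica holds no data until repair completes, which is why a LOCAL_ONE read that lands on it can come back empty.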

Re: Is it safe to change RF in this situation?

2016-09-08 Thread Benyi Wang
Thanks Hannu, Unfortunately, we started changing RF from 2 to 3, and we did see the empty-result rate going up. I assume that "If the LOCAL_ONE read hits the new replica which is not there yet, the CQL query will return nothing." Is my assumption correct? On Thu, Sep 8, 2016 at 11:49 AM, Hann

Re: Is it safe to change RF in this situation?

2016-09-08 Thread Hannu Kröger
Yep, you can fix it by running repair or even faster by changing the consistency level to local_quorum and deploying the new version of the app. Hannu > On 8 Sep 2016, at 17:51, Benyi Wang wrote: > > Thanks Hannu, > > Unfortunately, we started changing RF from 2 to 3, and did see the empty >

Re: Is it safe to change RF in this situation?

2016-09-08 Thread Benyi Wang
Thanks. What about this situation: * Change RF 2 => 3 * Start repair * Roll back RF 3 => 2 * Repair is still running. I'm wondering what the repair is trying to do. Is it repairing for RF=2, or still for RF=3? On Thu, Sep 8, 2016 at 2:53 PM, Hannu Kröger wrote: > Yep, y

Re: Is it safe to change RF in this situation?

2016-09-08 Thread Hannu Kröger
OK, so I have to say that I’m not 100% sure how many replicas of the data it is trying to maintain, but it should not blow up (if the repair crashes or something, it’s OK). So it should be safe to do. When the repair has finished, you can start with the plan I suggested and run repairs afterwards. Hannu > O

Re: Is it safe to change RF in this situation?

2016-09-08 Thread Benyi Wang
Thanks a lot. Will do as you suggested. On Thu, Sep 8, 2016 at 3:08 PM, Hannu Kröger wrote: > Ok, so I have to say that I’m not 100% sure how many replicas of data is > it trying to maintain but it should not blow up (if repair crashes or > something, it’s ok). So it should be safe to do. > > Wh

Re: Is a blob storage cost of cassandra is the same as bigint storage cost for long variables?

2016-09-08 Thread Romain Hardouin
Hi, Disk-wise it's the same, because a bigint is serialized as an 8-byte ByteBuffer, and if you store a Long as bytes in a blob it will take 8 bytes too, right? The difference is the validation. The blob ByteBuffer will be stored as is, whereas the bigint will be validated. So technic
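
A simplified model of the point above: the payload is 8 bytes either way, and only the validation differs. The functions here are illustrative stand-ins, not Cassandra's serializers; in particular, the rule that a bigint payload must be 0 or 8 bytes long is an assumption based on a reading of Cassandra's LongSerializer and should be checked against the source.

```python
import struct


def serialize_bigint(value: int) -> bytes:
    """Encode a signed 64-bit integer as 8 big-endian bytes."""
    return struct.pack(">q", value)


def validate_bigint(raw: bytes) -> None:
    """Reject payloads that cannot be a bigint (assumed rule: 0 or 8 bytes)."""
    if len(raw) not in (0, 8):
        raise ValueError(f"expected 8 or 0 byte long, got {len(raw)}")


def validate_blob(raw: bytes) -> None:
    """A blob is stored as is, so any byte sequence is accepted."""
    return None
```

Storing `serialize_bigint(x)` in a blob column therefore costs the same 8 bytes of payload as a bigint column; what is lost is the type check, since the blob will accept a payload of any length.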

Partition size

2016-09-08 Thread Anshu Vajpayee
Is there any way to get the partition size for a given partition key?

Re: Is a blob storage cost of cassandra is the same as bigint storage cost for long variables?

2016-09-08 Thread Alexandr Porunov
Hello Romain, Thank you very much for the explanation! I have just run a simple test to compare both situations. I used two equivalent virtual machines. Machine 1: CREATE KEYSPACE "test" WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 }; CREATE TABLE test.simple ( id b