Re: Cassandra 2.0.7 keeps reporting errors due to no space left on device

2014-05-14 Thread DuyHai Doan
Thanks for the report back. If LCS falls back to SizeTiered it means that you have a workload with intensive write bursts. Maybe solving that would be better than hard-tweaking the LCS code. On 12 May 2014 at 17:19, "Yatong Zhang" wrote: Well, I finally resolved this issue by modifying…

Re: Bootstrap failure on C* 1.2.13

2014-05-14 Thread Paulo Ricardo Motta Gomes
Hello, After about 3 months I was able to solve this issue, which happened again after another node died. The problem is that the DataStax 1.2 node replacement docs [1] said "This procedure applies to clusters using vnodes. If not using vnodes, use the instructions in the Cassandra 1.1 documentation…"

Backup Solution

2014-05-14 Thread ng
I want to discuss the question asked by Rene last year again: http://www.mail-archive.com/user%40cassandra.apache.org/msg28465.html Is the following a good backup solution? Create two data-centers: - A live data-center with multiple nodes (commodity hardware) (6 nodes with replication factor of…

Cassandra hadoop job fails if any node is DOWN

2014-05-14 Thread Paulo Ricardo Motta Gomes
Hello, One of the nodes of our Analytics DC is dead, but ColumnFamilyInputFormat (CFIF) still assigns Hadoop input splits to it. This leads to many failed tasks and consequently a failed job. * Tasks fail with: java.lang.RuntimeException: org.apache.thrift.transport.TTransportException: Failed to…
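The fix being asked for amounts to dropping dead hosts from each split's preferred locations before handing splits to Hadoop. A minimal Python sketch of that idea follows; `splits` as a list of (token_range, replica_hosts) pairs and the function name are illustrative assumptions, not CFIF's real Java API:

```python
def filter_split_locations(splits, down_hosts):
    """Drop dead replicas from each input split's candidate locations.

    `splits` is a list of (token_range, replica_hosts) pairs -- an
    illustrative stand-in for what ColumnFamilyInputFormat computes.
    """
    filtered = []
    for token_range, replicas in splits:
        live = [h for h in replicas if h not in down_hosts]
        if not live:
            # No live replica left: the job cannot read this range at all.
            raise RuntimeError("no live replica for range %s" % (token_range,))
        filtered.append((token_range, live))
    return filtered
```

With replication factor > 1, every range normally has at least one live replica, so filtering rather than failing the task is enough to let the job proceed.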

Re: Efficient bulk range deletions without compactions by dropping SSTables.

2014-05-14 Thread Jeremy Powell
Hi Kevin, C* version: 1.2.xx Astyanax: 1.56.xx We basically do this same thing in one of our production clusters, but rather than dropping SSTables, we drop Column Families. We time-bucket our CFs, and when a CF has passed some time threshold (metadata or embedded in CF name), it is dropped. This…
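The time-bucketing scheme described above can be sketched in a few lines. This is a minimal illustration under assumed conventions (daily buckets, date embedded in the CF name, names like `events_20140514`); the helper names are hypothetical:

```python
from datetime import datetime, timedelta

def bucket_name(base, ts):
    # One column family per day; the bucket date is embedded in the CF name.
    return "%s_%s" % (base, ts.strftime("%Y%m%d"))

def expired_buckets(base, existing, now, retention_days):
    # CFs older than the retention window can be dropped wholesale, which
    # reclaims their SSTables without any compaction work.
    keep = {bucket_name(base, now - timedelta(days=d))
            for d in range(retention_days)}
    return sorted(cf for cf in existing
                  if cf.startswith(base + "_") and cf not in keep)
```

Writers route each insert to `bucket_name(base, write_time)`, and a periodic job drops whatever `expired_buckets` returns.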

Re: Disable reads during node rebuild

2014-05-14 Thread Paulo Ricardo Motta Gomes
That's a nice workaround; it will be really helpful in emergency situations like this. Thanks, On Mon, May 12, 2014 at 6:58 PM, Aaron Morton wrote: > I'm not able to replace a dead node using the ordinary procedure > (bootstrap+join), and would like to rebuild the replacement node from > another DC…

RE: Datacenter understanding question

2014-05-14 Thread Mark Farnan
Yes, they will. From: ng [mailto:pipeli...@gmail.com] Sent: Tuesday, May 13, 2014 11:07 PM To: user@cassandra.apache.org Subject: Datacenter understanding question If I have a configuration of two data centers with one node each, and the replication factor is also 1, will these 2 nodes be mirr…

Re: NTS, vnodes and 0% chance of data loss

2014-05-14 Thread William Oberman
After sleeping on this, I'm sure my original conclusions are wrong. In all of the referenced cases/threads, I internalized "rack awareness" and "hotspots" to mean something different and wrong. A hotspot didn't mean multiple replicas in the same rack (as I had been thinking), it meant the process…

Re: Really need some advices on large data considerations

2014-05-14 Thread Yatong Zhang
Thank you Aaron, but we're planning about 20T per node; is that feasible? On Mon, May 12, 2014 at 4:33 PM, Aaron Morton wrote: > We've learned that compaction strategy would be an important point because > we've run into 'no space' trouble with the 'size-tiered' compaction > strategy…
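The 'no space' trouble mentioned above comes down to compaction headroom. A rough back-of-the-envelope sketch, using common rule-of-thumb headroom figures that are an assumption here rather than an official Cassandra formula (size-tiered can need up to roughly half the disk free for its worst-case rewrite, leveled roughly 10%):

```python
TB = 1024 ** 4

def usable_capacity(disk_bytes, strategy):
    # Rule-of-thumb free-space headroom per compaction strategy
    # (an assumption for illustration, not an official formula):
    # size-tiered may rewrite its largest tier in one pass (~50% free),
    # leveled works in small fixed-size chunks (~10% free).
    headroom = {"stcs": 0.50, "lcs": 0.10}[strategy]
    return disk_bytes * (1 - headroom)
```

Under these assumptions, a 20 TB node budgets only about 10 TB of live data with size-tiered compaction but about 18 TB with leveled, which is one reason the strategy choice dominates at this scale.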

Cassandra token range support for Hadoop (ColumnFamilyInputFormat)

2014-05-14 Thread Anton Brazhnyk
Greetings, I'm reading data from C* with Spark (via ColumnFamilyInputFormat) and I'd like to read just part of it - something like Spark's sample() function. Cassandra's API seems to allow this with its ConfigHelper.setInputRange(jobConfiguration, startToken, endToken) method, but it doesn't w…
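Picking startToken/endToken for a sample amounts to carving out a slice of the token ring. A small Python sketch under the assumption of Murmur3Partitioner (whose token range is -2^63 to 2^63-1, and whose hashing spreads rows roughly uniformly over tokens, so a token slice approximates a uniform sample; an ordered partitioner would not have this property):

```python
# Murmur3Partitioner token bounds.
MIN_TOKEN = -(2 ** 63)
MAX_TOKEN = 2 ** 63 - 1

def sample_range(fraction, offset=0.0):
    """Return a contiguous (start, end) token slice covering roughly
    `fraction` of the ring, starting `offset` of the way around it.
    Both arguments are in [0, 1]."""
    span = MAX_TOKEN - MIN_TOKEN
    start = MIN_TOKEN + int(span * offset)
    end = min(start + int(span * fraction), MAX_TOKEN)
    return start, end
```

The resulting pair would be what gets passed to ConfigHelper.setInputRange (as strings, in the Java job configuration).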

RE: Question about READS in a multi DC environment.

2014-05-14 Thread Mark Farnan
Perfect, thanks, that solved it. Regards, Mark. From: Aaron Morton [mailto:aa...@thelastpickle.com] Sent: Monday, May 12, 2014 2:21 PM To: Cassandra User Subject: Re: Question about READS in a multi DC environment. > read_repair_chance=1.00 AND There's your problem…
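The problem Aaron points at is that a global read_repair_chance of 1.00 makes every read fan out to replicas in every data center, not just the local one. A simplified expectation model (an illustration of the effect, not Cassandra's exact read path logic) shows why lowering it, or preferring a DC-local read repair chance, keeps reads local:

```python
def expected_replicas_per_read(cl, rf_local, rf_total,
                               global_rr_chance, dclocal_rr_chance):
    """Rough expected number of replicas touched per read (a
    simplification): a `global_rr_chance` fraction of reads contact
    every replica in every DC, a further `dclocal_rr_chance` fraction
    contact all local-DC replicas, and the remainder touch only the
    `cl` replicas the consistency level requires."""
    p_rest = max(0.0, 1.0 - global_rr_chance - dclocal_rr_chance)
    return (global_rr_chance * rf_total
            + dclocal_rr_chance * rf_local
            + p_rest * cl)
```

With two DCs at RF 3 each, a global chance of 1.0 touches all 6 replicas on every read, while a 10% DC-local chance costs only ~1.2 replicas per CL=ONE read.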

Re: row caching for frequently updated column

2014-05-14 Thread Chris Burroughs
You are close. On 04/30/2014 12:41 AM, Jimmy Lin wrote: thanks all for the pointers. Let me see if I can put the sequence of events together: 1. People mis-understand/mis-use the row cache, in that Cassandra cached the entire row of data even if you are only looking for a small subset of the row…
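The semantics being untangled above can be captured in a toy model (an illustration of the behavior discussed, not Cassandra's implementation): the cache holds whole rows, and any write to a row drops the cached copy, which is why frequently updated rows cache poorly:

```python
class RowCacheModel:
    """Toy whole-row cache: reads populate the full row, writes
    invalidate it, so hot-write rows keep missing the cache."""

    def __init__(self):
        self.cache = {}
        self.reads_from_store = 0

    def write(self, key, column, value, store):
        store.setdefault(key, {})[column] = value
        self.cache.pop(key, None)  # whole cached row invalidated

    def read(self, key, store):
        if key not in self.cache:
            self.reads_from_store += 1
            self.cache[key] = dict(store.get(key, {}))  # entire row cached
        return self.cache[key]
```

For a row updated between most reads, nearly every read falls through to the store and then re-caches the full row, wiping out the benefit the cache was meant to provide.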