Data not replicating to all datacenters

2012-12-03 Thread Owen Davies
We have a two data center test Cassandra setup running, and are writing to it using LOCAL_QUORUM. When reading, sometimes the data is there and sometimes it is not, which we think is a replication issue, even though we have left it plenty of time after the writes. We have the following setup: …

Re: Cassandra 1.1.5 - SerializingCacheProvider - possible memory leak?

2012-12-03 Thread Maciej Miklas
Size and Capacity are in bytes. The RAM is consumed right after Cassandra starts (3 GB heap); the reason for this could be the 400,000,000 rows on a single node, whose serialized bloom filters take 1.2 GB of HDD space. On Mon, Dec 3, 2012 at 10:14 AM, Maciej Miklas wrote: > Hi, > > I have the following Cassandra s…
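For a quick sanity check on those numbers, standard Bloom filter sizing gives bits per key = -ln(p) / (ln 2)^2. A minimal Java sketch of the arithmetic, with the false-positive rate p assumed purely for illustration:

    public class BloomSizeCheck {
        public static void main(String[] args) {
            double p = 1e-5; // assumed false-positive rate, for illustration only
            double bitsPerKey = -Math.log(p) / Math.pow(Math.log(2), 2); // ~24 bits
            long keys = 400_000_000L;
            double gb = keys * bitsPerKey / 8 / 1e9; // bits -> bytes -> GB
            System.out.printf("%.1f bits/key -> %.2f GB for %d keys%n", bitsPerKey, gb, keys);
        }
    }

At roughly 24 bits per key, 400 million keys come to about 1.2 GB, which is consistent with the reported on-disk bloom filter size.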

Data not replicating to all datacenters

2012-12-03 Thread Шамим
Hello Owen, It seems you did not configure the tokens for all nodes correctly. See the section "Calculating Tokens for multiple data centers" here: http://www.datastax.com/docs/0.8/install/cluster_init Best regards, Shamim --- On Mon, Dec 3, 2012 at 4:42 PM, Owen Davies wrote: We have a 2 data cent…

Re: Data not replicating to all datacenters

2012-12-03 Thread Owen Davies
Hi Shamim, I have read a bit about the tokens. I understand how they could affect the data distribution at first, but here is what I don't understand: if we have specified Options: [dc1:3, dc2:3], surely after a while all the data will be on every server? Thanks, Owen On 3 December 2012 14:06, Шамим wrote: > …
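For context, an options map of [dc1:3, dc2:3] corresponds to a NetworkTopologyStrategy keyspace along these lines (a sketch in 1.1-era cassandra-cli syntax; the keyspace name is illustrative):

    create keyspace Example
      with placement_strategy = 'NetworkTopologyStrategy'
      and strategy_options = {dc1:3, dc2:3};

With a replication factor of 3 in each data center, every DC should indeed end up holding a full copy of each row.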

Re: Data not replicating to all datacenters

2012-12-03 Thread Шамим
Yes, it should be. dc1:3 means you have 3 copies of every row, so with LOCAL_QUORUM you always get good consistency from those 3 nodes. First you have to calculate the tokens for data center dc1, then add an offset of 100 to each token for the second data center, which will resolve your problem. After creating th…
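A minimal sketch of the token arithmetic being described, assuming RandomPartitioner (tokens in 0..2^127) and 3 nodes per data center; the offset of 100 is arbitrary and only serves to keep tokens unique across DCs:

    import java.math.BigInteger;

    public class TokenCalc {
        public static void main(String[] args) {
            BigInteger ring = BigInteger.valueOf(2).pow(127);
            int nodesPerDc = 3;
            for (int i = 0; i < nodesPerDc; i++) {
                // dc1: tokens evenly spaced around the ring
                BigInteger t = ring.multiply(BigInteger.valueOf(i))
                                   .divide(BigInteger.valueOf(nodesPerDc));
                System.out.println("dc1 node " + i + ": " + t);
                // dc2: the same positions shifted by a small fixed offset
                System.out.println("dc2 node " + i + ": " + t.add(BigInteger.valueOf(100)));
            }
        }
    }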

Re: splitting large sstables

2012-12-03 Thread Andrey Ilinykh
Could you provide more details on how to use it? Let's say I already have a huge sstable. What am I supposed to do to split it? Thank you, Andrey On Sat, Dec 1, 2012 at 11:29 AM, Radim Kolar wrote: > from time to time people ask here for splitting large sstables, here is > a patch doing that > …

Re: splitting large sstables

2012-12-03 Thread Radim Kolar
Apply the patch and recompile. Define the "max_sstable_size" compaction strategy property on the CF you want to split, then run compaction.
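A sketch of what that might look like, assuming a build patched with CASSANDRA-4897 (the max_sstable_size property, and whatever units it takes, come from the patch rather than any released Cassandra; the CF name and value are illustrative):

    update column family BigCF
      with compaction_strategy_options = {max_sstable_size: 536870912};

followed by nodetool compact <keyspace> BigCF to force the rewrite.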

Re: Cassandra 1.1.5 - SerializingCacheProvider - possible memory leak?

2012-12-03 Thread aaron morton
> SerializingCacheProvider reports a size of 196 million; how can I > interpret this number? Can you include the output from nodetool info? > 2) I am using default settings besides the changes described above. Since the key > cache is small, and the off-heap cache is active, what is taking space in Old Gen…

Re: Data not replicating to all datacenters

2012-12-03 Thread aaron morton
>>> When reading, sometimes the data is there, >>> sometimes it is not, which we think is a replication issue, even >>> though we have left it plenty of time after the writes. Can you provide some more information on this? Are you talking about writes to one DC and reads from another? Cheers…

Re: Row caching + Wide row column family == almost crashed?

2012-12-03 Thread aaron morton
> > Disabling row caching on this new column family has resolved the issue > > for now, but, is there something fundamental about row caching that I am > > missing? What cache provider were you using? Check the row_cache_provider setting in the yaml file. If you were using the ConcurrentLinkedHashCacheProvider…
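For reference, the 1.1-era cassandra.yaml settings in question look like this (the size value is illustrative):

    row_cache_size_in_mb: 200                     # 0 disables the row cache
    row_cache_provider: SerializingCacheProvider  # off-heap; the on-heap alternative is ConcurrentLinkedHashCacheProvider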

Re: Data not replicating to all datacenters

2012-12-03 Thread Owen Davies
We have written a large amount of data to Cassandra from another database. When writing, the client was set to write at LOCAL_QUORUM. A few days after writing the data, we tried this in cassandra-cli: get example['key'][123]; Value was not found. Elapsed time: 50 msec(s). Then a bit later: get datapoi…
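One way to narrow this down (a sketch, assuming 1.1-era cassandra-cli, which reads at consistency level ONE by default; the keyspace name is illustrative): raise the read consistency and re-read. If the row shows up at QUORUM but not at ONE, the writes landed and the problem is replica placement or pending repair rather than lost data.

    use Keyspace1;
    consistencylevel as QUORUM;
    get example['key'][123];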

Re: Hadoop Integration: Limiting scan to a range of keys

2012-12-03 Thread aaron morton
For background, you may find the wide row setting useful: http://www.datastax.com/docs/1.1/cluster_architecture/hadoop_integration AFAIK all the input row readers for Hadoop do range scans. And I think the support for setting the start and end token is used so that jobs only select data which is…
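A minimal sketch of those two settings, assuming the 1.1-era org.apache.cassandra.hadoop.ConfigHelper API (keyspace, CF name, and tokens are illustrative):

    import org.apache.cassandra.hadoop.ConfigHelper;
    import org.apache.hadoop.conf.Configuration;

    public class JobSetup {
        static void configure(Configuration conf) {
            // widerows = true: the reader pages through columns in batches
            // instead of materialising each whole row
            ConfigHelper.setInputColumnFamily(conf, "Keyspace1", "BigCF", true);
            // restrict the job to a token range on the ring (not a key range)
            ConfigHelper.setInputRange(conf, "0", "85070591730234615865843651857942052864");
        }
    }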

Re: failing reconcilliation during counter increment?

2012-12-03 Thread David Vanderfeesten
Thanks Sylvain, interesting feedback. For now a higher consistency level is not fault tolerant (two-node setup), but it is worth considering for other deployments. Further investigation shows that the system has quite some iowait; it seems that we are pushing it too far. It is obvious that counter CFs' increments…

Re: splitting large sstables

2012-12-03 Thread Rob Coli
On Sat, Dec 1, 2012 at 9:29 AM, Radim Kolar wrote: > from time to time people ask here for splitting large sstables, here is > a patch doing that > > https://issues.apache.org/jira/browse/CASSANDRA-4897 Interesting, thanks for the contribution! :D For those who might find this thread via other means…

Re: Row caching + Wide row column family == almost crashed?

2012-12-03 Thread Bill de hÓra
A Cassandra JVM will generally not function well with caches and wide rows. Probably the most important thing to understand is Ed's point: the row cache caches the entire row, not just the slice that was read out. What you've seen is almost exactly the observed behaviour I'd expect with…
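In 1.1 the per-CF knob for this is the caching attribute; a minimal cassandra-cli sketch for keeping key caching but turning whole-row caching off (the CF name is illustrative):

    update column family WideCF with caching = 'keys_only';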

Re: Hadoop Integration: Limiting scan to a range of keys

2012-12-03 Thread Jamie Rothfeder
Thanks! Very helpful. On Mon, Dec 3, 2012 at 4:04 PM, aaron morton wrote: > For background, you may find the wide row setting useful: > http://www.datastax.com/docs/1.1/cluster_architecture/hadoop_integration > > AFAIK all the input row readers for Hadoop do range scans. And I think the > support…

Re: Row caching + Wide row column family == almost crashed?

2012-12-03 Thread Yiming Sun
I ran into a different problem with the row cache recently and sent a message to the list, but it didn't get picked up. I am hoping someone can help me understand the issue. Our data also has rather wide rows, not necessarily in the thousands range, but definitely in the upper hundreds. They are…