We have a two data center test Cassandra setup running, and are writing
to it using LOCAL_QUORUM. When reading, sometimes the data is there and
sometimes it is not, which we think is a replication issue, even
though we have left plenty of time after the writes.
We have the following setup:
cassandr
Size and Capacity are in bytes. The RAM is consumed right after Cassandra
starts (3 GB heap). The reason for this could be the 400,000,000 rows on a single
node; the serialized bloom filters take 1.2 GB of HDD space.
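As a rough sanity check (my own back-of-the-envelope arithmetic, not something from the thread), the reported bloom filter size is roughly consistent with that row count, and in a pre-1.2 Cassandra the bloom filters live on the heap, so they alone can account for a large slice of a 3 GB heap right after startup:

# Back-of-the-envelope check (illustrative numbers taken from the message above).
# 1.2 GB of serialized bloom filters over 400 million row keys works out to
# roughly 26 bits per key across all sstables (a key can appear in the filter
# of every sstable that contains it), which is a plausible bloom filter size.
rows = 400_000_000
bloom_bytes = 1.2 * 1024 ** 3

bits_per_key = bloom_bytes * 8 / rows
print(f"{bits_per_key:.1f} bits per key")  # ~25.8 bits per key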
On Mon, Dec 3, 2012 at 10:14 AM, Maciej Miklas wrote:
> Hi,
>
> I have following Cassandra s
Hello Owen,
It seems you did not configure the tokens for all nodes correctly. See the section
"Calculating Tokens for Multiple Data Centers" here:
http://www.datastax.com/docs/0.8/install/cluster_init
Best regards
Shamim
---
On Mon, Dec 3, 2012 at 4:42 PM, Owen Davies wrote:
We have a 2 data cent
Hi Shamim
I have read a bit about tokens. I understand how that could affect
the data distribution at first, but I don't understand why it matters: if we have
specified Options: [dc1:3, dc2:3], surely after a while all the data
will be on every server?
Thanks,
Owen
On 3 December 2012 14:06, Шамим wrote:
>
Yes, it should be. dc1:3 means you have 3 copies of every row, and with
LOCAL_QUORUM you always get good consistency from 3 nodes.
First you have to calculate the tokens for data center dc1 and then add an offset of
100 to each token for the second data center, which will resolve your problem. After
creating th
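To make that concrete, here is a minimal sketch of the token calculation, assuming the RandomPartitioner (token space 0 to 2**127) and 3 nodes per data center; the +100 offset for dc2 follows the advice above, the rest is illustrative.

# Sketch: initial_token values for two data centers, assuming RandomPartitioner
# and 3 nodes per DC. Each computed value goes into initial_token in cassandra.yaml
# on the corresponding node.
RING_SIZE = 2 ** 127
NODES_PER_DC = 3
OFFSET_DC2 = 100  # small offset so dc2 tokens do not collide with dc1 tokens

dc1_tokens = [i * RING_SIZE // NODES_PER_DC for i in range(NODES_PER_DC)]
dc2_tokens = [t + OFFSET_DC2 for t in dc1_tokens]

print("dc1:", dc1_tokens)
print("dc2:", dc2_tokens)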
Could you provide more details on how to use it? Let's say I already have a
huge sstable. What am I supposed to do to split it?
Thank you,
Andrey
On Sat, Dec 1, 2012 at 11:29 AM, Radim Kolar wrote:
> from time to time people ask here for splitting large sstables, here is
> patch doing that
>
>
Apply the patch and recompile.
Define the "max_sstable_size" compaction strategy property on the CF you want to split,
then run a compaction.
> SerializingCacheProvider reports a size of 196 million; how can I
> interpret this number?
Can you include the output from nodetool info?
> 2) I am using default settings besides changes described above. Since key
> cache is small, and off heap cache is active, what is taking space in Old
>>> When reading, sometimes the data is there,
>>> sometimes it is not, which we think is a replication issue, even
>>> though we have left it plenty of time after the writes.
Can you provide some more information on this?
Are you talking about writes to one DC and reads from another?
Cheers
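To make the question concrete, here is a hypothetical sketch of the pattern being asked about, written with the pycassa client (not mentioned in the thread; hosts, keyspace and column family names are made up). LOCAL_QUORUM only waits for a quorum of replicas in the local data center, so a read in the other data center depends on asynchronous cross-DC replication having caught up.

# Hypothetical illustration with pycassa; names and hosts are invented.
import pycassa
from pycassa.cassandra.ttypes import ConsistencyLevel

dc1_pool = pycassa.ConnectionPool('example_ks', server_list=['dc1-node1:9160'])
dc2_pool = pycassa.ConnectionPool('example_ks', server_list=['dc2-node1:9160'])

cf_dc1 = pycassa.ColumnFamily(dc1_pool, 'example')
cf_dc2 = pycassa.ColumnFamily(dc2_pool, 'example')

# Write acknowledged by a quorum of replicas in dc1 only.
cf_dc1.insert('key', {'123': 'value'},
              write_consistency_level=ConsistencyLevel.LOCAL_QUORUM)

# Read served by a quorum of replicas in dc2; this can raise NotFoundException
# if the write has not yet been replicated across data centers.
row = cf_dc2.get('key', read_consistency_level=ConsistencyLevel.LOCAL_QUORUM)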
> > Disabling row caching on this new column family has resolved the issue
> > for now, but, is there something fundamental about row caching that I am
> > missing?
What cache provider were you using? Check the row_cache_provider setting in
the yaml file.
If you were using the ConcurrentLinkedH
We have written a large amount of data to Cassandra from another
database. When writing, the client was set to write at LOCAL_QUORUM.
A few days after writing the data, we tried this on cassandra-cli
get example['key'][123];
Value was not found
Elapsed time: 50 msec(s).
Then a bit later
get datapoi
For background, you may find the wide row setting useful
http://www.datastax.com/docs/1.1/cluster_architecture/hadoop_integration
AFAIK all the input row readers for Hadoop do range scans. And I think the
support for setting the start and end token is used so that jobs only select
data which is
Thanks Sylvain, interesting feedback.
For now a higher consistency level is not fault tolerant (two node setup), but it is
worth considering for other deployments.
Further investigation shows that the system has quite a lot of iowait. It seems
that we are pushing it too far.
It is obvious that counter cf's increments
On Sat, Dec 1, 2012 at 9:29 AM, Radim Kolar wrote:
> from time to time people ask here for splitting large sstables, here is
> patch doing that
>
> https://issues.apache.org/jira/browse/CASSANDRA-4897
Interesting, thanks for the contribution! :D
For those who might find this thread via other mea
A Cassandra JVM will generally not function well with caches and
wide rows. Probably the most important thing to understand is Ed's
point, that the row cache caches the entire row, not just the slice that
was read out. What you've seen is almost exactly the behaviour
I'd expect wi
Thanks! Very helpful.
On Mon, Dec 3, 2012 at 4:04 PM, aaron morton wrote:
> For background, you may find the wide row setting useful
> http://www.datastax.com/docs/1.1/cluster_architecture/hadoop_integration
>
> AFAIK all the input row readers for Hadoop do range scans. And I think the
> support
I ran into a different problem with the row cache recently; I sent a message to
the list, but it didn't get picked up. I am hoping someone can help me
understand the issue. Our data also has rather wide rows, not necessarily
in the thousands range, but definitely in the upper hundreds. They
ar