Re: Upgrading Cassandra

2014-04-10 Thread Robert Coli
On Thu, Apr 10, 2014 at 3:52 PM, Tyler Hobbs wrote: > > Given the complexity of it and the multiple places this could give you > trouble if you're not careful, I wouldn't suggest it as a general best > practice. > +1. Any plan that contains an extended period of split major version operation is

Re: Multiget performance

2014-04-10 Thread Tyler Hobbs
On Thu, Apr 10, 2014 at 6:26 PM, Allan C wrote: > > Looks like the amount of data returned has a big effect. When I only > return one column, python reports only 20ms compared to 150ms when > returning the whole row. Rows are each less than 1k in size, but there must > be client overhead. > That

Re: Upgrading Cassandra

2014-04-10 Thread Tyler Hobbs
On Thu, Apr 10, 2014 at 4:03 PM, Alain RODRIGUEZ wrote: > Would you guys consider this way of upgrading as a "best practice" to > achieve a safe major release upgrade in the cloud (where you can easily add > clusters and remove old ones) ? Given the complexity of it and the multiple places this

Re: How to replace cluster name without any impact?

2014-04-10 Thread Robert Coli
On Wed, Apr 9, 2014 at 10:50 PM, Mark Reddy wrote: > > Please be aware that you will have two partial clusters until you complete > your rolling restart. Also considering that the cluster name is only a > cosmetic value my opinion would be to leave it, as the risk far outweighs > the benefits of c

Re: Point in Time Recovery

2014-04-10 Thread Robert Coli
On Thu, Apr 10, 2014 at 1:19 AM, Dennis Schwan wrote: > do you know any description how to perform a point-in-time recovery > using the archived commitlogs? > We have already tried several things but it just did not work. > Are you restoring the entire *cluster* to a point in time, or a given nod

Re: Upgrading Cassandra

2014-04-10 Thread Alain RODRIGUEZ
Thanks for this confirmation Tyler. Would you guys consider this way of upgrading as a "best practice" to achieve a safe major release upgrade in the cloud (where you can easily add clusters and remove old ones) ? I am seriously thinking about giving it a try for our upcoming 1.2 to 2.0 migration

Re: Commitlog questions

2014-04-10 Thread Russell Hatch
> > If the commitlog is in periodic mode and the fsync happens every 10 > seconds, Cassandra is storing the stuff that needs to be sync'd somewhere > for a period of 10 seconds. I'm talking about before it even hits any > disk. This has to be in memory, correct? The information you are referri

Re: Cassandra memory consumption

2014-04-10 Thread DuyHai Doan
"what portion of the above is in the memtable ?" --> partition key + clustering key + stored data + memtable data structure size (actually it is a ConcurrentSkipListMap so I guess there is some overhead with the data structure) If the data has been "flushed" to disk (data directory) the memtable

RE: Cassandra memory consumption

2014-04-10 Thread Parag Patel
If I'm inserting the following : Partition key = 8 byte String Clustering key = 20 byte String Stored Data = 150 byte byte[] If the insert is still in the memtable, what portion of the above is in the memtable? All of it, or just the keys? If just the keys, where does the stored data live? (

Re: Cassandra memory consumption

2014-04-10 Thread DuyHai Doan
Data structures that are stored off heap: 1) Row cache (if JNA enabled, otherwise on heap) 2) Bloom filter 3) Compression offset 4) Key Index sample On heap: 1) Memtables 2) Partition Key cache Hope that I did not forget anything Regards Duy Hai DOAN On Thu, Apr 10, 2014 at 9:13 PM, Pa

Re: binary protocol server side sockets

2014-04-10 Thread Eric Plowe
I am having the exact same issue. I see the connections pile up and pile up, but they never seem to come down. Any insight into this would be amazing. Eric Plowe On Wed, Apr 9, 2014 at 4:17 PM, graham sanderson wrote: > Thanks Michael, > > Yup keepalive is not the default. It is possible they

Cassandra memory consumption

2014-04-10 Thread Parag Patel
We're using Cassandra 1.2.12. What aspects of the data are stored in off-heap memory vs. heap memory?

RE: Commitlog questions

2014-04-10 Thread Parag Patel
Oleg, Thanks for the response. If the commitlog is in periodic mode and the fsync happens every 10 seconds, Cassandra is storing the stuff that needs to be sync'd somewhere for a period of 10 seconds. I'm talking about before it even hits any disk. This has to be in memory, correct? Parag
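
The question above can be illustrated with a toy model. This is only a sketch under stated assumptions, not Cassandra's implementation: the class name `PeriodicCommitLog`, its methods, and the explicit `now_ms` clock are hypothetical, and the 10-second period is the figure quoted in the message. It shows that mutations appended between two syncs sit in memory and are only covered by a sync once the period has elapsed.

```python
class PeriodicCommitLog:
    """Toy model of a commit log synced on a fixed period (hypothetical, not Cassandra code)."""

    def __init__(self, sync_period_ms=10_000):
        self.sync_period_ms = sync_period_ms
        self.pending = []      # appended but not yet synced (held in memory)
        self.durable = []      # covered by a completed sync
        self.last_sync_ms = 0

    def append(self, mutation, now_ms):
        # The write is accepted without waiting for a sync.
        self.pending.append(mutation)
        self.maybe_sync(now_ms)

    def maybe_sync(self, now_ms):
        # Sync only once a full period has elapsed since the last sync.
        if now_ms - self.last_sync_ms >= self.sync_period_ms:
            self.durable.extend(self.pending)
            self.pending.clear()
            self.last_sync_ms = now_ms


log = PeriodicCommitLog()
log.append("m1", now_ms=1_000)
log.append("m2", now_ms=5_000)
at_risk = list(log.pending)     # mutations not yet covered by a sync
log.maybe_sync(now_ms=10_000)   # the period has elapsed, so the sync runs
```

In this model, anything still in `pending` when the process dies is what a periodic sync mode would put at risk; the real server's buffering details may differ.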

Re: Minimum database size and ops/second to start considering Cassandra

2014-04-10 Thread Tim Wintle
On Thu, 2014-04-10 at 11:17 -0700, motta.lrd wrote: > What is the minimum database size and number of Operations/Second (reads and > writes) for which I should seriously consider this database? Significant number of writes / second -> possibly a good use case for Cassandra. Database size is a di

Minimum database size and ops/second to start considering Cassandra

2014-04-10 Thread motta.lrd
Hello everyone, What is the minimum database size and number of Operations/Second (reads and writes) for which I should seriously consider this database? I have recently studied the theoretical aspects of Cassandra distributions, and my doubts are left to what is a good fit (in terms of database s

C* 1.2.15 Decommission issues

2014-04-10 Thread Russell Bradberry
We have a cluster of about 30 nodes running the latest C* 1.2 series DSE. One datacenter uses VNodes and the other datacenter has VNodes disabled (because it is running DSE-Search). We have been replacing nodes in the VNode datacenter with faster ones and we have yet to have a successful decommiss

More node imbalance questions

2014-04-10 Thread Oleg Dulin
At a different customer, I have this situation: 10.194.2.5 RAC1 Up Normal 192.2 GB 50.00% 0 10.194.2.4 RAC1 Up Normal 348.07 GB 50.00% 127605887595351923798765477786913079295 10.194.2.7 RAC1 Up Normal 387.31 GB

Re: Point in Time Recovery

2014-04-10 Thread Jonathan Lacefield
Hello, Have you tried the procedure documented here: http://www.datastax.com/documentation/cassandra/1.2/cassandra/configuration/configLogArchive_t.html Thanks, Jonathan Jonathan Lacefield Solutions Architect, DataStax (404) 822 3487

Re: Commitlog questions

2014-04-10 Thread Panagiotis Garefalakis
The incoming mutations are written per column in a Memtable (an in-memory cache). The default size for this table is 64MB, if I recall correctly. For more information take a look here: https://wiki.apache.org/cassandra/MemtableSSTable http://wiki.apache.org/cassandra/MemtableThresholds Regards

AssertionError as a result of a timeout

2014-04-10 Thread Ben Hood
Hi all, This is just a follow up to say that this issue is being tracked here: https://issues.apache.org/jira/browse/CASSANDRA-6796 I managed to work around this issue for my workload by increasing the write timeout threshold in the server, but YMMV. Sorry that the original list thread had an e

Re: Multiget performance

2014-04-10 Thread DuyHai Doan
As far as I understood, the multiget performance is bound to the slowest node responding to the coordinator. If you are fetching 100 partitions within *n* nodes, the coordinator will issue requests to those nodes and wait until all the responses are given back before returning the results to the
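
The point about the slowest node can be sketched as follows. This is a minimal illustration of the poster's understanding, not a measurement: the function name `multiget_latency_ms`, the node names, and the latency figures are all made up for the example. If the coordinator waits for every contacted node, the overall latency is the maximum of the per-node latencies, not their average.

```python
def multiget_latency_ms(node_latencies_ms):
    """Latency of a multiget when the coordinator waits for all contacted nodes."""
    if not node_latencies_ms:
        raise ValueError("at least one node must be contacted")
    # The slowest responder determines when the combined result can be returned.
    return max(node_latencies_ms.values())


# Hypothetical per-node response times in milliseconds.
latencies = {"node_a": 12, "node_b": 15, "node_c": 140}
total = multiget_latency_ms(latencies)
average = sum(latencies.values()) / len(latencies)
```

Here one slow node dominates the result even though the other two respond quickly, which is why spreading a multiget across more nodes raises the chance of hitting a slow one.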

Point in Time Recovery

2014-04-10 Thread Dennis Schwan
Hey there, do you know of any description of how to perform a point-in-time recovery using the archived commitlogs? We have already tried several things but it just did not work. We have a 20 Node Cluster (10 in each DC). Thanks in Advance, Dennis -- Dennis Schwan Oracle DBA Mail Core 1&1 Internet