Cassandra 2 Upgrade

2013-09-11 Thread Christopher Wirt
Hello, I'm keen on moving to 2.0. The new thrift server implementation and other performance improvements are getting me excited. I'm currently running 1.2.8 in 3 DCs with 3-3-9 nodes, 64GB RAM, 3x200GB SSDs, thrift, LCS, Snappy, vnodes. Is anyone using 2.0 in production yet? Had any is

Re: Composite Column Grouping

2013-09-11 Thread Laing, Michael
Then you can do this. I handle millions of entries this way and it works well if you are mostly interested in recent activity. If you need to span all activity then you can use a separate table to maintain the 'latest'. This table should also be sharded as entries will be 'hot'. Sharding will spre
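
The sharding idea mentioned above can be sketched roughly as follows. This is a minimal illustration, not the poster's actual script: the shard count, the `bucket:shard` key layout, and the function names are all assumptions made for the example. It uses `zlib.crc32` rather than Python's built-in `hash()` because the latter is salted per process and would not give stable shard assignments.

```python
import zlib

NUM_SHARDS = 8  # hypothetical shard count; tune to how hot the table is


def shard_key(bucket, entry_id, num_shards=NUM_SHARDS):
    """Build a partition key that spreads one hot bucket over several partitions.

    The shard is derived from a stable hash of the entry id, so the same
    entry always lands in the same partition.
    """
    shard = zlib.crc32(entry_id.encode("utf-8")) % num_shards
    return f"{bucket}:{shard}"


def all_shard_keys(bucket, num_shards=NUM_SHARDS):
    """List every partition key a reader must query to see the whole bucket."""
    return [f"{bucket}:{shard}" for shard in range(num_shards)]


# Writes go to exactly one shard; reads fan out over all of them.
write_key = shard_key("latest", "user-1")
read_keys = all_shard_keys("latest")
```

The trade-off is that writes touch one partition while a read of the full 'latest' set has to query every shard and merge the results on the client.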

Re: read consistency and clock drift and ntp

2013-09-11 Thread Paulo Motta
Here are some links related to C* and clock synchronization: http://www.datastax.com/dev/blog/why-cassandra-doesnt-need-vector-clocks http://ria101.wordpress.com/2011/02/08/cassandra-the-importance-of-system-clocks-avoiding-oom-and-how-to-escape-oom-meltdown/ 2013/9/11 Jimmy Lin > hi, > I have

Re: Long running nodetool move operation

2013-09-11 Thread Ike Walker
The restart worked. Thanks, Rob! After the restart I ran 'nodetool move' again, used 'nodetool netstats | grep -v "0%"' to verify that data was actively streaming, and the move completed successfully. -Ike On Sep 10, 2013, at 11:04 AM, Ike Walker wrote: > Below is the output of "nodetool ne

Re: FileNotFoundException while inserting (1.2.8)

2013-09-11 Thread Keith Freeman
Yes, I started with a fresh keyspace (dropped and re-created) to run this test. On 09/10/2013 02:01 PM, sankalp kohli wrote: Have you dropped and recreated a keyspace with the same name recently? On Tue, Sep 10, 2013 at 8:40 AM, Keith Freeman <8fo...@gmail.com > wrot

Re: Error during startup - java.lang.OutOfMemoryError: unable to create new native thread

2013-09-11 Thread srmore
Thanks Viktor, - check (cassandra-env.sh) -Xss size, you may need to increase it for your JVM; This seems to have done the trick ! Thanks ! On Tue, Sep 10, 2013 at 12:46 AM, Viktor Jevdokimov < viktor.jevdoki...@adform.com> wrote: > For start: > > - check (cassandra-env.sh) -Xss size, yo

cass 1.2.8 -> 1.2.9

2013-09-11 Thread Christopher Wirt
Anyone had issues upgrading to 1.2.9? I tried upgrading one server in a three node DC. The server appeared to come online fine without any errors, handshaking, etc. Looking at tpstats, the machine was serving very few reads. Looking from the server side we were getting a lot of Unavailable

Re: making sure 1 copy per availability zone(rack) using EC2Snitch

2013-09-11 Thread rash aroskar
Thanks that is helpful. On Tue, Sep 10, 2013 at 3:52 PM, Robert Coli wrote: > On Mon, Sep 9, 2013 at 11:21 AM, rash aroskar wrote: > >> Are you suggesting deploying 1.2.9 only if using Cassandra "DC" outside >> of EC2 or if I wish to use rack replication at all? >> > > 1) use 1.2.9 no matter wh

Re: heavy insert load overloads CPUs, with MutationStage pending

2013-09-11 Thread Keith Freeman
I have RF=2 On 09/10/2013 11:18 AM, Robert Coli wrote: On Tue, Sep 10, 2013 at 10:17 AM, Robert Coli > wrote: On Tue, Sep 10, 2013 at 7:55 AM, Keith Freeman <8fo...@gmail.com > wrote: On my 3-node cluster (v1.2.8) with 4-cor

Re: heavy insert load overloads CPUs, with MutationStage pending

2013-09-11 Thread Keith Freeman
On 09/10/2013 11:42 AM, Nate McCall wrote: With SSDs, you can turn up memtable_flush_writers - try 3 initially (1 by default) and see what happens. However, given that there are no entries in 'All time blocked' for such, they may be something else. Tried that, it seems to have reduced the loads

Re: FileNotFoundException while inserting (1.2.8)

2013-09-11 Thread Robert Coli
On Wed, Sep 11, 2013 at 6:49 AM, Keith Freeman <8fo...@gmail.com> wrote: > Yes, I started with a fresh keyspace (dropped and re-created) to run this > test. > https://issues.apache.org/jira/browse/CASSANDRA-4219 =Rob

Re: FileNotFoundException while inserting (1.2.8)

2013-09-11 Thread Robert Coli
On Wed, Sep 11, 2013 at 10:12 AM, Keith Freeman <8fo...@gmail.com> wrote: > I had seen that issue before, but it's marked Resolved/Fixed in v1.1.1, > and I'm on v1.2.8. Also it talks about not being able to re-create the > keyspace, while my problem is that after re-creating, I eventually get >

Re: Cassandra 2 Upgrade

2013-09-11 Thread Robert Coli
On Wed, Sep 11, 2013 at 2:59 AM, Christopher Wirt wrote: > I’m keen on moving to 2.0. The new thrift server implementation and other > performance improvements are getting me excited. > > ** > > I’m currently running 1.2.8 in 3 DC’s with 3-3-9 nodes 64GB RAM, 3x200GB > SSDs, thrift, LCS, Snappy,

Re: FileNotFoundException while inserting (1.2.8)

2013-09-11 Thread Keith Freeman
I had seen that issue before, but it's marked Resolved/Fixed in v1.1.1, and I'm on v1.2.8. Also it talks about not being able to re-create the keyspace, while my problem is that after re-creating, I eventually get FileNotFound exceptions. This has happened to me several times in testing, this

Re: Cassandra 2 Upgrade

2013-09-11 Thread Robert Coli
On Wed, Sep 11, 2013 at 2:59 AM, Christopher Wirt wrote: > Am I ok running a mixed cluster for 24 hours? E.g. I switched just one DC > for 24 hours as a test. > (missed this line) This is unsupported, generally. It may or may not work. I wouldn't do it. =Rob

Re: Composite Column Grouping

2013-09-11 Thread Laing, Michael
Here's a slightly better version and a python script. -ml -- put this in and run using 'cqlsh -f DROP KEYSPACE latest; CREATE KEYSPACE latest WITH replication = { 'class': 'SimpleStrategy', 'replication_factor' : 1 }; USE latest; CREATE TABLE time_series ( bucket_userid text, --

RE: heavy insert load overloads CPUs, with MutationStage pending

2013-09-11 Thread Paul Cichonski
How much of the data you are writing is going against the same row key? I've experienced some issues using CQL to write a full wide-row at once (across multiple threads) that exhibited some of the symptoms you have described (i.e., high cpu, dropped mutations). This question goes into it a bi

Re: cqlsh error after enabling encryption

2013-09-11 Thread Les Hazlewood
bump. Any ideas? We're seeing the same issue on 2.0 as well. Thanks! On Tue, Sep 3, 2013 at 2:20 PM, David Laube wrote: > Hi All, > > After enabling encryption on our Cassandra 1.2.8 nodes, we are receiving the > error "Connection error: TSocket read 0 bytes" while attempting to use CQLsh > to tal

Complex JSON objects

2013-09-11 Thread Hartzman, Leslie
Hi, What would be the recommended way to deal with a complex JSON structure, short of storing the whole JSON as a value to a column? What options are there to store dynamic data like this? e.g., { " readings": [ { "value" : 20, "ra

Re: heavy insert load overloads CPUs, with MutationStage pending

2013-09-11 Thread Keith Freeman
Thanks, I had seen your stackoverflow post. I've got hundreds of (wide-) rows, and the writes are pretty well distributed across them. I'm very reluctant to drop back to the thrift interface. On 09/11/2013 10:46 AM, Paul Cichonski wrote: How much of the data you are writing is going against

Re: Complex JSON objects

2013-09-11 Thread Edward Capriolo
I was playing a while back with the concept of storing JSON into cassandra columns in a sortable way. Warning: This is kinda just a cool idea, I never productionized it. https://github.com/edwardcapriolo/Cassandra-AnyType On Wed, Sep 11, 2013 at 2:26 PM, Hartzman, Leslie < leslie.d.hartz...@med

RE: heavy insert load overloads CPUs, with MutationStage pending

2013-09-11 Thread Paul Cichonski
I was reluctant to use the thrift as well, and I spent about a week trying to get the CQL inserts to work by partitioning the INSERTS in different ways and tuning the cluster. However, nothing worked remotely as well as the batch_mutate when it came to writing a full wide-row at once. I think C

Re: Cassandra input paging for Hadoop

2013-09-11 Thread Jiaan Zeng
Speaking of the thrift client, i.e. ColumnFamilyInputFormat, yes, ConfigHelper.setRangeBatchSize() can reduce the number of rows sent to Cassandra. Depending on how big your columns are, you may also want to increase the thrift message length through setThriftMaxMessageLengthInMb(). Hope that helps. On Tue,

Re: Complex JSON objects

2013-09-11 Thread Paulo Motta
What you can do to store a complex json object in a C* skinny row is to serialize each field independently as a Json String and store each field as a C* column within the same row (representing a JSON object). So using the example you mentioned, you could store it in cassandra as: ColumnFamily["o
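
A minimal sketch of the per-field serialization described above, assuming the JSON object has already been parsed into a Python dict. The sample object, the function names, and the plain dict standing in for a row's columns are hypothetical; no Cassandra client call is shown.

```python
import json


def to_columns(obj):
    """Serialize each top-level field on its own, giving one column per field."""
    return {name: json.dumps(value, sort_keys=True) for name, value in obj.items()}


def from_columns(columns):
    """Rebuild the original object from its per-field JSON strings."""
    return {name: json.loads(text) for name, text in columns.items()}


# Hypothetical object; each key becomes a column name, each value a JSON string.
reading = {"id": 42, "tags": ["a", "b"], "meta": {"ok": True}}
columns = to_columns(reading)
restored = from_columns(columns)
```

Because every field is an independent column, a single field can be read or overwritten without touching the rest of the object, at the cost of nested values remaining opaque strings.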

Re: Complex JSON objects

2013-09-11 Thread Laing, Michael
A way to do this would be to express the JSON structure as (path, value) tuples and then use a map to store them. For example, your JSON above can be expressed as shown below where the path is a list of keys/indices and the value is a scalar. You could also concatenate the path elements and use t
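
The (path, value) idea can be sketched in a few lines of Python. This is an illustration under assumptions: the sample document and its `unit` field are invented for the example, and the "/" separator used to concatenate path elements is only one possible choice.

```python
import json


def flatten(node, path=()):
    """Yield (path, value) pairs for every scalar in a parsed JSON document.

    The path is a tuple of dict keys and list indices. Empty dicts and
    empty lists yield nothing, so they are lost in this representation.
    """
    if isinstance(node, dict):
        for key, child in node.items():
            yield from flatten(child, path + (key,))
    elif isinstance(node, list):
        for index, child in enumerate(node):
            yield from flatten(child, path + (index,))
    else:
        yield path, node


# Hypothetical document loosely modelled on the "readings" example in the thread.
doc = json.loads('{"readings": [{"value": 20, "unit": "C"}, {"value": 21, "unit": "C"}]}')
pairs = list(flatten(doc))

# Concatenating the path elements gives string keys suitable for a map column.
path_map = {"/".join(str(part) for part in path): value for path, value in pairs}
```

Joining the path into a string assumes that no key contains the separator; keeping the path as a list avoids that ambiguity.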

Re: is the select result grouped by the value of the partition key?

2013-09-11 Thread John Lumby
I would like to make quite sure about this implicit GROUP BY "feature", since it seems really important yet does not seem to be mentioned in the CQL reference documentation. Aaron, you said "yes" -- is that "yes, always, in all scenarios no matter what" or "yes usually"? Is it so

Re: heavy insert load overloads CPUs, with MutationStage pending

2013-09-11 Thread Keith Freeman
I have the defaults as shown in your response. On 09/10/2013 01:59 PM, sankalp kohli wrote: What have you set these to? # commitlog_sync may be either "periodic" or "batch." # When in batch mode, Cassandra won't ack writes until the commit log # has been fsynced to disk. It will wait up to # co

VMs versus Physical machines

2013-09-11 Thread Shahab Yunus
Hello, We are deciding whether to get VMs or physical machines for a Cassandra cluster. I know this is a very high-level question depending on lots of factors, and in fact I want to know how to tackle it and what factors we should take into consideration while trying to find the answer.

Re: VMs versus Physical machines

2013-09-11 Thread Aaron Turner
Physical machines unless you're running your cluster in the cloud (AWS/etc). Reason is simple: Look how Cassandra scales and provides redundancy. Aaron Turner http://synfin.net/ Twitter: @synfinatic https://github.com/synfinatic/tcpreplay - Pcap editing and replay tools for Unix & Windows

Re: VMs versus Physical machines

2013-09-11 Thread Shahab Yunus
Thanks Aaron for the reply. Yes, VMs or the nodes will be in cloud if we don't go the physical route. " Look how Cassandra scales and provides redundancy. " But how does it differ for physical machines or VMs (in cloud.) Or after your first comment, are you saying that there is no difference whet

Re: VMs versus Physical machines

2013-09-11 Thread Robert Coli
On Wed, Sep 11, 2013 at 4:40 PM, Shahab Yunus wrote: > But how does it differ for physical machines or VMs (in cloud.) Or after > your first comment, are you saying that there is no difference whether we > use physical or VMs (in cloud)? > Physical will always outperform virtual. He's just saying

Re: is the select result grouped by the value of the partition key?

2013-09-11 Thread Aaron Morton
> GROUP BY "feature", I would not think of it like that, this is about physical order of rows. > since it seems really important yet does not seem to be mentioned in the > CQL reference documentation. It's baked in, this is how the data is organised on the row. http://www.datastax.com/dev/blog

Re: Cassandra input paging for Hadoop

2013-09-11 Thread Aaron Morton
>> >> I'm looking at the ConfigHelper.setRangeBatchSize() and >> CqlConfigHelper.setInputCQLPageRowSize() methods, but a bit confused if >> that's what I need and if yes, which one should I use for those purposes. If you are using CQL 3 via Hadoop CqlConfigHelper.setInputCQLPageRowSize is the one

Re: FileNotFoundException while inserting (1.2.8)

2013-09-11 Thread Sankalp Kohli
The reason this is happening is that there are two instances of the SSTableReader object. A restart of Cassandra will fix the issue. On Sep 11, 2013, at 10:23, Robert Coli wrote: > On Wed, Sep 11, 2013 at 10:12 AM, Keith Freeman <8fo...@gmail.com> wrote: >> I had seen that issue before, but it'

Re: VMs versus Physical machines

2013-09-11 Thread Aaron Turner
On Wed, Sep 11, 2013 at 4:40 PM, Shahab Yunus wrote: > Thanks Aaron for the reply. Yes, VMs or the nodes will be in cloud if we > don't go the physical route. > > " Look how Cassandra scales and provides redundancy. " > But how does it differ for physical machines or VMs (in cloud.) Or after > yo

Re[2]: Cassandra input paging for Hadoop

2013-09-11 Thread Renat Gilfanov
Hello, So does it mean that the job will process only the first "cassandra.input.page.row.size" rows and ignore the rest? Or does CqlPagingRecordReader support paging through the entire result set? Aaron Morton : >>> >>>I'm looking at the ConfigHelper.setRangeBatchSize() and >>>CqlConfigHelper.setInputCQL