Re: Cassandra + Hadoop - 2 Task attempts with million of rows

2013-04-25 Thread Shamim
Hello Aaron, I have got the following Log from the server (Sorry for being late) job_201304231203_0004 attempt_201304231203_0004_m_000501_0 2013-04-23 16:09:14,196 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library 2013-04-23 16:09:14,438 INF

Secondary Index on table with a lot of data crashes Cassandra

2013-04-25 Thread Tamar Rosen
Hi, We have a case of a reproducible crash, probably due to out of memory, but I don't understand why. The installation is currently single node. We have a column family with approx 5 rows. In cql, the CF definition is: CREATE TABLE users ( user_name text PRIMARY KEY, big_json text,

Re: Secondary Index on table with a lot of data crashes Cassandra

2013-04-25 Thread Ondřej Černoš
Hi, if you are able to reproduce the issue, file a ticket on - my experience is developers respond quickly on issues that are clearly a bug. regards, ondrej cernos On Thu, Apr 25, 2013 at 10:03 AM, Tamar Rosen wrote: > Hi, > > We have a case of

Performance / limitations of WHERE ... IN queries

2013-04-25 Thread Thierry Templier
Hello, I wonder how WHERE ... IN queries perform, especially as the number of elements in the IN clause grows? Thanks very much for your help! Thierry

RE: Secondary Index on table with a lot of data crashes Cassandra

2013-04-25 Thread moshe.kranc
IMHO: user_name is not a column, it is the row key. Therefore, according to , the row does not contain a relevant column index, which causes the iterator to read each column (including value) of each row. I believe that instead of refer

Re: Repair Freeze / Gossip Invisibility / EC2 Public IP configuration

2013-04-25 Thread Ondřej Černoš
Hi, I have a similar issue with stuck repair. Similar multiregion setup, only between us-east and a private cloud at Rackspace. The log mentions merkle tree exchanges and I see a lot of dropped communication: I will comment on your ticket in Jira. regards, ondrej cernos On Fri, Apr 19, 2013 at

1.2.3 and 1.2.4 memory usage growth on idle cluster

2013-04-25 Thread Igor
Hello, has anybody seen memory problems on an idle cluster? I have an 8-node ring with Cassandra 1.2.3 which has never been used and has stayed idle for several weeks. Yesterday, when I decided to upgrade it to 1.2.4, I found a lot of messages like INFO 11:10:56,273 GC for ParNew: 1039 ms for 1 collections, 663

Deletes, null values

2013-04-25 Thread Alain RODRIGUEZ
Hi, I tried to delete some columns using cql2 as well as thrift on C*1.2.2 and instead of being unreachable, deleted columns have a null value. I am using no value in this CF; the only information I use is the existence of the column. So when I select all the columns for a given key I have the foll

DB Change management tools for Cassandra?

2013-04-25 Thread Marko Asplund
hi, Do database change management tools similar to Liquibase and dbdeploy exist for Cassandra? I need to handle change management for CQL3 schema. thanks, marko

Re: DB Change management tools for Cassandra?

2013-04-25 Thread Brian O'Neill
I haven't seen any, which has one of our developers (CC'd) looking at extending myBatis migrations and/or Flyway with CQL to do it. -brian --- Brian O'Neill Lead Architect, Software Development Health Market Science The Science of Better Results 2700 Horizon Drive • King of Prussia, PA • 19406 M

Re: Deletes, null values

2013-04-25 Thread Sorin Manolache
On 2013-04-25 11:48, Alain RODRIGUEZ wrote: Hi, I tried to delete some columns using cql2 as well as thrift on C*1.2.2 and instead of being unreachable, deleted columns have a null value. I am using no value in this CF, the only information I use is the existence of the column. So when I select

Adding nodes in 1.2 with vnodes requires huge disks

2013-04-25 Thread John Watson
After finally upgrading to 1.2.3 from 1.1.9, enabling vnodes, and running upgradesstables, I figured it would be safe to start adding nodes to the cluster. Guess not? It seems when new nodes join, they are streamed *all* sstables in the cluster.

How to change existing cluster to multi-center

2013-04-25 Thread Daning Wang
Hi All, We have an 8-node cluster (replication factor is 3), about 50G data on each node. We need to change the cluster to a multi-center environment (to EC2). The data needs to have one replica on EC2. Here is the plan: - Change cluster config to multi-center. - Add 2 or 3 nodes in another center, whic

Cassandra remote backup solution

2013-04-25 Thread Daning Wang
Hi Guys, What is the cassandra solution for remote backup besides multi-center? I hope I can do incremental backup to remote database center. Thanks, Daning

Re: Cassandra remote backup solution

2013-04-25 Thread Robert Coli
On Thu, Apr 25, 2013 at 3:04 PM, Daning Wang wrote: > What is the cassandra solution for remote backup besides multi-center? I > hope I can do incremental backup to remote database center. Your semi-automated options which do not involve replicating to a remote cluster include : 1) tablesnap/ta

Re: Ec2Snitch to Ec2MultiRegionSnitch

2013-04-25 Thread aaron morton
> So you mean this part doesn't need more testing ? This will work for sure ? > Did you already did it yourself ? Always test. But if you only had one AZ then all nodes will be in one Rack, so the NTS will not behave differently. > C* will be able to reach the LOCAL_QUORUM everywhere, won't i

Re: readable (not hex encoded) column names using sstable2json

2013-04-25 Thread aaron morton
> First of all thanks for the response. We’re trying to copy existing data into > a keyspace with a different name on the same server. I’m not sure why our > operations team wants this. You can just rename the files. > java.lang.NumberFormatException: Non-hex characters in hertz.246944493-2012

Re: Advice on memory warning

2013-04-25 Thread aaron morton
There have been a lot of discussions about GC tuning on the mail thread. Here's a really quick set of guidelines I use, please search the mail archive if it does not answer your question. If heavy GC activity correlates with cassandra compaction, do one or more of: * reduce concurrent_compactio

Re: Unable to drop secondary index

2013-04-25 Thread aaron morton
You can drop the hints via JMX and stopping the node and deleting the SSTables. Cheers - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton On 25/04/2013, at 12:27 AM, Michal Michalski wrote: > Not really sure if it has something

Re: Really odd issue (AWS related?)

2013-04-25 Thread aaron morton
> The messages appear right after the node "wakes up". Are you tracking CPU steal ? - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton On 25/04/2013, at 4:15 AM, Robert Coli wrote: > On Wed, Apr 24, 2013 at 5:03 AM, Michael Ther

Re: Really odd issue (AWS related?)

2013-04-25 Thread Michael Theroux
Sorry, Not sure what CPU steal is :) I have AWS console with detailed monitoring enabled... things seem to track close to the minute, so I can see the CPU load go to 0... then jump at about the minute Cassandra reports the dropped messages, -Mike On Apr 25, 2013, at 9:50 PM, aaron morton wrote

Re: Cassandra + Hadoop - 2 Task attempts with million of rows

2013-04-25 Thread aaron morton
> 2013-04-23 16:09:17,838 INFO > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader: > Current split being processed ColumnFamilySplit((9197470410121435301, '-1] > @[p00nosql02.00, p00nosql01.00]) > Why it's split data from two nodes? we have 6 nodes cassandra cluster +

Re: Performance / limitations of WHERE ... IN queries

2013-04-25 Thread aaron morton
You are effectively doing a multi get. Getting more than one row at a time is normally faster, but there will be a drop off point where the improvements slow down. Run some tests. Also consider that each row you request creates RF number of commands spread around the thread pools for the row.
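
As a rough illustration of the fan-out mentioned above, the sketch below estimates how many commands a WHERE ... IN query could generate, assuming (as the message states) one command per replica for every requested row. The function name and the sample numbers are hypothetical, and the real count in a given cluster may differ, so treat this as back-of-the-envelope arithmetic rather than a description of Cassandra internals.

```python
def estimated_read_commands(num_keys: int, replication_factor: int) -> int:
    """Estimate commands for an IN query: one per replica per requested row.

    This follows the rule of thumb quoted in the thread; it is an
    assumption-based estimate, not a measurement.
    """
    if num_keys < 0:
        raise ValueError("num_keys must be non-negative")
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    return num_keys * replication_factor


if __name__ == "__main__":
    # Hypothetical IN-list sizes with a replication factor of 3.
    for num_keys in (10, 100, 1000):
        commands = estimated_read_commands(num_keys, 3)
        print(f"{num_keys} keys -> about {commands} commands")
```

The estimate grows linearly with the size of the IN list, which is one reason to test for the drop-off point rather than assume that larger batches always help.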

Re: DB Change management tools for Cassandra?

2013-04-25 Thread John Sanda
I had cobbled together a solution using Liquibase and the Cassandra JDBC driver. I started implementing it before the CQL driver was announced. The solution involved a patch and some Liquibase extensions which live at The patch will go into the 3.0

vnodes and load balancing - 1.2.4

2013-04-25 Thread David McNelis
So, I had 7 nodes that I set up using vnodes, 256 tokens each, no problem. I added two 512 token nodes, no problem, things seemed to balance. The next 3 nodes I added, all at 256 tokens, and they have a cumulative load of 116 MB (whereas the other nodes are at ~100GB and ~200GB (256 and 512 respe

Re: DB Change management tools for Cassandra?

2013-04-25 Thread Marko Asplund
John Sanda wrote: > I had cobbled together a solution using Liquibase and the Cassandra JDBC > driver. I started implementing it before the CQL driver was announced. The > solution involved a patch and some Liquibase extensions which live at > The p