Samal, that's pretty smart stuff.
From: samal [mailto:samalgo...@gmail.com]
Sent: Friday, June 01, 2012 11:24 AM
To: user@cassandra.apache.org
Subject: Re: Cassandra Data Archiving
I believe you are talking about "HDD space" consumed by user-generated
data which is no longer required after 15 days, or may be required later.
The first option is to use TTL, which you don't want to use. The second, as
Aaron pointed out, is snapshotting the data; but the data still exists in the
cluster and is only used for backup.
I think of like
On Fri, Jun 1, 2012 at 12:28 PM, Harshvardhan Ojha <
harshvardhan.o...@makemytrip.com> wrote:
> Problem statement:
>
> We are keeping daily generated data(user generated content) in
> Cassandra, but our application is using only 15 days old data. So how can
> we archive data older than 15 days so that we can reduce load on the Cassandra ring?
Problem statement:
We are keeping daily generated data (user-generated content) in Cassandra, but
our application uses only the last 15 days of data. So how can we archive data
older than 15 days so that we can reduce the load on the Cassandra ring?
Note: we can't apply TTL, as this data may be needed in future.
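With TTL ruled out, one pattern worth sketching (my own suggestion, not something stated in the thread) is to bucket rows by day, so that whole buckets older than 15 days can be snapshotted, copied off-node, and then dropped. A minimal Python sketch of the bucket bookkeeping, with illustrative key names:

```python
from datetime import date, timedelta

def bucket_key(base, day):
    # Row key embeds the day, e.g. "content:2012-06-01" (naming is illustrative)
    return "%s:%s" % (base, day.isoformat())

def split_buckets(days, today, keep_days=15):
    # Buckets newer than the cutoff stay online; older ones can be
    # snapshotted, copied off-node, and dropped from the cluster.
    cutoff = today - timedelta(days=keep_days)
    live = [d for d in days if d >= cutoff]
    archivable = [d for d in days if d < cutoff]
    return live, archivable

today = date(2012, 6, 1)
days = [today - timedelta(days=n) for n in range(30)]
live, old = split_buckets(days, today)
```

Dropping a whole day-bucket is cheap compared with deleting individual rows, and the snapshot of that bucket can be restored later if the data is needed again.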
Could be this
https://issues.apache.org/jira/browse/CASSANDRA-4201
But that talks about segments not being cleared at startup. Does not explain
why they were allowed to get past the limit in the first place.
Can you share some logs from the time the commit log got out of control ?
Cheers
--
Sounds like
https://issues.apache.org/jira/browse/CASSANDRA-4219?attachmentOrder=desc
Drop back to 1.0.10 and have a play.
Good luck.
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
On 1/06/2012, at 6:38 AM, Chen, Simon wrote:
> Hi,
> I am new to Cassandra.
> The ring (2 in DC1, 1 in DC2) looks OK, but the load on the new node in DC2
> is almost 0%.
Yeah, that's the way it will look.
> But all the other rows are not in the new node. Do I need to copy the data
> files from a node in DC1 to the new node?
How did you add the node? (see
http://www.d
The default value for rpc_timeout is 1 - 10 seconds.
You want the socket timeout to be higher than the rpc_timeout, otherwise the
client will give up before the server does.
Cheers
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
On 1/06/2012, at 3:2
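Aaron's rule of thumb (client socket timeout above the server's rpc_timeout) can be captured as a small sanity check. The function name and the 1000 ms margin are my own illustration; 10000 ms is the usual default for rpc_timeout_in_ms in cassandra.yaml:

```python
def client_timeout_ok(socket_timeout_ms, rpc_timeout_ms, margin_ms=1000):
    # The client-side socket timeout should exceed the server's
    # rpc_timeout (plus a little margin for network latency), so the
    # server times out first and the client sees its error.
    return socket_timeout_ms >= rpc_timeout_ms + margin_ms

# With rpc_timeout_in_ms at its usual default of 10000:
ok = client_timeout_ok(12000, 10000)
too_low = client_timeout_ok(5000, 10000)
```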
I suggest creating a ticket on https://issues.apache.org/jira/browse/CASSANDRA
with the details.
If it is an immediate concern see if you can find someone in the #cassandra
chat room http://cassandra.apache.org/
Cheers
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.
Look in the logs for errors or warnings. Also let us know what version you are
using.
I am guessing that node 2 still thought that node 1 was in the cluster when you
did the move, which should(?) have errored.
Cheers
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.t
> If you hash 4 composite keys, let's say ('A','B','C'), ('A','D','C'),
> ('A','E','X'), ('A','R','X'), you have only 4 hashes or you have more?
Four
> If it's 4, how come you are able to range query for example between
> start_column=('A', 'D') and end_column=('A','E') and get this column
> ('
I'm not sure about your needs, but the simplest thing to consider is snapshotting
and copying off-node.
Cheers
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
On 1/06/2012, at 12:23 AM, Shubham Srivastava wrote:
> I need to archive my Cassandra data i
If you want to do arbitrarily complex online / realtime queries look at DataStax
Enterprise, or https://github.com/tjake/Solandra or straight Solr.
Alternatively denormalise the model to materialise the results when you insert
so your query is a straight lookup. Or do some client-side filtering /
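The write-time denormalisation idea can be sketched in miniature. Plain Python dicts stand in for column families here, and every name is made up; the point is that each supported query gets its own materialised row, maintained at insert time:

```python
users_by_id = {}
users_by_country = {}   # materialised "view": country -> list of user ids

def insert_user(user_id, name, country):
    # One logical insert writes both the primary row and the view row.
    users_by_id[user_id] = {"name": name, "country": country}
    users_by_country.setdefault(country, []).append(user_id)

def users_in_country(country):
    # The query is now a single lookup; no filtering at read time.
    return users_by_country.get(country, [])

insert_user(1, "Ana", "PT")
insert_user(2, "Bob", "NZ")
insert_user(3, "Eve", "PT")
```

The trade-off is extra writes and storage in exchange for reads that never scan or filter.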
So this happened to me again, but it was only when the cluster had a node down
for a while. Then the commit logs started piling up past the limit I set in
the config file, and filled the drive.
After the node recovered and hints had replayed the space was never reclaimed.
A flush or drain did
But I think it's a bad idea, since hot data will be evenly distributed
between multiple sstables and filesystem pages.
On Thu, May 31, 2012 at 1:08 PM, crypto five wrote:
> You may also consider disabling key/row cache at all.
> 1mm rows * 400 bytes = 400MB of data, can easily be in fs cache, and
You may also consider disabling the key/row cache altogether.
1mm rows * 400 bytes = 400MB of data, which can easily sit in the fs cache, and
you will access your hot keys at thousands of qps without hitting disk at all.
Enabling compression can make the situation even better.
On Thu, May 31, 2012 at 12:01 PM, Gurpree
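A quick check of the arithmetic above, assuming the quoted figures (note 400 MB here is decimal; in binary units it is roughly 381 MiB, still well within page cache on a typical server):

```python
rows = 1000 * 1000            # "1mm" rows
row_size = 400                # bytes per row
working_set_bytes = rows * row_size            # 400,000,000 bytes
working_set_mib = working_set_bytes / (1024.0 * 1024.0)   # ~381 MiB
```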
Aaron,
Thanks for your email. The test kinda resembles how the actual application
will be.
It is going to be a simple key-value store with 500 million keys per node.
The traffic will be read heavy in steady state, and there will be some keys
that will have a lot more traffic than others. The expect
Hi,
I am new to Cassandra.
I have started a Cassandra instance (Cassandra.bat), played with it for a
while, created a keyspace Zodiac.
When I killed the Cassandra instance and restarted it, the keyspace was gone,
but when I tried to recreate it,
I got an 'org.apache.thrift.transport.TTransportException' error
Thanks Aaron.
I might use LOCAL_QUORUM to avoid the waiting on the ack from DC2.
Another question: after I set up a new node with token +1 in a new DC and
updated a CF with RF {DC1:2, DC2:1}, when I update a column on one node in
DC1, it's also updated in the new node in DC2. But all the other r
Thanks a lot Aaron for the very fast response!
I have increased the CassandraThriftSocketTimeout from 5000 to 9000. Is
this a reasonable setting?
configurator.setCassandraThriftSocketTimeout(9000);
Cheers,
Christof
2012/5/31 aaron morton
> There are two types of timeouts. The thrift TimedOutEx
Hi guys,
We're running a three-node cluster of Cassandra 1.1 servers, originally
1.0.7, and immediately after the upgrade the error logs of all three servers
began filling up with the following message:
ERROR [ReplicateOnWriteStage:177] 2012-05-31 08:17:02,236
CounterContext.java (line 381) invali
Let me elaborate a bit.
two node cluster
node1 has token 0
node2 has token 85070591730234615865843651857942052864
node1 goes down permanently.
do a nodetool move 0 on node2.
Monitoring with nodetool ring... it seems to be stuck in the Moving state forever.
From: Poziombka, Wade L
Sent: Tuesday, May 29, 2012 4:29 P
But sorry, I don't understand.
If you hash 4 composite keys, let's say
('A','B','C'), ('A','D','C'), ('A','E','X'), ('A','R','X'), do you have only 4
hashes or more?
If it's 4, how come you are able to range query for example between
start_column=('A', 'D') and end_column=('A','E') and get th
We want to use Cassandra to store complex data, but we can't figure out how to
organize indexes.
Our table (column family) looks like this:
Users = { RandomId int, Firstname varchar, Lastname varchar, Age int, Country
int, ChildCount int }
In our queries we have mandatory fields (Firstname,Lastn
It is hashed once.
To the partitioner it's just some bytes. Other parts of the code care about its
structure.
Cheers
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
On 31/05/2012, at 7:00 PM, Cyril Auburtin wrote:
> Thx for the answer
> 1 more th
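Aaron's point above — a composite key is hashed exactly once, while range queries over composite columns rely on sort order rather than hashing — can be sketched like this. The byte packing below is a stand-in, not Cassandra's real composite encoding:

```python
import hashlib

def partition_hash(parts):
    # The partitioner sees the serialized composite as opaque bytes and
    # hashes it exactly once (RandomPartitioner uses MD5).
    packed = b"\x00".join(p.encode("utf-8") for p in parts)
    return hashlib.md5(packed).hexdigest()

keys = [("A", "B", "C"), ("A", "D", "C"), ("A", "E", "X"), ("A", "R", "X")]
hashes = {partition_hash(k) for k in keys}   # four keys -> four hashes

# Column slices on composites work by sorted comparison, not hashing;
# Python tuple ordering mimics the composite comparator here.
start, end = ("A", "D"), ("A", "F")
in_range = [k for k in sorted(keys) if start <= k < end]
```

(Cassandra's slice ends are inclusive, which is why a real slice from ('A','D') to ('A','E') still returns ('A','E',...) columns; the half-open end above is just Python convenience.)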
> You can set the gc_grace_secs as a little value and force major compaction
> after the row is expired. After then please check whether the row still
> exists.
There are some downsides to major compactions. (There have been some recent
discussions).
You can provoke (some) minor compactions by
There are two types of timeouts. The thrift TimedOutException occurs when the
coordinator times out waiting for the CL level nodes to respond. The error is
transmitted back to the client and raised.
The second is a client-side socket timeout waiting for the coordinator to respond.
See the Cassandra
Agree.
Just happy to see people upgrade to something 1.X
A
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
On 31/05/2012, at 8:24 AM, Rob Coli wrote:
> On Tue, May 29, 2012 at 10:29 PM, Pierre Chalamet wrote:
>> You'd better use version 1.0.9 (using
> Could you provide some guide on how to assign the tokens in this growing
> deployment phases?
background
http://www.datastax.com/docs/1.0/install/cluster_init#calculating-tokens-for-a-multi-data-center-cluster
Start with tokens for a 4 node cluster. Add the next 4 between each of
t
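The token arithmetic behind that doc can be sketched as follows. The offset-per-DC convention follows the DataStax guide linked above; the function name and offset value are illustrative:

```python
RING = 2 ** 127   # RandomPartitioner token space

def balanced_tokens(n, offset=0):
    # Evenly spaced initial tokens; a second data center typically uses
    # a small offset so its tokens don't collide with the first DC's.
    return [(i * RING // n + offset) % RING for i in range(n)]

dc1 = balanced_tokens(4)
dc2 = balanced_tokens(4, offset=1)
# Doubling the cluster: the 8-node tokens interleave with the 4-node
# ones, so existing nodes keep their tokens and new nodes slot between.
grown = balanced_tokens(8)
```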
Not directly.
* stop the cluster
* rename the /var/lib/cassandra/data/mykeyspace directory
* start the cluster
* create the keyspace with new name
* drop the keyspace with the old name
Cheers
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
On 30/05/
> -Is there any other way to extract the content of SSTables, writing a
> Java program for example instead of using sstable2json?
Look at the code in sstable2json and copy it :)
> -I tried to get tombstones using the thrift API, but it seems to be not
> possible, is that right? When I try, the program thro
Hi,
yes, the work can be split between different mappers, but each one will process
one row at a time. In fact, the method
> public void map(ByteBuffer key, SortedMap columns,
> Context context)
processes 1 row, with the specified ByteBuffer key and the list of columns
SortedMap columns.
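The one-row-per-map-call semantics can be mimicked outside Hadoop. This plain Python sketch simplifies the scenario to a plain-text username column rather than BSON, with made-up names:

```python
def map_row(key, columns, target_user, sink):
    # Called once per row, as the Hadoop framework would call map():
    # it sees only this row's key and columns.
    if columns.get("username") == target_user:
        sink.append(key)

rows = {
    "sess-1": {"username": "alice", "data": "..."},
    "sess-2": {"username": "bob",   "data": "..."},
    "sess-3": {"username": "alice", "data": "..."},
}
matches = []
for key, columns in rows.items():
    map_row(key, columns, "alice", matches)
```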
Hi,
I'm working on some use cases to understand how cassandra-hadoop
integration works.
I have a very basic scenario: I have a column family that keeps the session
id and some bson data that contains the username in two separate columns. I
want to go through all rows and dump the row to a file wh
Thx for the answer
One more thing: a composite key is not hashed only once, I guess?
Is it hashed once per part the composite has?
So this means there are twice or three times (or more) as many keys as for
normal column keys, is that true?
On 31 May 2012 at 02:59, "aaron morton" wrote:
> Composite Columns com