RE: RE: Cassandra tombstones being created by updating rows with TTL's

2015-04-23 Thread Walsh, Stephen
Thanks Anij, You are correct in your understanding of our setup. However, when we set the gc to 10 seconds it manages our tombstone count; any higher than 10 seconds and we start getting tombstone warnings. I think you're right: when I set the gc_grace to 0, I don’t believe the compaction kicked in q

What is 'Read Reuqests' on OpsCenter exaclty?

2015-04-23 Thread Bongseo Jang
I have Cassandra 2.1 + OpsCenter 5.1.1 and am testing them. When I monitored the OpsCenter 'read requests' graph, the number on the graph was not what I expected, namely the number of client requests or responses. I recorded the actual number of client requests and compared it with the graph, then found they'r

Re: timeout creating table

2015-04-23 Thread Jimmy Lin
Also I am not sure it matters, but I just realized the keyspace created has replication factor of 2 when my Cassandra is really just a single node. Is Cassandra smart enough to ignore the RF of 2 and work with only 1 single node? On Mon, Apr 20, 2015 at 8:23 PM, Jimmy Lin wrote: > hi, > there

Re: What is 'Read Reuqests' on OpsCenter exaclty?

2015-04-23 Thread Carlos Rolo
Probably it takes into account the read repair, plus reads that have consistency != 1 will produce reads on other machines (which are also taken into account). I don't know the internals of OpsCenter but I would assume that this is the case. If you want to test it further, disable read_repair, and make al

Creating 'Put' requests

2015-04-23 Thread Matthew Johnson
Hi all, Currently looking at switching from HBase to Cassandra, and one big difference so far is that in HBase, we create a ‘Put’ object, add to it a set of column/value pairs, and send the Put to the server. So far in Cassandra 2.1.4 the tutorials seem to suggest using CQL3, which I really like

Re: Creating 'Put' requests

2015-04-23 Thread Jim Witschey
Are prepared statements what you're looking for? http://docs.datastax.com/en/developer/java-driver/2.1/java-driver/quick_start/qsSimpleClientBoundStatements_t.html Jim Witschey Software Engineer in Test | jim.witsc...@datastax.com On Thu, Apr 23, 2015 at 9:28 AM, Matthew Johnson wrote: > Hi

RE: Creating 'Put' requests

2015-04-23 Thread Matthew Johnson
Hi Jim, This would still involve either having a fixed(ish) schema, with a handful of pre-written prepared statements that I fill the values into, or some rather horrific StringBuilder that generates the statement based on some logic. Prepared Statements work great, for example, for inserting user
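
The "StringBuilder" route mentioned above can stay fairly tame if it only emits bind markers and leaves the values to a prepared statement. The following is a minimal sketch under that assumption, not driver code: `InsertBuilder`, `insertCql` and `bindValues` are hypothetical helper names, the identifier check is a deliberately strict illustration (identifiers cannot be bound, so they are validated instead), and the columns are assumed to exist in the table already.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical helper: builds a parameterized INSERT from a column/value map. */
final class InsertBuilder {

    /** Returns e.g. "INSERT INTO t (a, b) VALUES (?, ?)" in map iteration order. */
    static String insertCql(String table, Map<String, Object> columns) {
        if (columns.isEmpty()) {
            throw new IllegalArgumentException("at least one column is required");
        }
        StringJoiner names = new StringJoiner(", ");
        StringJoiner markers = new StringJoiner(", ");
        for (String name : columns.keySet()) {
            names.add(requireIdentifier(name));
            markers.add("?"); // values are bound later, never concatenated
        }
        return "INSERT INTO " + requireIdentifier(table)
                + " (" + names + ") VALUES (" + markers + ")";
    }

    /** Values in the same order as the bind markers produced by insertCql. */
    static List<Object> bindValues(Map<String, Object> columns) {
        return new ArrayList<>(columns.values());
    }

    /** Rejects anything that is not a plain unquoted identifier. */
    private static String requireIdentifier(String s) {
        if (!s.matches("[A-Za-z][A-Za-z0-9_]*")) {
            throw new IllegalArgumentException("not a simple identifier: " + s);
        }
        return s;
    }
}
```

Passing a `LinkedHashMap` keeps the column order stable, so the generated statement text can be prepared once per distinct column set and reused.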

Re: timeout creating table

2015-04-23 Thread Sebastian Estevez
That is a problem, you should not have RF > N. Do an alter table to fix it. This will affect your reads and writes if you're doing anything > CL 1 --> timeouts. On Apr 23, 2015 4:35 AM, "Jimmy Lin" wrote: > Also I am not sure it matters, but I just realized the keyspace created > has replicatio
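
As a rough illustration of why RF > N hurts at anything above CL ONE: QUORUM needs floor(RF / 2) + 1 replicas to respond, so with RF = 2 on a single node a QUORUM request can never gather enough replicas. The sketch below only does that arithmetic; `ReplicaMath` and its methods are made-up names for illustration and are not part of Cassandra or any driver.

```java
/** Illustrative arithmetic only; not a Cassandra or driver API. */
final class ReplicaMath {

    /** Replicas that must respond for QUORUM: floor(RF / 2) + 1. */
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    /** True when enough replicas can exist on live nodes to meet the requirement. */
    static boolean satisfiable(int requiredReplicas, int liveNodes, int replicationFactor) {
        int reachableReplicas = Math.min(liveNodes, replicationFactor);
        return requiredReplicas <= reachableReplicas;
    }

    public static void main(String[] args) {
        int rf = 2;
        int nodes = 1;
        System.out.println("QUORUM needs " + quorum(rf) + " replicas");
        System.out.println("CL ONE satisfiable: " + satisfiable(1, nodes, rf));
        System.out.println("QUORUM satisfiable: " + satisfiable(quorum(rf), nodes, rf));
    }
}
```

Since replication is a keyspace property, the change that lowers RF to match the node count is made with ALTER KEYSPACE rather than on an individual table.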

Data model suggestions

2015-04-23 Thread Ali Akhtar
Hey all, We are working on moving a mysql based application to Cassandra. The workflow in mysql is this: We have two tables: active and archive. Every hour, we pull in data from an external API. The records which are active are kept in the 'active' table. Once a record is no longer active, its dele

Re: What is 'Read Reuqests' on OpsCenter exaclty?

2015-04-23 Thread Sebastian Estevez
Carlos is right: *Read Requests* - The number of read requests per second on the coordinator nodes, analogous to client reads. Monitoring the number of requests over a given time period reveals system read workload and usage patterns. *Avg* - The average of values recorded during a time interval.

RE: Creating 'Put' requests

2015-04-23 Thread Matthew Johnson
Hi Jim, I think I have found what I was looking for here: https://gist.github.com/yangzhe1991/10349122 I would end up with code that looks something like this: public void createSchema() { System.out.println("CREATING SCHEMA"); Cre

Re: Creating 'Put' requests

2015-04-23 Thread Alex Popescu
On Thu, Apr 23, 2015 at 8:50 AM, Matthew Johnson wrote: > Unfortunately it seems that I was misinformed on the “dynamically creating > timeseries columns” feature, and that this WAS deprecated in CQL3 – in > order to dynamically create columns I would have to issue an ‘ALTER TABLE’ > statement fo

Re: Data model suggestions

2015-04-23 Thread Manoj Khangaonkar
Hi, How do you determine if the record is no longer active? Is it a periodic process that goes through every record and checks when the last update happened? regards On Thu, Apr 23, 2015 at 8:09 AM, Ali Akhtar wrote: > Hey all, > > We are working on moving a mysql based application to Cassa

Re: timeout creating table

2015-04-23 Thread Jimmy Lin
Well, I am pretty sure our CL is ONE, and the long pause seems to happen somewhat randomly. But do create keyspace or create table statements get different treatment in terms of CL that may explain the long pause? thanks On Thu, Apr 23, 2015 at 8:04 AM, Sebastian Estevez < sebastian.este...@datastax.com>

Re: Data model suggestions

2015-04-23 Thread Ali Akhtar
That's returned by the external API we're querying. We query them for active records; if a previously active record isn't included in the results, that means it's time to archive that record. On Thu, Apr 23, 2015 at 9:20 PM, Manoj Khangaonkar wrote: > Hi, > > How do you determine if the record is n
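
The rule described here amounts to a set difference between the ids that were active before the hourly pull and the ids the API returns now. A minimal sketch, assuming the previously active ids are held in memory; `ActiveDiff` and its method names are hypothetical and nothing here talks to Cassandra:

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper: works out which records changed state between two pulls. */
final class ActiveDiff {

    /** Ids that were active before but are absent from the latest API result. */
    static <T> Set<T> toArchive(Set<T> previouslyActive, Set<T> currentlyActive) {
        Set<T> gone = new HashSet<>(previouslyActive);
        gone.removeAll(currentlyActive);
        return gone;
    }

    /** Ids present in the latest API result that were not active before. */
    static <T> Set<T> newlyActive(Set<T> previouslyActive, Set<T> currentlyActive) {
        Set<T> added = new HashSet<>(currentlyActive);
        added.removeAll(previouslyActive);
        return added;
    }
}
```

Both methods copy their input before modifying it, so the caller's sets are left untouched and can be swapped in as the new baseline after each pull.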

RE: timeout creating table

2015-04-23 Thread Matthew Johnson
Hi Jimmy, I have very limited experience with Cassandra so far, but from following some tutorials to create keyspaces, create tables, and insert data, it definitely seems to me like creating keyspaces and tables is way slower than inserting data. Perhaps a more experienced user can confirm if th

Re: How much data is bootstrapping supposed to send?

2015-04-23 Thread Robert Coli
On Wed, Apr 22, 2015 at 11:57 PM, Dave Galbraith wrote: > So I was expecting the load to drop to about 6.5 MB on my original node > while the new node would pick up about 6.5 MB, so they'd be balanced, but > instead the disk usage on my original node somehow increased by 2.5 MB > while the new no

Adding New Node Issue

2015-04-23 Thread Thomas Miller
Hello, Yesterday we ran into a serious issue while joining a new node to our existing 4 node Cassandra cluster (version 2.0.7). The average node data size is 152 GB with a replication factor of 3. The node was prepped just like the following document describes - http://docs.datastax.com/en/ca

Re: Drawbacks of Major Compaction now that Automatic Tombstone Compaction Exists

2015-04-23 Thread Andrei Ivanov
Just in case it helps - we are running C* with sstable sizes of something like 2.5 TB and ~4TB/node. No evident problems except the time it takes to compact. Andrei. On Wed, Apr 22, 2015 at 5:36 PM, Anuj Wadehra wrote: > Thanks Robert!! > > The JIRA was very helpful in understanding how tombsto

Re: Adding New Node Issue

2015-04-23 Thread Jeff Ferland
Sounds to me like your stream throughput value is too high. `nodetool getstreamthroughput` and `nodetool setstreamthroughput` will read and update this value live. Limit it to something lower so that the system isn’t overloaded by streaming. The bottleneck that slows things down is most likely to be disk or

RE: Adding New Node Issue

2015-04-23 Thread Thomas Miller
Jeff, Thanks for the response. I had come across that as a possible solution previously but there are discrepancies that would lead me to think that that is not the issue. It appears our stream throughput is currently set to 200Mbps but unless the Cassandra service shares that same throughput

Re: Data model suggestions

2015-04-23 Thread Manoj Khangaonkar
Hi, If your external API returns active records, that means I am guessing you need to do a select * on the active table to figure out which records in the table are no longer active. You might be aware that range selects based on partition key will timeout in cassandra. They can however be made t

Re: Adding New Node Issue

2015-04-23 Thread Ali Akhtar
What version are you running? On Fri, Apr 24, 2015 at 12:51 AM, Thomas Miller wrote: > Jeff, > > > > Thanks for the response. I had come across that as a possible solution > previously but there are discrepancies that would lead me to think that > that is not the issue. > > > > It appears our st

Re: Data model suggestions

2015-04-23 Thread Ali Akhtar
Good point about the range selects. I think they can be made to work with limits, though. Or, since the active records will never usually be > 500k, the ids may just be cached in memory. Most of the time, during reads, the queries will just consist of select * where primaryKey = someValue . One ro

RE: Adding New Node Issue

2015-04-23 Thread Thomas Miller
Ali, Our Cassandra version is 2.0.7. Thanks, Thomas Miller From: Ali Akhtar [mailto:ali.rac...@gmail.com] Sent: Thursday, April 23, 2015 4:22 PM To: user@cassandra.apache.org Subject: Re: Adding New Node Issue What version are you running? On Fri, Apr 24, 2015 at 12:51 AM, Thomas Miller mailt

Re: Adding New Node Issue

2015-04-23 Thread Andrei Ivanov
Thomas, just in case you missed it there is a bug with throughput setting prior to 2.0.13, here is the link: https://issues.apache.org/jira/browse/CASSANDRA-8852 So, it may happen you are setting it to 1600 megabytes Andrei On Thu, Apr 23, 2015 at 11:22 PM, Ali Akhtar wrote: > What version are
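
For scale, megabits and megabytes differ by a factor of eight, which is where a figure like 1600 comes from when the configured value is 200: a value of 200 read as megabytes per second corresponds to 1600 megabits per second. The conversion below is plain arithmetic with hypothetical names; the JIRA ticket remains the authority on what the bug actually does.

```java
/** Illustrative unit conversion only; names are hypothetical. */
final class ThroughputUnits {
    static final int BITS_PER_BYTE = 8;

    /** Megabytes per second expressed as megabits per second. */
    static long megabytesToMegabits(long megabytesPerSecond) {
        return megabytesPerSecond * BITS_PER_BYTE;
    }

    /** Megabits per second expressed as megabytes per second. */
    static double megabitsToMegabytes(long megabitsPerSecond) {
        return megabitsPerSecond / (double) BITS_PER_BYTE;
    }
}
```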

RE: Adding New Node Issue

2015-04-23 Thread Thomas Miller
Andrei, I did not see that bug report. Thanks for the heads up on that. I am thinking that that is still not the issue though since if this were the case then I should be seeing higher than 200Mbps on that interface. I am able to see that the two streaming nodes never get over 200Mbps via my Za

Re: Adding New Node Issue

2015-04-23 Thread Andrei Ivanov
Thomas, From our experience, C* is almost degrading quite a bit when we bootstrap new nodes - no idea why, was never able to get any help or hints. And we never reach anywhere close to 200Mbps. Though we also see higher CPU usage. Actually, there is another way of adding nodes, I guess. Like start

Re: What is 'Read Reuqests' on OpsCenter exaclty?

2015-04-23 Thread Bongseo Jang
Thanks a lot Carlos, Sebastian :-) My test was with 1 node/1 replica settings, on which I assumed client request = read request on the graph. Because there seems to be no read_repair and CL is already ONE in my case, I need more explanation, don't I? Or can any other internals still be involved? Do you ha

Re: Drawbacks of Major Compaction now that Automatic Tombstone Compaction Exists

2015-04-23 Thread Anuj Wadehra
Great !!! Thanks Andrei !!! That's the answer I was looking for :) Thanks Anuj Wadehra Sent from Yahoo Mail on Android From:"Andrei Ivanov" Date:Thu, 23 Apr, 2015 at 11:57 pm Subject:Re: Drawbacks of Major Compaction now that Automatic Tombstone Compaction Exists Just in case it helps - we

Re: Data model suggestions

2015-04-23 Thread Narendra Sharma
I think one table, say 'record', should be good. The primary key is the record id. This will ensure good distribution. Just update the active attribute to true or false. For range queries on active vs archive records, maintain 2 indexes or try a secondary index. On Apr 23, 2015 1:32 PM, "Ali Akhtar" wrote: >