Re: batch_size_warn_threshold_in_kb

2014-12-11 Thread Shane Hansen
I don't know why 5kb was chosen. The general trend is that larger batches will put more stress on the coordinator node. The precise point at which things fall over will vary. On Thu, Dec 11, 2014 at 1:43 PM, Mohammed Guller wrote: > Hi – > > The cassandra.yaml file has property called *batch_

Re: Performance Difference between Batch Insert and Bulk Load

2014-12-04 Thread Shane Hansen
I'd be really interested to know what sort of performance or load improvements you see by doing client side partitioning. Please post back some results if you've tried that strategy. On Thu, Dec 4, 2014 at 11:46 AM, Tyler Hobbs wrote: > > On Thu, Dec 4, 2014 at 11:50 AM, Dong Dai wrote: > >> As
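
For readers unfamiliar with the client side partitioning mentioned above, here is a minimal sketch of the idea in plain Python: rows are grouped by partition key before batching, so that each batch touches a single partition. The function name `group_by_partition`, the `key_fn` callback, and the `max_rows` cap are hypothetical and chosen only for illustration; no driver calls are shown, and this is not necessarily what anyone in the thread implemented.

```python
from collections import defaultdict


def group_by_partition(rows, key_fn, max_rows=50):
    """Group rows by partition key, then split each group into small batches.

    rows     -- iterable of row objects (plain dicts in this sketch)
    key_fn   -- hypothetical callback returning the partition key of a row
    max_rows -- illustrative cap on rows per batch, not a Cassandra setting
    """
    groups = defaultdict(list)
    for row in rows:
        groups[key_fn(row)].append(row)

    batches = []
    for key, items in groups.items():
        # Slice each partition's rows into chunks of at most max_rows.
        for start in range(0, len(items), max_rows):
            batches.append((key, items[start:start + max_rows]))
    return batches


rows = [
    {"pk": "a", "v": 1},
    {"pk": "b", "v": 2},
    {"pk": "a", "v": 3},
]
batches = group_by_partition(rows, key_fn=lambda r: r["pk"], max_rows=1)
```

Each `(key, rows)` pair could then be sent as its own batch; whether that helps in practice is exactly the open question raised in the post.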

Re: What causes NoHostAvailableException, WriteTimeoutException, and UnavailableException?

2014-11-24 Thread Shane Hansen
Not sure if this is what you're looking for, but the API docs can be useful (I won't copy/paste the docs themselves). http://www.datastax.com/drivers/java/2.0/com/datastax/driver/core/exceptions/NoHostAvailableException.html http://www.datastax.com/drivers/java/2.0/com/datastax/driver/core/exceptions/

Re: Better option to load data to cassandra

2014-11-13 Thread Shane Hansen
So sstableloader is a CPU-efficient online method of loading data if you already have sstables. An option you may not have considered is just using batch inserts. It was a surprise to me coming from another database system, but C*'s primary use case is shoving data to an append-only log. Is there

Re: Exploring Simply Queueing

2014-10-06 Thread Shane Hansen
Sorry if I'm hijacking the conversation, but why in the world would you want to implement a queue on top of Cassandra? It seems like using a proper queuing service would make your life a lot easier. That being said, there might be a better way to play to the strengths of C*. Ideally everything you

Re: Increasing size of "Batch of prepared statements"

2014-10-03 Thread Shane Hansen
It appears to be configurable in cassandra.yaml using batch_size_warn_threshold https://issues.apache.org/jira/browse/CASSANDRA-6487 On Fri, Oct 3, 2014 at 10:47 AM, shahab wrote: > Hi, > > I am getting the following warning in the cassandra log: > " BatchStatement.java:258 - Batch of prepared

Re: Storage: upsert vs. delete + insert

2014-09-10 Thread Shane Hansen
My understanding is that an update is the same as an insert. So I would think delete+insert is a bad idea. Also insert+delete would put 2 entries in the commit log. On Sep 10, 2014 9:49 AM, "Michal Budzyn" wrote: > Is there any serious difference in the used disk and memory storage > between upser

Re: are dynamic columns supported at all in CQL 3?

2014-08-26 Thread Shane Hansen
Does this answer your question Ian? http://www.datastax.com/dev/blog/does-cql-support-dynamic-columns-wide-rows On Tue, Aug 26, 2014 at 1:12 PM, Ian Rose wrote: > Is it possible in CQL to create a table that supports dynamic column > names? I am using C* v2.0.9, which I assume implies CQL ver

Re: EC2 SSD cluster costs

2014-08-19 Thread Shane Hansen
Again, depends on your use case. But we wanted to keep the data per node below 500 GB, and we found RAIDed SSDs to be the best bang for the buck for our cluster. I think we moved from the i2 to the c3 because our bottleneck tended to be CPU utilization (from parsing requests). (Disclaimer, we're n

Re: too many open files

2014-08-08 Thread Shane Hansen
Are you using apache or Datastax cassandra? The datastax distribution ups the file handle limit to 10. That number's hard to exceed. On Fri, Aug 8, 2014 at 1:35 PM, Marcelo Elias Del Valle < marc...@s1mbi0se.com.br> wrote: > Hi, > > I am using Cassandra 2.0.9 running on Debian Wheezy, and

Re: Case Study from Migrating from RDBMS to Cassandra

2014-07-22 Thread Shane Hansen
There's lots of info on migrating from a relational database to Cassandra here: http://www.datastax.com/relational-database-to-nosql On Tue, Jul 22, 2014 at 7:45 PM, Surbhi Gupta wrote: > Hi, > > Does anybody has the case study for Migrating from RDBMS to Cassandra ? > > Thanks >

Re: Cassandra Scaling Alerts

2014-07-22 Thread Shane Hansen
I would look at load (disk space used) and system.compactions_in_progress. On Tue, Jul 22, 2014 at 3:49 PM, Arup Chakrabarti wrote: > We have been going through and setting up alerts on our Cassandra > clusters. We have catastrophic alerts setup to let us know when things are > super broken, b

Re: Easy diff of schema from dev->production

2014-07-08 Thread Shane Hansen
I'd suggest looking at the system keyspace. Like schema_columns On Jul 8, 2014 9:39 AM, "Kevin Burton" wrote: > Are there any easy/elegant ways to compare dev schema to production > schema. I want to find if there are any rows/columns we need to add. > > I could try to format the output and just
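
One way to act on the suggestion above, as a minimal sketch: once the column definitions from each cluster's system keyspace have been fetched by whatever means, the comparison itself is a set difference. The `(table, column, type)` tuple layout, the function name `diff_schema`, and the sample rows are assumptions made for illustration; how the rows are read from `schema_columns` is left out.

```python
def diff_schema(dev_columns, prod_columns):
    """Compare two collections of (table, column, type) tuples.

    The tuple layout is an assumption of this sketch; adapt it to whatever
    fields are actually read from the system keyspace.
    """
    dev, prod = set(dev_columns), set(prod_columns)
    return {
        # Present in dev but absent from production: candidates to add.
        "missing_in_prod": sorted(dev - prod),
        # Present in production but absent from dev.
        "missing_in_dev": sorted(prod - dev),
    }


# Hypothetical sample data standing in for rows read from each cluster.
dev_columns = [("users", "id", "uuid"), ("users", "email", "text")]
prod_columns = [("users", "id", "uuid")]

result = diff_schema(dev_columns, prod_columns)
```

A column whose type differs between the two environments shows up in both lists, once with each type, which makes type drift visible as well as missing columns.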