Re: Consistency Level One Question

2014-02-20 Thread graham sanderson
Note also that when reading at ONE there will be no read repair, since the coordinator does not know that another replica has stale data (remember, at ONE basically only one node is asked for the answer). In practice for our use cases, we always write at LOCAL_QUORUM (failing the whole update if th

Re: Consistency Level One Question

2014-02-20 Thread graham sanderson
Writing at a consistency level of ONE means that your write will be acknowledged as soon as one replica confirms that it has made the write to memtable and the commit log (might not be quite synced to disk, but that’s a separate issue). All the writes are submitted in parallel, so it is very pos
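
One rough way to reason about the question in this thread is the usual overlap rule: a read is only guaranteed to contact a replica that acknowledged the write when (replicas acknowledging the write) + (replicas read) is greater than the replication factor. The sketch below is plain arithmetic under that assumption; the function names are illustrative and are not part of any Cassandra API.

```python
def quorum(replication_factor: int) -> int:
    """Majority of replicas, e.g. 2 out of 3."""
    return replication_factor // 2 + 1


def replicas_must_overlap(replication_factor: int, write_acks: int, read_replicas: int) -> bool:
    """True when every read set must intersect every acknowledged write set."""
    return write_acks + read_replicas > replication_factor


RF = 3  # replication factor from the question in this thread
ONE = 1
QUORUM = quorum(RF)

for name, w, r in [
    ("write ONE, read ONE", ONE, ONE),
    ("write QUORUM, read QUORUM", QUORUM, QUORUM),
    ("write ALL, read ONE", RF, ONE),
]:
    print(f"{name}: overlap guaranteed = {replicas_must_overlap(RF, w, r)}")
```

With a replication factor of 3, writing and reading at ONE gives 1 + 1 = 2, which is not greater than 3, so a read shortly after the write may land on a replica that has not received it yet.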

Consistency Level One Question

2014-02-20 Thread Drew Kutcharian
Hi Guys, I wanted to get some clarification on what happens when you write and read at consistency level 1. Say I have a keyspace with replication factor of 3 and a table which will contain write-once/read-only wide rows. If I write at consistency level 1 and the write happens on node A and I r

Re: paging state will not work

2014-02-20 Thread Katsutoshi
Thank you for the reply. Added: https://issues.apache.org/jira/browse/CASSANDRA-6748 Katsutoshi 2014-02-21 2:14 GMT+09:00 Sylvain Lebresne : > That does sound like a bug. Would you mind opening a JIRA ( > https://issues.apache.org/jira/browse/CASSANDRA) ticket for it? > > > On Thu, Feb 20, 2014

Re: Performance problem with large wide row inserts using CQL

2014-02-20 Thread Edward Capriolo
Hopefully in 3 years no one will be calling your schema 'legacy' and 'not suggested' like they do with mine. On Thursday, February 20, 2014, Laing, Michael wrote: > Just to add my 2 cents... > We are very happy CQL users, running in production. > I have had no problems modeling whatever I have ne

Re: Performance problem with large wide row inserts using CQL

2014-02-20 Thread Laing, Michael
Just to add my 2 cents... We are very happy CQL users, running in production. I have had no problems modeling whatever I have needed to, including problems similar to the examples set forth previously, in CQL. Personally I think it is an excellent improvement to Cassandra, and we have no intenti

Re: Performance problem with large wide row inserts using CQL

2014-02-20 Thread Edward Capriolo
On Thursday, February 20, 2014, Robert Coli wrote: > On Thu, Feb 20, 2014 at 9:12 AM, Sylvain Lebresne wrote: >> >> Of course, if everyone was using that reasoning, no-one would ever test new features and report problems/suggest improvement. So thanks to anyone like Rüdiger that actually tries st

Re: Performance problem with large wide row inserts using CQL

2014-02-20 Thread Peter Lin
Yeah. Slowly NoSQL products are adding schema :) At least Cassandra is ahead of the curve. Sent from my iPhone > On Feb 20, 2014, at 7:37 PM, Edward Capriolo wrote: > > Recommendations in Cassandra have a shelf life of about 1 to 2 years. If you > try to assert a recommendation from a year ago y

Re: Performance problem with large wide row inserts using CQL

2014-02-20 Thread Mohit Anchlia
On Thu, Feb 20, 2014 at 4:37 PM, Edward Capriolo wrote: > Recommendations in Cassandra have a shelf life of about 1 to 2 years. If > you try to assert a recommendation from a year ago you stand a solid chance of > someone telling you there is now a better way. > > Cassandra once loved being a schemale

Re: Performance problem with large wide row inserts using CQL

2014-02-20 Thread Edward Capriolo
Recommendations in Cassandra have a shelf life of about 1 to 2 years. If you try to assert a recommendation from a year ago you stand a solid chance of someone telling you there is now a better way. Cassandra once loved being a schemaless datastore. Imagine that? On Thursday, February 20, 2014, Pete

Re: Performance problem with large wide row inserts using CQL

2014-02-20 Thread Robert Coli
On Thu, Feb 20, 2014 at 9:12 AM, Sylvain Lebresne wrote: > Of course, if everyone was using that reasoning, no-one would ever test > new features and report problems/suggest improvement. So thanks to anyone > like Rüdiger that actually tries stuff and take the time to report problems > when they t

Re: Performance problem with large wide row inserts using CQL

2014-02-20 Thread Edward Capriolo
Just read this. I did not mean to offend or start a debate. Generally when people ask me for help I give them the simplest option I know that works. It pains me to watch new users struggling with incompatible drivers and bugs. On Thursday, February 20, 2014, Sylvain Lebresne wrote: > On Thu, Feb

Re: Performance problem with large wide row inserts using CQL

2014-02-20 Thread Peter Lin
good example Ed. I'm so happy to see other people doing things like this. Even if the official DataStax docs recommend don't mix static and dynamic, to me that's a huge disservice to Cassandra users. If someone really wants to stick to relational model, then NewSql is a better fit, plus gives use

Re: Performance problem with large wide row inserts using CQL

2014-02-20 Thread Edward Capriolo
CASSANDRA-6561 is interesting. Though having statically defined columns is not exactly a solution to do everything in "thrift". http://planetcassandra.org/blog/post/poking-around-with-an-idea-ranged-metadata/ Before collections or CQL existed I implemented some of these concepts myself. Say you have a

Re: Performance problem with large wide row inserts using CQL

2014-02-20 Thread Peter Lin
Thanks Erick. Hopefully Sylvain will forgive me for misquoting him. My goal was to share knowledge and get people thinking about how best to use both thrift and CQL. Whenever I hear people say "cql is the future" I get annoyed. My biased feeling is they complement each other very well and users shou

Re: Performance problem with large wide row inserts using CQL

2014-02-20 Thread Erick Ramirez
Wow! What a fantastic, robust discussion. I've just been educated. Peter --- Thanks for providing those use cases. They are great examples. Rüdiger --- From what you've done so far, I wouldn't have said you are new to Cassandra. Well done. Cheers, Erick

Re: Performance problem with large wide row inserts using CQL

2014-02-20 Thread DuyHai Doan
Rüdiger "SortedMap>" When using a RandomPartitioner or Murmur3Partitioner, the outer map is a simple Map, not SortedMap. The only case you have a SortedMap for row key is when using OrderPreservingPartitioner, which is clearly not advised for most cases because of hot spots in the cluster.
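
The point about the outer map can be illustrated with a small Python sketch. It assumes an MD5-based token, in the spirit of RandomPartitioner, and uses made-up row keys; `md5_token` is a hypothetical helper and does not reproduce Cassandra's exact token computation.

```python
import hashlib


def md5_token(row_key: str) -> int:
    """Illustrative token: the MD5 digest of the key read as a big integer."""
    digest = hashlib.md5(row_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


# Hypothetical row keys, only for illustration.
row_keys = [f"sensor-{i:03d}" for i in range(8)]

key_order = sorted(row_keys)                    # what an order-preserving partitioner would give
token_order = sorted(row_keys, key=md5_token)   # what a hashing partitioner iterates in

print("by key:  ", key_order)
print("by token:", token_order)
```

The two printed lists contain the same keys, but the token order is effectively arbitrary with respect to the keys themselves. That is why row keys behave like a plain map under a hashing partitioner, while the columns inside a row remain sorted.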

Re: Performance problem with large wide row inserts using CQL

2014-02-20 Thread Rüdiger Klaehn
Hi Sylvain, I applied the patch to the cassandra-2.0 branch (this required some manual work since I could not figure out which commit it was supposed to apply for, and it did not apply to the head of cassandra-2.0). The benchmark now runs in pretty much identical time to the thrift based benchmar

Re: Performance problem with large wide row inserts using CQL

2014-02-20 Thread Peter Lin
Hi Ed, you're definitely not mad. I've seen this all over the place. We have several large retail customers and they all suffer the EAV horror. Having built EAV horrors in the past and guilty of inflicting that pain on people, mixing static and dynamic is "Totally Freaking awesome!" I know many

Re: Performance problem with large wide row inserts using CQL

2014-02-20 Thread Edward Capriolo
Peter, I must meet you and shake your hand. I was actually having a debate about a week back with a number of people who claimed there was "no reason to mix static and dynamic". We do it all the time. I am glad someone else besides me "gets it" and I am not totally mad. Ed On Thu, Feb 20, 2014 at 3

[BETA RELEASE] Apache Cassandra 2.1.0-beta1 released

2014-02-20 Thread Sylvain Lebresne
The Cassandra team is pleased to announce the release of the first beta for the future Apache Cassandra 2.1.0. Let me first stress that this is beta software and as such is *not* ready for production use. The goal of this release is to give a preview of what will become Cassandra 2.1 and to get w

Re: Performance problem with large wide row inserts using CQL

2014-02-20 Thread DuyHai Doan
OK, I see what you mean, Peter. After reading CASSANDRA-6561 the use case is pretty clear. On Thu, Feb 20, 2014 at 9:26 PM, Peter Lin wrote: > > Hi Duyhai, > > yes, I am talking about mixing static and dynamic columns in a single > column fam

Re: Performance problem with large wide row inserts using CQL

2014-02-20 Thread Peter Lin
Hi Duyhai, yes, I am talking about mixing static and dynamic columns in a single column family. Let me give you an example from retail. Say you're amazon and you sell over 10K different products. How do you store all those products with all the different properties like color, size, dimensions, e

Exception while iterating over large data

2014-02-20 Thread ankit tyagi
Hello guys, I was going through http://www.datastax.com/dev/blog/client-side-improvements-in-cassandra-2-0, and it mentions that pagination is taken care of automatically. I am using the code below to iterate over large data for a particular primary key. Statement stmt = new SimpleStatement("SELECT

How do you remote backup your cassandra nodes ?

2014-02-20 Thread user 01
What is your strategy/tool set to back up your Cassandra nodes, apart from cluster replication/snapshots within the cluster?

Re: Performance problem with large wide row inserts using CQL

2014-02-20 Thread DuyHai Doan
"Developers can use what ever type they want for the name or value in a dynamic column and the framework will handle it appropriately." What do you mean by "dynamic" column ? If you want to be able to insert an arbitrary number of columns in one physical row, CQL3 clustering is there and does pre

Re: Performance problem with large wide row inserts using CQL

2014-02-20 Thread Peter Lin
my apologies Sylvain, I didn't mean to misquote you. I still feel that even if someone is only going to use CQL, it is "worth it" to learn thrift. In the interest of discussion, I looked at both jira tickets and I don't see how that makes it so a developer can specify the name and value type for a

Re: Performance problem with large wide row inserts using CQL

2014-02-20 Thread Edward Capriolo
The only thing you really can not do: CQL3 loses some of the concept of CQL2 metadata, namely the default validation and then column specific validation. In cassandra-cql we can say (butchering the syntax) create column family x DEFAULT_VALIDATOR = UTF8Type columns named y are int columns named z

C-driver to be used with nginx?

2014-02-20 Thread Jan Algermissen
Hi, does anyone know of a C driver that can be / has been used with nginx? I am afraid that the C++ driver's[1] threading and connection pooling approach interferes with nginx's threading model. Does anyone have any ideas? Jan [1] https://github.com/datastax/cpp-driver

Re: paging state will not work

2014-02-20 Thread Edward Capriolo
Cassandra has no null. So in this context setting a column to null or updating null is a delete. I think. I remember debating the semantics of null once. On Tuesday, February 18, 2014, Katsutoshi wrote: > Hi. > > I am using Cassandra 2.0.5 version. If null is explicitly set to a column, paging_st

Re: Performance problem with large wide row inserts using CQL

2014-02-20 Thread Benedict Elliott Smith
> > Cassandra will throw an exception indicating the type is different than > the default type. If you want untyped data, store blobs. Or store in a different column (they're free when empty, after all). Type safety is considered a good thing by many. On 20 February 2014 17:26, Peter Lin wrote

Re: Performance problem with large wide row inserts using CQL

2014-02-20 Thread Sylvain Lebresne
On Thu, Feb 20, 2014 at 6:26 PM, Peter Lin wrote: > > I disagree with the sentiment that "thrift is not worth the trouble". > Way to quote only part of my sentence and get mental on it. My full sentence was "it's probably not worth the trouble to start with thrift if you're gonna use CQL later".

Re: Performance problem with large wide row inserts using CQL

2014-02-20 Thread Mohit Anchlia
+1 I like the Hector client, which uses the thrift interface and exposes APIs that are similar to how Cassandra physically stores the values. On Thu, Feb 20, 2014 at 9:26 AM, Peter Lin wrote: > > I disagree with the sentiment that "thrift is not worth the trouble". > > CQL and all SQL inspired dialects li

Re: Performance problem with large wide row inserts using CQL

2014-02-20 Thread Peter Lin
I disagree with the sentiment that "thrift is not worth the trouble". CQL and all SQL inspired dialects limit one's ability to use arbitrarily typed data in dynamic columns. With thrift it's easy and straightforward. With CQL there is no way to tell Cassandra the type of the name and value for a dy

Re: paging state will not work

2014-02-20 Thread Sylvain Lebresne
That does sound like a bug. Would you mind opening a JIRA ( https://issues.apache.org/jira/browse/CASSANDRA) ticket for it? On Thu, Feb 20, 2014 at 3:06 PM, Edward Capriolo wrote: > I would try a fetch size other than 1. Cassandra's slices are start > inclusive so maybe that is a bug. > > > On Tu

Re: Performance problem with large wide row inserts using CQL

2014-02-20 Thread Sylvain Lebresne
On Thu, Feb 20, 2014 at 2:16 PM, Edward Capriolo wrote: > For what it is worth your schema is simple and uses compact storage. Thus > you really don't need anything in Cassandra 2.0 as far as I can tell. You > might be happier with a stable release like 1.2.something and just hector > or astyanax. Y

Re: Intermittent long application pauses on nodes

2014-02-20 Thread Joel Samuelsson
Hi Frank, We got a (quite) long GC pause today on 2.0.5: INFO [ScheduledTasks:1] 2014-02-20 13:51:14,528 GCInspector.java (line 116) GC for ParNew: 1627 ms for 1 collections, 425562984 used; max is 4253024256 INFO [ScheduledTasks:1] 2014-02-20 13:51:14,542 GCInspector.java (line 116) GC for Conc
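
To put pauses like the one quoted in this message in context, the GCInspector line can be pulled apart with a regular expression. This is a minimal sketch that assumes the log format is exactly as it appears in the message above; `parse_gc_line` and the field names are hypothetical helpers, not part of Cassandra.

```python
import re

# Pattern inferred from the log line quoted in the message; an assumption, not a spec.
GC_LINE = re.compile(
    r"GC for (?P<collector>\w+): (?P<pause_ms>\d+) ms for (?P<collections>\d+) collections, "
    r"(?P<used>\d+) used; max is (?P<max>\d+)"
)


def parse_gc_line(line: str):
    """Return the GC fields as a dict, or None if the line does not match."""
    match = GC_LINE.search(line)
    if match is None:
        return None
    fields = {"collector": match.group("collector")}
    for name in ("pause_ms", "collections", "used", "max"):
        fields[name] = int(match.group(name))
    # Fraction of the maximum heap in use when the line was logged.
    fields["heap_fraction"] = fields["used"] / fields["max"]
    return fields


sample = (
    "INFO [ScheduledTasks:1] 2014-02-20 13:51:14,528 GCInspector.java (line 116) "
    "GC for ParNew: 1627 ms for 1 collections, 425562984 used; max is 4253024256"
)
parsed = parse_gc_line(sample)
print(parsed)
```

For the ParNew line above this gives a 1627 ms pause with roughly 10% of the maximum heap in use, which suggests the pause was not caused by a nearly full heap.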

Re: paging state will not work

2014-02-20 Thread Edward Capriolo
I would try a fetch size other than 1. Cassandra's slices are start inclusive so maybe that is a bug. On Tuesday, February 18, 2014, Katsutoshi wrote: > Hi. > > I am using Cassandra 2.0.5 version. If null is explicitly set to a column, paging_state will not work. My test procedure is as follows: >

Re: High CPU load on one node in the cluster

2014-02-20 Thread Edward Capriolo
Upgrade from 2.0.3. There are several bugs, On Wednesday, February 19, 2014, Yogi Nerella wrote: > You should start your Cassandra daemon with -verbose:gc (please check syntax) and then run it in foreground, as Cassandra closes the standard out) > Please see other emails in this forum for getting

Re: Performance problem with large wide row inserts using CQL

2014-02-20 Thread Edward Capriolo
Don't worry, there will be plenty of time to upgrade to 2.0 or 2.1 later. It is an easy upgrade path and you will likely do it 2-4 times a year. Don't choose the latest and greatest now thinking that you are future proofing. In reality you are volunteering as a beta tester. On Thursday, February 20, 2014

Re: Performance problem with large wide row inserts using CQL

2014-02-20 Thread Edward Capriolo
For what it is worth your schema is simple and uses compact storage. Thus you really don't need anything in Cassandra 2.0 as far as I can tell. You might be happier with a stable release like 1.2.something and just hector or astyanax. You are really dealing with many issues you should not have to jus