Re: Cassandra users survey

2015-10-20 Thread Jonathan Ellis
Thanks for all the responses! The results (minus suggestions and emails) are available here: https://docs.google.com/spreadsheets/d/1FegCArZgj2DNAjNkcXi1n2Y1Kfvf6cdZedkMPYQdvC0/edit?usp=sharing I've included charts on separate sheets for each question, but unfortunately I couldn't figure out how

Re: Cassandra users survey

2015-10-07 Thread Jonathan Ellis
I think what would be most useful would be to pick your largest cluster, and answer based on that. If you have multiple applications in the cluster, then the sum; otherwise, just one. On Thu, Oct 1, 2015 at 9:50 PM, Jim Ancona wrote: > Hi Jonathan, > > The survey asks about "your application."

Re: Cassandra users survey

2015-10-01 Thread Jim Ancona
Hi Jonathan, The survey asks about "your application." We have multiple applications using Cassandra. Are you looking for information about each application separately, or the sum of all of them? Jim On Wed, Sep 30, 2015 at 2:18 PM, Jonathan Ellis wrote: > With 3.0 approaching, the Apache Cass

Cassandra users survey

2015-09-30 Thread Jonathan Ellis
With 3.0 approaching, the Apache Cassandra team would appreciate your feedback as we work on the project roadmap for future releases. I've put together a brief survey here: https://docs.google.com/forms/d/1TEG0umQAmiH3RXjNYdzNrKoBCl1x7zurMroMzAFeG2Y/viewform?usp=send_form Please take a few minute

Re: Second Cassandra users survey

2011-12-06 Thread Matthias Pfau
It took some time to gather our requirements and to check what are our most important needs. However, here they are: * Column position range queries: We would like to access columns not by their name, but by their position in the row. Example: row("A":v1, "B":v2, "C":v3, "D":v4); ; ordered by

Re: Second Cassandra users survey

2011-11-28 Thread Aditya
Ability to mix counter columns & normal columns in the same column family. On Thu, Nov 17, 2011 at 6:46 PM, Boris Yen wrote: > I was wondering if it is possible to provide a function like "delete from > cf where column='value' " > > I think this should be useful for people who use secondary index a

Re: Second Cassandra users survey

2011-11-17 Thread Boris Yen
I was wondering if it is possible to provide a function like "delete from cf where column='value' " I think this should be useful for people who use secondary index a lot. On Nov 15, 2011 11:05 AM, "Edward Ribeiro" wrote: > > +1 on co-processors. > > > Edward

Re: Second Cassandra users survey

2011-11-14 Thread Edward Ribeiro
+1 on co-processors. Edward

Re: Second Cassandra users survey

2011-11-14 Thread Dean Hiller
Oh yeah, one more BIG one: in-memory writes with async write-behind to disk, like Cassandra does for speed. So if you have atomic locking, it writes to the primary node (memory) and some other node (memory) and returns success to the client. It then writes to disk asynchronously later. This prove to

Re: Second Cassandra users survey

2011-11-14 Thread Dean Hiller
+1 on coprocessors On Mon, Nov 14, 2011 at 6:51 PM, Mohit Anchlia wrote: > On Mon, Nov 14, 2011 at 4:44 PM, Jake Luciani wrote: > > Re Simpler "elasticity": > > Latest opscenter will now rebalance cluster optimally > > http://www.datastax.com/dev/blog/whats-new-in-opscenter-1-3 > > > > Doe

Re: Second Cassandra users survey

2011-11-14 Thread Mohit Anchlia
On Mon, Nov 14, 2011 at 4:44 PM, Jake Luciani wrote: > Re  Simpler "elasticity": > Latest opscenter will now rebalance cluster optimally > http://www.datastax.com/dev/blog/whats-new-in-opscenter-1-3 > Does it cause any impact on reads and writes while re-balance is in progress? How is it handled

Re: Second Cassandra users survey

2011-11-14 Thread Jake Luciani
Re Simpler "elasticity": Latest opscenter will now rebalance cluster optimally http://www.datastax.com/dev/blog/whats-new-in-opscenter-1-3 -Jake On Mon, Nov 14, 2011 at 7:27 PM, Chris Burroughs wrote: > - It would be super cool if all of that counter work made it possible > to support other

Re: Second Cassandra users survey

2011-11-14 Thread Chris Burroughs
- It would be super cool if all of that counter work made it possible to support other atomic data types (sets? CAS? just pass an associative/commutative function to apply). - Again with types, pluggable type-specific compression. - Wishy washy wish: Simpler "elasticity" I would like to go from 6-->8-->7

Re: Second Cassandra users survey

2011-11-11 Thread Edward Capriolo
It seems like you could use a composite key partitioner to accomplish this. On Monday, November 7, 2011, Daniel Doubleday wrote: > Allow for deterministic / manual sharding of rows. > > Right now it seems that there is no way to force rows with different row keys to be stored on the same nodes in

Re: Second Cassandra users survey

2011-11-11 Thread Aaron Turner
Oh, and one more thing: If you're doing a select and you get no results, then an indication of no columns or no rows matching would be nice. Kinda painful when you're typing in long strings and get no result, wonder why, only to find out you fat fingered your row key. :( On Fri, Nov 11, 2011 at

Re: Second Cassandra users survey

2011-11-11 Thread Aaron Turner
Lately I've been working on some data processing code in Cassandra and apparently I don't write bug-free code the very first time. :) Hence, while debugging, I often need to look at data in Cassandra to see what my code is doing/should be finding, etc. This turns out to be harder than it should be

Re: Second Cassandra users survey

2011-11-09 Thread Vijay
My wish list: 1) Conditional updates: if a column has a value then put column in the column family atomically else fail. 2) getAndSet: on counters: a separate API 3) Revert the count when client disconnects or receives an exception (so they can safely retry). 4) Something like a freeze API for upda

Re: Second Cassandra users survey

2011-11-09 Thread Jake Luciani
ence that had > already implemented what I mentioned. It didn't offer any atomicity, just > co-locating a family of data on the same node. > > From: Jake Luciani > Reply-To: "user@cassandra.apache.org" > Date: Wed, 9 Nov 2011 02:53:20 -0800 > To: "user@cassandr

Re: Second Cassandra users survey

2011-11-09 Thread Todd Burruss
.org" Date: Wed, 9 Nov 2011 02:53:20 -0800 To: "user@cassandra.apache.org" Subject: Re: Second Cassandra users survey Hi Todd,

Re: Second Cassandra users survey

2011-11-09 Thread Aaron Turner
I think this was already asked for, but you can add my vote for TTL support for Counters. On Tue, Nov 1, 2011 at 3:59 PM, Jonathan Ellis wrote: > Hi all, > > Two years ago I asked for Cassandra use cases and feature requests. > [1]  The results [2] have been extremely useful in setting and > prio

Re: Second Cassandra users survey

2011-11-09 Thread Jake Luciani
Hi Todd, Entity Groups : https://issues.apache.org/jira/browse/CASSANDRA-1684 -Jake On Wed, Nov 9, 2011 at 6:44 AM, Todd Burruss wrote: > I believe I heard someone talk at Cassandra SF conference about creating a > partitioner that was a derivation of RandomPartitioner. It essentially > would

Re: Second Cassandra users survey

2011-11-08 Thread Todd Burruss
A use case that could use this (but isn't in my top requests) is usage history for a given user. I use a single row to save history per user, each column is a user action with name a TimeUUID and value is a blob. I use the TimeUUID to sort the actions, but I don't really care about exact time. a

Re: Second Cassandra users survey

2011-11-08 Thread Todd Burruss
I believe I heard someone talk at Cassandra SF conference about creating a partitioner that was a derivation of RandomPartitioner. It essentially would look for keys that adhere to a certain pattern, like :. The portion would be used for determining the location on the ring, but : for actually s

Re: Second Cassandra users survey

2011-11-08 Thread Daniel Doubleday
Ah cool - thanks for the pointer! On Nov 7, 2011, at 5:25 PM, Ed Anuff wrote: > This is basically what entity groups are about - > https://issues.apache.org/jira/browse/CASSANDRA-1684 > > On Mon, Nov 7, 2011 at 5:26 AM, Peter Lin wrote: >> This feature interests me, so I thought I'd add some co

Re: Second Cassandra users survey

2011-11-07 Thread Colin Taylor
Decompression without compression (for lack of a better name). We store in Cassandra log batches that come in over HTTP either uncompressed, deflate-compressed, or Snappy-compressed. We just add a 'magic' prefix, e.g. \0 \s \n \a \p \p \y, to the column value so we can decode it when we serve it back up. Seems like Cassa
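
As a rough illustration of the magic-prefix idea in the message above, here is a minimal Python sketch. The prefix bytes, the function names `tag` and `decode`, and the use of zlib as the deflate codec are all assumptions made for the example; Snappy is left out because it is not in the Python standard library, and none of this is the poster's actual code.

```python
import zlib

# Hypothetical magic prefixes; the exact bytes used by the poster are not known.
MAGIC = {
    "raw": b"\x00raw",
    "deflate": b"\x00deflate",
}


def tag(value: bytes, encoding: str) -> bytes:
    """Prefix an incoming batch with a marker naming its encoding.

    The batch is stored exactly as it arrived; nothing is recompressed.
    """
    return MAGIC[encoding] + value


def decode(stored: bytes) -> bytes:
    """Strip the marker and return the uncompressed batch."""
    if stored.startswith(MAGIC["deflate"]):
        return zlib.decompress(stored[len(MAGIC["deflate"]):])
    if stored.startswith(MAGIC["raw"]):
        return stored[len(MAGIC["raw"]):]
    raise ValueError("unknown magic prefix")
```

The point of the request seems to be that the database could do the `decode` step itself, so that clients do not each need to know the prefix convention.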

Re: Second Cassandra users survey

2011-11-07 Thread Brian O'Neill
It should be dead-simple to build a slick GUI on the REST layer. (@Virgil ) I had planned to crank one out this week (using ExtJS) that mimicked the Squirrel/Toad look and feel. The UI would have a tree-panel of keyspaces and column families o

Re: Second Cassandra users survey

2011-11-07 Thread Daniel Doubleday
Well - given the example in our case the prefix that determines the endpoints where a token should be routed to could be something like a user-id so with key = "userid" + "." + "userthingid"; instead of // this is happening right now getEndpoints(hash(key)) you would have getEndpoints("user
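
The prefix-routing idea in the message above can be sketched in Python as follows. This is only an illustration under stated assumptions: `routing_token` and `full_key_token` are hypothetical names, MD5 is used as a stand-in hash, and the "." separator is taken from the key layout in the post. It is not Cassandra's partitioner code.

```python
import hashlib


def _md5_token(text: str) -> int:
    """Map a string to a large integer token (stand-in hash for the sketch)."""
    return int.from_bytes(hashlib.md5(text.encode("utf-8")).digest(), "big")


def full_key_token(key: str) -> int:
    """Current behaviour as described in the post: hash the whole row key."""
    return _md5_token(key)


def routing_token(key: str, separator: str = ".") -> int:
    """Proposed behaviour: hash only the prefix before the separator,
    so all rows sharing a user id map to the same position on the ring."""
    prefix = key.split(separator, 1)[0]
    return _md5_token(prefix)
```

With this scheme, `routing_token("user42.thingA")` and `routing_token("user42.thingB")` are equal, so both rows would be routed to the same replicas, while hashing the full key scatters them.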

Re: Second Cassandra users survey

2011-11-07 Thread Ian Danforth
> > > Wish list: A decent GUI to explore data kept in Cassandra would be very > valuable. It should also be extensible to > provide viewers for custom data. > > +1 to that. @jonathan - This is what google moderator is really good at. Perhaps start one and move the idea creation / voting there.

RE: Second Cassandra users survey

2011-11-07 Thread Deeter, Derek
-Original Message- From: Mohit Anchlia [mailto:mohitanch...@gmail.com] Sent: Sunday, November 06, 2011 10:58 AM To: user@cassandra.apache.org Subject: Re: Second Cassandra users survey Transparent on disk encryption with pluggable keyprovider will also be really helpful to secure sensitive i

Re: Second Cassandra users survey

2011-11-07 Thread Ed Anuff
This is basically what entity groups are about - https://issues.apache.org/jira/browse/CASSANDRA-1684 On Mon, Nov 7, 2011 at 5:26 AM, Peter Lin wrote: > This feature interests me, so I thought I'd add some comments. > > Having used partition features in existing databases like DB2, Oracle > and m

Re: Second Cassandra users survey

2011-11-07 Thread Jeremiah Jordan
- Batch read/slice from multiple column families. On 11/01/2011 05:59 PM, Jonathan Ellis wrote: Hi all, Two years ago I asked for Cassandra use cases and feature requests. [1] The results [2] have been extremely useful in setting and prioritizing goals for Cassandra development. But with the

Re: Second Cassandra users survey

2011-11-07 Thread Jeremiah Jordan
Actually, the data will be visible at QUORUM as well if you can see it with ONE. QUORUM actually gives you a higher chance of seeing the new value than ONE does. In the case of R=3 you have 2/3 chance of seeing the new value with QUORUM, with ONE you have 1/3... And this JIRA fixed an issue
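
The 2/3 versus 1/3 figures above can be checked by enumeration. The sketch below assumes that "R=3" means three replicas, that a failed write reached exactly one of them, and that a read contacts a uniformly random set of replicas and returns the newest value it sees. The function name is made up for the example.

```python
from fractions import Fraction
from itertools import combinations


def chance_of_seeing_new_value(replicas: int, updated: int, read_count: int) -> Fraction:
    """Probability that a read touching `read_count` of `replicas` nodes
    hits at least one of the `updated` nodes holding the new value."""
    has_new = set(range(updated))          # nodes 0..updated-1 hold the new value
    reads = list(combinations(range(replicas), read_count))
    hits = sum(1 for nodes in reads if has_new & set(nodes))
    return Fraction(hits, len(reads))


one = chance_of_seeing_new_value(replicas=3, updated=1, read_count=1)     # ONE
quorum = chance_of_seeing_new_value(replicas=3, updated=1, read_count=2)  # QUORUM
print(one, quorum)
```

Under those assumptions a QUORUM read covers two of the three replicas, and two of the three possible pairs include the updated node, which matches the figures quoted in the message.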

Re: Second Cassandra users survey

2011-11-07 Thread Radim Kolar
> So my question related to deterministic sharding is this, "what rebalance feature(s) would be useful or needed once the partitions get unbalanced?" In current Cassandra you can use "nodetool move" for rebalancing. It's a fast operation; a portion of the existing data is moved to the new server.

Re: Second Cassandra users survey

2011-11-07 Thread Flavio Baronti
We are using Cassandra for time series storage. Strong points: write performance. Pain points: dynamically adding column families as new time series come in. Caused a lot of headaches, mismatches between nodes, etc. In the end we just put everything together in a single (huge) column family. Wis

Re: Second Cassandra users survey

2011-11-07 Thread Peter Lin
This feature interests me, so I thought I'd add some comments. Having used partition features in existing databases like DB2, Oracle and manual partitioning, one of the biggest challenges is keeping the partitions balanced. What I've seen with manual partitioning is that often the partitions get u

Re: Second Cassandra users survey

2011-11-07 Thread Daniel Doubleday
Allow for deterministic / manual sharding of rows. Right now it seems that there is no way to force rows with different row keys to be stored on the same nodes in the ring. This is our number one reason why we get data inconsistencies when nodes fail. Sometimes a logical transaction requires w

Re: Second Cassandra users survey

2011-11-07 Thread Radim Kolar
Take a look at this: http://www.oracle.com/technetwork/database/nosqldb/overview/index.html > I understand the limitation/advantages of the architecture. Read this http://en.wikipedia.org/wiki/CAP_theorem

RE: Second Cassandra users survey

2011-11-06 Thread Pierre Chalamet
ber 07, 2011 8:02 AM To: user@cassandra.apache.org Subject: Re: Second Cassandra users survey > Yeah, I can use HBase too. But why are you not using HBase if its feature set fits your needs better, and why do you want the same functionality in Cassandra? It's good that both projects are different in

Re: Second Cassandra users survey

2011-11-06 Thread Radim Kolar
Yeah, I can use HBase too. But why are you not using HBase if its feature set fits your needs better, and why do you want the same functionality in Cassandra? It's good that both projects are different in this area. From the rest of your post it looks like you want Cassandra to be ACID compliant, which i

Re: Second Cassandra users survey

2011-11-06 Thread Robert Jackson
On Nov 6, 2011, at 3:41 PM, Ed Anuff wrote: > I'd like to see official support for Zookeeper inside of Cassandra. > I'd like it to be something that can be optionally configured. I'd > like to be able to make batch mutations atomic using it. Not sure how possible this is, but we are forced to u

Re: Second Cassandra users survey

2011-11-06 Thread Ed Anuff
On Sun, Nov 6, 2011 at 12:52 AM, Radim Kolar wrote: > - support for atomic operations or batches (if QUORUM fails, data should not > be visible with ONE) > zookeeper is solving that. I'd like to see official support for Zookeeper inside of Cassandra. I'd like it to be something that can be option

RE: Second Cassandra users survey

2011-11-06 Thread Pierre Chalamet
>>- support for atomic operations or batches (if QUORUM fails, data should not be visible with ONE) >zookeeper is solving that. Yeah, I can use HBase too. I might have screwed up a little bit since I didn't talk about isolation; let's reformulate: support for read committed (using DB terminolog

Re: Second Cassandra users survey

2011-11-06 Thread Mohit Anchlia
nd put". >> Did I miss something in my reading of intent? >> -Sarah >> >> -Original Message- >> From: Aaron Turner [mailto:synfina...@gmail.com] >> Sent: Sunday, November 06, 2011 8:25 AM >> To: user@cassandra.apache.org >> Subject: Re: Se

Re: Second Cassandra users survey

2011-11-06 Thread Aaron Turner
ed to leave such utilities external.  At its core was "get > and put". > Did I miss something in my reading of intent? > -Sarah > > -Original Message- > From: Aaron Turner [mailto:synfina...@gmail.com] > Sent: Sunday, November 06, 2011 8:25 AM > To: user@cass

RE: Second Cassandra users survey

2011-11-06 Thread Sarah Baker
omething in my reading of intent? -Sarah -Original Message- From: Aaron Turner [mailto:synfina...@gmail.com] Sent: Sunday, November 06, 2011 8:25 AM To: user@cassandra.apache.org Subject: Re: Second Cassandra users survey 1. Basic SQL-like summary transforms for both CQL and Thrift API c

Re: Second Cassandra users survey

2011-11-06 Thread Aaron Turner
1. Basic SQL-like summary transforms for both CQL and Thrift API clients like: SUM AVG MIN MAX 2. Native 64bit UNsigned datatype 3. Add support for matching column names via LIKE (% and _ wildcards) for ascii type -- Aaron Turner http://synfin.net/         Twitter: @synfinatic http://tcprep

Re: Second Cassandra users survey

2011-11-06 Thread Radim Kolar
- support for atomic operations or batches (if QUORUM fails, data should not be visible with ONE) zookeeper is solving that. - TTL on CF, rows and counters TTL on counters would be nice, but I am fine with the rest as it is

Re: Second Cassandra users survey

2011-11-05 Thread Brandon Williams
On Fri, Nov 4, 2011 at 9:50 PM, Jim Newsham wrote: > Our use case is time-series data (such as sampled sensor data).  Each row > describes a particular statistic over time, the column name is a time, and > the column value is the sample.  So it makes perfect sense to want to delete > columns for a

RE: Second Cassandra users survey

2011-11-05 Thread Pierre Chalamet
I missed something. - Pierre -Original Message- From: Jonathan Ellis [mailto:jbel...@gmail.com] Sent: mercredi 2 novembre 2011 00:00 To: user Subject: Second Cassandra users survey Hi all, Two years ago I asked for Cassandra use cases and feature requests. [1] The results [2] have been

Re: Second Cassandra users survey

2011-11-04 Thread Jim Newsham
On 11/4/2011 4:32 PM, Brandon Williams wrote: On Fri, Nov 4, 2011 at 9:19 PM, Jim Newsham wrote: - Bulk column deletion by (column name) range. Without this feature, we are forced to perform a range query and iterate over all of the columns, deleting them one by one (we do this in a batch, but

Re: Second Cassandra users survey

2011-11-04 Thread Brandon Williams
On Fri, Nov 4, 2011 at 9:19 PM, Jim Newsham wrote: > - Bulk column deletion by (column name) range.  Without this feature, we are > forced to perform a range query and iterate over all of the columns, > deleting them one by one (we do this in a batch, but it's still a very slow > approach).  See C

Re: Second Cassandra users survey

2011-11-04 Thread Jim Newsham
- Bulk column deletion by (column name) range. Without this feature, we are forced to perform a range query and iterate over all of the columns, deleting them one by one (we do this in a batch, but it's still a very slow approach). See CASSANDRA-494/3448. If anyone else has a need for this
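
To make the difference concrete, here is a small in-memory Python model of the two approaches: the one-by-one workaround described in the message and the requested single range delete. The `Row` class and its methods are hypothetical and only model the semantics; they say nothing about how Cassandra stores or removes data internally.

```python
class Row:
    """Toy model of a wide row: column name (e.g. a sample time) -> value."""

    def __init__(self, columns):
        self.columns = dict(columns)
        self.range_deletes = []  # recorded (start, end) markers, inclusive

    def delete_one_by_one(self, start, end) -> int:
        """Workaround: find every column in the range, delete each in turn.
        Returns the number of individual deletes issued."""
        doomed = [name for name in sorted(self.columns) if start <= name <= end]
        for name in doomed:
            del self.columns[name]
        return len(doomed)

    def delete_range(self, start, end) -> int:
        """Requested feature: record a single marker covering the range.
        Returns the number of operations issued (always one)."""
        self.range_deletes.append((start, end))
        return 1

    def live_columns(self) -> dict:
        """Columns not removed and not covered by any range marker."""
        return {
            name: value
            for name, value in self.columns.items()
            if not any(s <= name <= e for s, e in self.range_deletes)
        }
```

For a time-series row, the first method costs one delete per sample in the range, while the second costs one operation regardless of how many samples the range covers.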

Re: Second Cassandra users survey

2011-11-03 Thread Todd Burruss
- Better performance when accessing random columns in a wide row - caching subsets of wide rows - possibly on the same boundaries as the index - some sort of notification architecture when data is inserted. This could be co-processors, triggers, plugins, etc - auto load balance when adding new nodes

Re: Second Cassandra users survey

2011-11-03 Thread Konstantin Naryshkin
I realize that it is not realistic to expect it, but it would be good to have a Partitioner that supports both range slices and automatic load balancing. On Thu, Nov 3, 2011 at 13:57, Ertio Lew wrote: > Provide an option to sort columns by timestamp i.e, in the order they have > been added to the

Re: Second Cassandra users survey

2011-11-03 Thread Ertio Lew
Provide an option to sort columns by timestamp i.e, in the order they have been added to the row, with the facility to use any column names. On Wed, Nov 2, 2011 at 4:29 AM, Jonathan Ellis wrote: > Hi all, > > Two years ago I asked for Cassandra use cases and feature requests. > [1] The results

Re: Second Cassandra users survey

2011-11-03 Thread Peter Tillotson
To: user@cassandra.apache.org; Peter Tillotson Sent: Thursday, 3 November 2011, 14:15 Subject: Re: Second Cassandra users survey On Thu, Nov 3, 2011 at 5:46 AM, Peter Tillotson wrote: > I'm using Cassandra as a big graph database, loading large volumes of data > live and linking on the fl

Re: Second Cassandra users survey

2011-11-03 Thread Mohit Anchlia
Through GoldenOrb and Hadoop writables I managed to get both a BigTable and > Pregel access model onto my Cassandra data. It was schema specific, but > provided a local compute model. > p > ________ > From: Jonathan Ellis > To: user > Sent: Tuesday, 1 N

Re: Second Cassandra users survey

2011-11-03 Thread Radim Kolar
* Compaction is expensive Yes, it is. That's why I decided not to go with Hadoop HDFS backed by Cassandra.

Re: Second Cassandra users survey

2011-11-03 Thread Peter Tillotson
pute model.  p  From: Jonathan Ellis To: user Sent: Tuesday, 1 November 2011, 22:59 Subject: Second Cassandra users survey Hi all, Two years ago I asked for Cassandra use cases and feature requests. [1]  The results [2] have been extremely useful in setting and prioritizing goals for Cassan

Re: Second Cassandra users survey

2011-11-02 Thread Boris Yen
1. entity groups 2. cql support in cassandra-cli. 3. offset support in slice_range. 4. more sophisticated secondary index implementation. On Wed, Nov 2, 2011 at 8:38 PM, Patrick Julien wrote: > - entity groups > - co-processors > - materialized views > - CQL support directly in cassandra-cli > >

Re: Second Cassandra users survey

2011-11-02 Thread Patrick Julien
- entity groups - co-processors - materialized views - CQL support directly in cassandra-cli On Tue, Nov 1, 2011 at 6:59 PM, Jonathan Ellis wrote: > Hi all, > > Two years ago I asked for Cassandra use cases and feature requests. > [1]  The results [2] have been extremely useful in setting and > p

Re: Second Cassandra users survey

2011-11-01 Thread Ramesh Natarajan
Here is my wish list - I would love Cassandra to - provide an efficient method to retrieve the count of columns for a given row without resorting to reading all columns and calculating the count for a given row key. - support auto increment column names - Column slice based query doesn't take advant