compression

2012-09-23 Thread Tamar Fraenkel
Hi! In the DataStax documentation there is an explanation of which CFs are a good fit for compression: When to Use Compression Compression is best suited for column families where there are many rows, with each row having the same columns, or at leas

Re: Correct model

2012-09-23 Thread Marcelo Elias Del Valle
2012/9/20 aaron morton > I would consider: > > # User CF > * row_key: user_id > * columns: user properties, key=value > > # UserRequests CF > * row_key: where partition_start is the start > of a time partition that makes sense in your domain. e.g. partition > monthly. Generally want to avoid row

Re: Correct model

2012-09-23 Thread Hiller, Dean
But the only advantage in this solution is to split data among partitions? You need to split data among partitions or your query won't scale as more and more data is added to table. Having the partition means you are querying a lot less rows. What do you mean here by current partition? He mea
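
The partitioning idea discussed in this thread can be sketched roughly as follows. This is only an illustration, not code from the thread: the `user_id:YYYYMM` row-key format and the function names are assumptions, chosen to show how a monthly partition bounds the number of rows a query has to touch.

```python
from datetime import datetime


def request_row_key(user_id: str, ts: datetime) -> str:
    """Build a row key for the month that contains `ts`.

    The "user_id:YYYYMM" layout is a hypothetical format; the thread only
    says the key should include the start of a time partition.
    """
    return f"{user_id}:{ts.year:04d}{ts.month:02d}"


def row_keys_for_range(user_id: str, start: datetime, end: datetime) -> list:
    """List the row keys of every monthly partition overlapping [start, end]."""
    keys = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        keys.append(f"{user_id}:{year:04d}{month:02d}")
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return keys
```

A query for a date range then reads only the handful of rows returned by `row_keys_for_range`, however much older data the column family holds.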

found major difference in CQL vs Scalable SQL(PlayOrm) and question

2012-09-23 Thread Hiller, Dean
I have been digging more and more into CQL vs. PlayOrm S-SQL and found a major difference that is quite interesting (thought you might be interested, plus I have a question). CQL uses a composite row key with the prefix, so now any other tables that want to reference that entity have references to th

Re: batch_mutate and erlang

2012-09-23 Thread Tyler Hobbs
It's a pretty solid standard at this point. The large majority of client library work from this point on will be based on CQL. On Sun, Sep 23, 2012 at 12:45 AM, Bradford Toney wrote: > Yeah, I've seen how it's done in CQL3, I just wasn't sure if it was a solid > standard yet. I will probably go t

Re: compression

2012-09-23 Thread Tyler Hobbs
Due to repetition in the column metadata, you're still likely to get a reasonable amount of compression. This is especially true if there is some amount of repetition in the column names, values, or TTLs in wide rows. Compression will almost always be beneficial unless you're already somehow CPU b

Re: Cassandra Messages Dropped

2012-09-23 Thread Michael Theroux
There were no errors in the log (other than the messages dropped exception pasted below), and the node does recover. We have only a small number of secondary indexes (3 in the whole system). However, I went through the cassandra code, and I believe I've worked through this problem. Just to fi

Re: compression

2012-09-23 Thread Hiller, Dean
Also, your unlimited column names may all have the same prefix, right? Like "accounts".rowkey56, "accounts".rowkey78, etc., so the "accounts" prefix gets a ton of compression. Later, Dean From: Tyler Hobbs <ty...@datastax.com> Reply-To: "user@cassandra.apache.org

Secondary index loss on node restart

2012-09-23 Thread Michael Theroux
Hello, We have been noticing an issue where, about 50% of the time in which a node fails or is restarted, secondary indexes appear to be partially lost or corrupted. A drop and re-add of the index appears to correct the issue. There are no errors in the cassandra logs that I see. Part of the

Re: [problem with OOM in nodes]

2012-09-23 Thread aaron morton
> /var/log/cassandra$ cat system.log | grep "Compacting large" | grep -E > "[0-9]+ bytes" -o | cut -d " " -f 1 | awk '{ foo = $1 / 1024 / 1024 ; > print foo "MB" }' | sort -nr | head -n 50 > Is it bad signal? Sorry, I do not know what this is outputting. >> As I can see in cfstats, compacted r
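
For readers unsure what the quoted shell pipeline outputs, here is a rough Python equivalent. It is a sketch only: the function name is made up, and it assumes the log lines contain the size written as `<digits> bytes`, which is what the `grep -E` pattern in the pipeline looks for.

```python
import re

# Same pattern the pipeline's `grep -E "[0-9]+ bytes" -o` extracts.
BYTES_RE = re.compile(r"([0-9]+) bytes")


def largest_compacted_rows_mb(log_lines, limit=50):
    """Return the largest row sizes (in MB) from "Compacting large" log lines.

    Mirrors the pipeline: filter the lines, pull out each "<n> bytes" match,
    convert bytes to megabytes, sort descending, keep the top `limit`.
    """
    sizes_mb = []
    for line in log_lines:
        if "Compacting large" not in line:
            continue
        for size in BYTES_RE.findall(line):
            sizes_mb.append(int(size) / 1024 / 1024)
    return sorted(sizes_mb, reverse=True)[:limit]
```

Passing an open `system.log` file object as `log_lines` gives the same list the pipeline prints: the 50 largest rows reported during compaction, in megabytes.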

Re: any ways to have compaction use less disk space?

2012-09-23 Thread Віталій Тимчишин
If you think about space, use Leveled compaction! This will not only allow you to fill more space, but will also shrink your data much faster in case of updates. Size compaction can give you 3x-4x more space used than there is live data. Consider the following (our simplified) scenario: 1) The data is

Re: any ways to have compaction use less disk space?

2012-09-23 Thread Aaron Turner
On Sun, Sep 23, 2012 at 8:18 PM, Віталій Тимчишин wrote: > If you think about space, use Leveled compaction! This will not only allow you > to fill more space, but will also shrink your data much faster in case of > updates. Size compaction can give you 3x-4x more space used than there is > live data

Re: CQL 2, CQL 3 and Thrift confusion

2012-09-23 Thread Sylvain Lebresne
In CQL3, names are case insensitive by default, while they were case sensitive in CQL2. However, you can force whatever case you want in CQL3 by using double quotes. So in other words, in CQL3, USE "TestKeyspace"; should work as expected. -- Sylvain On Sun, Sep 23, 2012 at 9:22 PM, Oleksandr Petrov
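
The rule described in this reply can be modelled with a few lines of Python. This is a simplified illustration of the behaviour, not Cassandra's actual parser, and the function name is hypothetical: unquoted names are folded to lower case, while double-quoted names keep their case.

```python
def normalize_cql3_identifier(name: str) -> str:
    """Model how CQL3 treats an identifier's case (simplified sketch).

    A name wrapped in double quotes keeps its exact case; anything else
    is folded to lower case. Escaped quotes inside a name are ignored here.
    """
    if len(name) >= 2 and name.startswith('"') and name.endswith('"'):
        return name[1:-1]
    return name.lower()
```

Under this model `USE TestKeyspace;` looks up `testkeyspace`, while `USE "TestKeyspace";` looks up `TestKeyspace`, which matches the downcasing reported later in the thread.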

Re: Disk configuration in new cluster node

2012-09-23 Thread Aaron Turner
On Fri, Sep 21, 2012 at 2:05 AM, aaron morton wrote: >> Would it help if I partitioned the computing resources of my physical >> machines into VMs? > > No. > Just like cutting a cake into smaller pieces does not mean you can eat more > without getting fat. > > In the general case, regular HDD and

Re: Varchar indexed column and IN(...)

2012-09-23 Thread aaron morton
> If this is intended behavior, could somebody please point me to where this is > documented? It is intended. The docs don't make it totally clear though: syntax is: { = | < | > | <= | >= } IN ( [,...]) http://www.datastax.com/docs/1.1/references/cql/SELECT the key_value means only the pr

Re: Correct model

2012-09-23 Thread aaron morton
Yup. (Multi get is just a convenience method, it explodes into multiple gets on the server side. ) Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 24/09/2012, at 5:01 AM, "Hiller, Dean" wrote: > But the only advantage in this solution is

Re: Cassandra Messages Dropped

2012-09-23 Thread aaron morton
> To put in other words, Cassandra will lock down all tables until all pending > flush requests fit in the pending queue. This was the first issue I looked at in my Cassandra SF talk http://www.datastax.com/events/cassandrasummit2012/presentations I've seen it occur more often with lots-o-second

Re: Cassandra Messages Dropped

2012-09-23 Thread Michael Theroux
Love the Mars lander analogies :) On Sep 23, 2012, at 5:39 PM, aaron morton wrote: >> To put in other words, Cassandra will lock down all tables until all pending >> flush requests fit in the pending queue. > This was the first issue I looked at in my Cassandra SF talk > http://www.datastax.com

Cassandra simulator

2012-09-23 Thread Shankaranarayanan P N
Hi, Have there been any updates on the Cassandra simulator (https://issues.apache.org/jira/browse/CASSANDRA-561)? I have been trying to build it using Cassandra 0.4 (which I believe was the version the simulator was built with), but the build breaks in multiple places. I thought it would be useful

Re: Cassandra simulator

2012-09-23 Thread Tyler Hobbs
You might find these two projects useful: - ccm, which makes it easy to run a cluster on a single machine: https://github.com/pcmanus/ccm - Cassanova, which supports a large portion of the Thrift API with a lightweight python process: https://github.com/riptano/Cassanova On Sun, Sep 23, 2012 at

Re: [problem with OOM in nodes]

2012-09-23 Thread Denis Gabaydulin
On Sun, Sep 23, 2012 at 10:41 PM, aaron morton wrote: > /var/log/cassandra$ cat system.log | grep "Compacting large" | grep -E > "[0-9]+ bytes" -o | cut -d " " -f 1 | awk '{ foo = $1 / 1024 / 1024 ; > print foo "MB" }' | sort -nr | head -n 50 > > > Is it bad signal? > > Sorry, I do not know what

Re: CQL 2, CQL 3 and Thrift confusion

2012-09-23 Thread Oleksandr Petrov
Yup, that was exactly the cause. Somehow I could not figure out why it was downcasing my keyspace name all the time. It may be good to put it somewhere in the reference material with a more detailed explanation. On Sun, Sep 23, 2012 at 9:30 PM, Sylvain Lebresne wrote: > In CQL3, names are case insensiti

Re: compression

2012-09-23 Thread Tamar Fraenkel
Thanks all, that helps. Will start with one or two CFs and let you know the effect. Tamar Fraenkel, Senior Software Engineer, TOK Media ta...@tok-media.com Tel: +972 2 6409736 Mob: +972 54 8356490 Fax: +972 2 5612956 On Sun, Sep 23, 2012 at 8:21 PM, Hiller, Dean