Re: composite column validation_class question

2012-11-08 Thread Wei Zhu
Any thoughts? Thanks. -Wei From: Wei Zhu To: Cassandra usergroup Sent: Wednesday, November 7, 2012 12:47 PM Subject: composite column validation_class question Hi All, I am trying to design my schema using composite columns. One thing I am a bit confused

read request distribution

2012-11-08 Thread Wei Zhu
Hi All, I am running a benchmark on Cassandra. I have a three-node cluster with RF=3. I generated 6M rows with sequence numbers from 1 to 6M, so the rows should be evenly distributed among the three nodes, disregarding replicas. I am running the benchmark with read-only requests; I generate re
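
The claim that sequential row keys end up evenly spread across three nodes can be checked with a small sketch. This is an illustration under stated assumptions, not Cassandra's actual partitioner code: it assumes an MD5-based hash partitioner, a hypothetical ring of size 2**127 split into three equal ranges, and it ignores replication. The names `owner` and `distribution` are made up for this example.

```python
import hashlib
from collections import Counter

NODES = 3          # three-node cluster, as in the message above
RING = 2 ** 127    # assumed token space, split into equal ranges


def owner(key: str) -> int:
    """Return the index (0..NODES-1) of the node owning the key's primary range."""
    # Hash the key with MD5 and fold the digest into the assumed token space.
    token = int.from_bytes(hashlib.md5(key.encode()).digest(), "big") % RING
    # Each node owns one contiguous third of the ring.
    return token * NODES // RING


def distribution(n_rows: int) -> Counter:
    """Count how many of the keys "1".."n_rows" land on each node."""
    return Counter(owner(str(i)) for i in range(1, n_rows + 1))


if __name__ == "__main__":
    # A smaller sample than 6M keeps the run short; the shares should be close to 1/3 each.
    for node, count in sorted(distribution(60_000).items()):
        print(node, count)
```

Because the hash scrambles the sequence numbers, an even split of rows does not by itself say which node serves each read; that depends on how the client and coordinator route requests.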

Indexing Data in Cassandra with Elastic Search

2012-11-08 Thread Brian O'Neill
For those looking to index data in Cassandra with Elastic Search, here is what we decided to do: http://brianoneill.blogspot.com/2012/11/big-data-quadfecta-cassandra-storm.html -brian -- Brian ONeill Lead Architect, Health Market Science (http://healthmarketscience.com) mobile:215.588.6024 blog:

Re: Multiple keyspaces vs Multiple CFs

2012-11-08 Thread Edward Capriolo
It is not as bad with hector, but still each Keyspace object is another socket open to Cassandra. If you have 500 webservers and 10 keyspaces, instead of having 500 connections you now have 5000. On Thu, Nov 8, 2012 at 6:35 PM, sankalp kohli wrote: > I think this code is from the thrift part. I

Re: get_range_slice gets no rowcache support?

2012-11-08 Thread Manu Zhang
I did overlook something. get_range_slice will invoke cfs.getRawCachedRow instead of cfs.getThroughCache. Hence, no row will be cached if it's not present in the row cache. Well, this puzzles me further as to how the range of rows is expected to get stored into the row cache in the first place

Multiple Clusters Keyspaces to one core cluster

2012-11-08 Thread ws
If I have multiple clusters, can I replicate a keyspace from each of those clusters to a separate cluster?

unsubscribe

2012-11-08 Thread Jeremy McKay

Re: Loading data on-demand in Cassandra

2012-11-08 Thread sal
Pierre Chalamet writes: > > Hi, You do not need to have 700 GB of data in RAM. Cassandra is able to store on disk and query from there if data is not cached in memory. Caches are maintained by C* itself, but you still have to do some configuration. Supposing you want to store around

Re: Multiple keyspaces vs Multiple CFs

2012-11-08 Thread sankalp kohli
I think this code is from the thrift part. I use hector. In hector, I can create multiple keyspace objects for each keyspace and use them when I want to talk to that keyspace. Why will it need to do a round trip to the server for each switch? On Thu, Nov 8, 2012 at 3:28 PM, Edward Capriolo wrote:

Re: Multiple keyspaces vs Multiple CFs

2012-11-08 Thread Edward Capriolo
In the old days the API looked like this. client.insert("Keyspace1", key_user_id, new ColumnPath("Standard1", null, "name".getBytes("UTF-8")), "Chris Goffinet".getBytes("UTF-8"), timestamp, ConsistencyLevel.ONE); bu

Re: Multiple keyspaces vs Multiple CFs

2012-11-08 Thread sankalp kohli
I am a bit confused. One connection pool I know is the one which MessageService has to other nodes. Then there will be incoming connections via thrift from clients. How are they affected by multiple keyspaces? On Thu, Nov 8, 2012 at 3:14 PM, Edward Capriolo wrote: > Any connection pool. Imagine

Re: Multiple keyspaces vs Multiple CFs

2012-11-08 Thread Edward Capriolo
Any connection pool. Imagine if you have 10 column families in 10 keyspaces. You pull a connection off the pool and the odds are 1 in 10 of it being connected to the keyspace you want. So 9 out of 10 times you have to have a network round trip just to change the keyspace, or you have to build a key
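
The arithmetic in this thread can be written out as a short sketch. It assumes connections in a shared pool are bound to keyspaces uniformly at random, and that each web server holds one connection per keyspace pool; both are simplifications for illustration, and the function names are hypothetical.

```python
from fractions import Fraction


def switch_probability(keyspaces: int) -> Fraction:
    """Chance that a pooled connection is bound to the wrong keyspace,
    so that a round trip is needed to change it (uniform assumption)."""
    return Fraction(keyspaces - 1, keyspaces)


def total_connections(webservers: int, keyspaces: int, per_pool: int = 1) -> int:
    """Sockets opened when every web server keeps one pool per keyspace."""
    return webservers * keyspaces * per_pool


if __name__ == "__main__":
    print(switch_probability(10))        # 9/10
    print(total_connections(500, 1))     # 500
    print(total_connections(500, 10))    # 5000
```

With 10 keyspaces the two options trade off against each other: a shared pool pays a keyspace switch on 9 of 10 requests, while per-keyspace pools avoid the switch but multiply the socket count by 10.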

Re: Strange delay in query

2012-11-08 Thread Josep Blanquer
Can it be that you have tons and tons of tombstoned columns in the middle of these two? I've seen plenty of performance issues with wide rows littered with column tombstones (you could check by dumping the sstables...) Just a thought... Josep M. On Thu, Nov 8, 2012 at 12:23 PM, André Cruz wro

Read during digest mismatch

2012-11-08 Thread sankalp kohli
Hi, Let's say I am reading with consistency TWO and my replication is 3. The read is eligible for global read repair. It will send a request to get data from one node and a digest request to two. If there is a digest mismatch, what I am reading from the code looks like it will get the data from

Re: Multiple keyspaces vs Multiple CFs

2012-11-08 Thread sankalp kohli
Which connection pool are you talking about? On Thu, Nov 8, 2012 at 2:19 PM, Edward Capriolo wrote: > it is better to have one keyspace unless you need to replicate the > keyspaces differently. The main reason for this is that changing > keyspaces requires an RPC operation. Having 10 keyspaces w

Re: Multiple keyspaces vs Multiple CFs

2012-11-08 Thread Edward Capriolo
It is better to have one keyspace unless you need to replicate the keyspaces differently. The main reason for this is that changing keyspaces requires an RPC operation. Having 10 keyspaces would mean having 10 connection pools. On Thu, Nov 8, 2012 at 4:59 PM, sankalp kohli wrote: > Is it better t

Multiple keyspaces vs Multiple CFs

2012-11-08 Thread sankalp kohli
Is it better to have 10 keyspaces with 10 CFs in each keyspace, or 100 keyspaces with 1 CF each? I am talking in terms of memory footprint. Also, I would be interested to know how much better one is than the other. Thanks, Sankalp

Re: Hinted Handoff runs every ten minutes

2012-11-08 Thread Mike Heffner
Is there a ticket open for this for 1.1.6? We also noticed this after upgrading from 1.1.3 to 1.1.6. Every node runs a 0 row hinted handoff every 10 minutes. N-1 nodes hint to the same node, while that node hints to another node. On Tue, Oct 30, 2012 at 1:35 PM, Vegard Berget wrote: > Hi, > >

Re: leveled compaction and tombstoned data

2012-11-08 Thread B. Todd Burruss
@ben, thx, we will be deploying 2.2.1 of DSE soon and will try to setup a traffic sampling node so we can test leveled compaction. we essentially keep a rolling window of data written once. it is written, then after N days it is deleted, so it seems that leveled compaction should help On Thu, No

Re: Strange delay in query

2012-11-08 Thread André Cruz
These are the two columns in question: => (super_column=13957152-234b-11e2-92bc-e0db550199f4, (column=attributes, value=, timestamp=1351681613263657) (column=blocks, value=A4edo5MhHvojv3Ihx_JkFMsF3ypthtBvAZkoRHsjulw06pez86OHch3K3OpmISnDjHODPoCf69bKcuAZSJj-4Q, timestamp=1351681613263657

Re: leveled compaction and tombstoned data

2012-11-08 Thread B. Todd Burruss
thanks for the links! i had forgotten about live sampling On Thu, Nov 8, 2012 at 11:41 AM, Brandon Williams wrote: > On Thu, Nov 8, 2012 at 1:33 PM, Aaron Turner wrote: >> There are also ways to bring up a test node and just run Level Compaction on >> that. Wish I had a URL handy, but hopefull

Re: leveled compaction and tombstoned data

2012-11-08 Thread Ben Coverston
Also to answer your question, LCS is well suited to workloads where overwrites and tombstones come into play. The tombstones are _much_ more likely to be merged with LCS than STCS. I would be careful with the patch that was referred to above, it hasn't been reviewed, and from a glance it appears t

Re: leveled compaction and tombstoned data

2012-11-08 Thread Brandon Williams
On Thu, Nov 8, 2012 at 1:33 PM, Aaron Turner wrote: > There are also ways to bring up a test node and just run Level Compaction on > that. Wish I had a URL handy, but hopefully someone else can find it. This rather handsome fellow wrote a blog about it: http://www.datastax.com/dev/blog/whats-new

Re: leveled compaction and tombstoned data

2012-11-08 Thread Ben Coverston
http://www.datastax.com/docs/1.1/operations/tuning#testing-compaction-and-compression Write Survey mode. After you have it up and running you can modify the column family mbean to use LeveledCompactionStrategy on that node to see how your hardware/load fares with LCS. On Thu, Nov 8, 2012 at 11:

Re: leveled compaction and tombstoned data

2012-11-08 Thread Jeremy Hanna
LCS works well in specific circumstances; this blog post gives some good considerations: http://www.datastax.com/dev/blog/when-to-use-leveled-compaction On Nov 8, 2012, at 1:33 PM, Aaron Turner wrote: > "kill performance" is relative. Leveled Compaction basically costs 2x disk > IO. Look at

Re: leveled compaction and tombstoned data

2012-11-08 Thread Aaron Turner
"kill performance" is relative. Leveled Compaction basically costs 2x disk IO. Look at iostat, etc and see if you have the headroom. There are also ways to bring up a test node and just run Level Compaction on that. Wish I had a URL handy, but hopefully someone else can find it. Also, if you'r

Re: leveled compaction and tombstoned data

2012-11-08 Thread B. Todd Burruss
we are running Datastax enterprise and cannot patch it. how bad is "kill performance"? if it is so bad, why is it an option? On Thu, Nov 8, 2012 at 10:17 AM, Radim Kolar wrote: > On 8.11.2012 19:12, B. Todd Burruss wrote: > >> my question is would leveled compaction help to get rid of the

Kundera 2.2 released

2012-11-08 Thread Amresh Kumar Singh
Hi All, We are happy to announce the release of Kundera 2.2. Kundera is a JPA 2.0 based, object-datastore mapping library for NoSQL datastores. The idea behind Kundera is to make working with NoSQL databases drop-dead simple and fun. It currently supports Cassandra, HBase, MongoDB and relational da

Re: How to insert composite column in CQL3?

2012-11-08 Thread Alan Ristić
Ok, this article answered all the confusion in my head: http://www.datastax.com/dev/blog/thrift-to-cql3 It's a must read for noobs (like me). It perfectly explains mappings and diffs between internals and CQL3(abstractions). First read this and THEN go study all the resources out there ;) Lp, Ala

Re: leveled compaction and tombstoned data

2012-11-08 Thread Radim Kolar
On 8.11.2012 19:12, B. Todd Burruss wrote: my question is would leveled compaction help to get rid of the tombstoned data faster than size tiered, and therefore reduce the disk space usage? leveled compaction will kill your performance. get patch from jira for maximum sstable size per CF

leveled compaction and tombstoned data

2012-11-08 Thread B. Todd Burruss
we are having the problem where we have huge SSTABLEs with tombstoned data in them that is not being compacted soon enough (because size tiered compaction requires, by default, 4 like sized SSTABLEs). this is using more disk space than we anticipated. we are very write heavy compared to reads, an
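
Why one huge SSTable can sit uncompacted under size-tiered compaction can be shown with a simplified sketch. It is not Cassandra's implementation: the bucketing rule (a table joins a bucket when its size is within 0.5x to 1.5x of the bucket average) is an assumption made for illustration, only the threshold of 4 like-sized SSTables comes from the message above, and the function names are hypothetical.

```python
def bucket_sstables(sizes, low=0.5, high=1.5):
    """Group SSTable sizes into buckets of similar size (assumed 0.5x-1.5x rule)."""
    buckets = []
    for size in sorted(sizes):
        for bucket in buckets:
            avg = sum(bucket) / len(bucket)
            if low * avg <= size <= high * avg:
                bucket.append(size)
                break
        else:
            # No existing bucket is close enough in size, so start a new one.
            buckets.append([size])
    return buckets


def compactable(sizes, min_threshold=4):
    """Return only the buckets that hold enough like-sized SSTables to compact."""
    return [b for b in bucket_sstables(sizes) if len(b) >= min_threshold]


if __name__ == "__main__":
    # Sizes in MB: four small tables and one huge one full of tombstones.
    print(compactable([100, 110, 95, 105, 5000]))  # [[95, 100, 105, 110]]
```

In this toy model the 5000 MB table is only compacted once three more tables of similar size exist, which is why its tombstones can linger and keep using disk space.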

Re: Strange delay in query

2012-11-08 Thread Andrey Ilinykh
What is the size of columns? Probably those two are huge. On Thu, Nov 8, 2012 at 4:01 AM, André Cruz wrote: > On Nov 7, 2012, at 12:15 PM, André Cruz wrote: > > > This error also happens on my application that uses pycassa, so I don't > think this is the same bug. > > I have narrowed it down t

How to insert composite column in CQL3?

2012-11-08 Thread Alan Ristić
Hi there! I'm struggling to figure out (for quite a few hours now) how I can insert, for example, a column with a TimeUUID name and an empty value in CQL3 in a fictional table. And what's the table design? I'm interested in syntax (e.g. example). I'm trying to do something like Matt Dennis did here (*Cassandr

Re: Compact and Repair

2012-11-08 Thread Henrik Schröder
No, we haven't changed RF, but it's been a very long time since we repaired last, so we're guessing this is an effect of not running repair regularly, and that doing it regularly will fix it. It would just be nice to know. Also, running major compaction after the repair made the data size shrink b

Re: Compact and Repair

2012-11-08 Thread Alain RODRIGUEZ
Did you change the RF or have a node down since you last repaired? 2012/11/8 Henrik Schröder > No, we're not using columns with TTL, and I performed a major compaction > before the repair, so there shouldn't be vast amounts of tombstones moving > around. > > And the increase happened durin

Re: Compact and Repair

2012-11-08 Thread Henrik Schröder
No, we're not using columns with TTL, and I performed a major compaction before the repair, so there shouldn't be vast amounts of tombstones moving around. And the increase happened during the repair, the nodes gained ~20-30GB each. /Henrik On Thu, Nov 8, 2012 at 12:40 PM, horschi wrote: > H

Re: Strange delay in query

2012-11-08 Thread André Cruz
On Nov 7, 2012, at 12:15 PM, André Cruz wrote: > This error also happens on my application that uses pycassa, so I don't think > this is the same bug. I have narrowed it down to a slice between two consecutive columns. Observe this behaviour using pycassa: >>> DISCO_CASS.col_fam_nsrev.get(uui

Compact and Repair

2012-11-08 Thread Henrik Schröder
Hi, We recently ran a major compaction across our cluster, which reduced the storage used by about 50%. This is fine, since we do a lot of updates to existing data, so that's the expected result. The day after, we ran a full repair -pr across the cluster, and when that finished, each storage node

Storage limit for a particular user on Cassandra

2012-11-08 Thread mallikharjun.vemana
Hi, Is there a way we can limit the data of a particular user on the Cassandra cluster? Say for example, I have three users namely, Jsmith, Elvis, Dilbert configured in my Cassandra deployment. And I wanted to limit the data usage for them as follows. Jsmith - 1 GB Elvis - 2 GB Dilbert - 500 M

Re: can't start cqlsh on new Amazon node

2012-11-08 Thread Tamar Fraenkel
Hi, A bit more info on that: I have one working setup with python-cql 1.0.9-1, python-thrift 0.6.0-2~riptano1, cassandra 1.0.8. The setup where cqlsh is not working has: python-cql 1.0.10-1, python-thrift 0.6.0-2~riptano1, cassandra 1.0.11. Maybe this will give someone a hint of what the pr