Re: Correct way to set strategy options in cqlsh?

2012-05-22 Thread Romain HARDOUIN
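You *must* remove the hyphen. According to the CQL 2.0 documentation, here is the correct syntax to create a keyspace: <createKeyspaceStatement> ::= "CREATE" "KEYSPACE" <name> "WITH" <optionName> "=" <optionVal> ( "AND" <optionName> "=" <optionVal> )* ; <optionName> ::= <identifier> | <optionName> ":"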
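
As a rough illustration of the grammar quoted above, the sketch below assembles a CREATE KEYSPACE statement from option pairs and rejects option names that contain a hyphen. The helper name `build_create_keyspace`, the identifier pattern, and the data center name `DC1` are assumptions made for illustration; they are not part of CQL or cqlsh.

```python
import re

# Assumed identifier shape: a letter followed by letters, digits, or underscores.
_IDENT = re.compile(r"^[A-Za-z][A-Za-z0-9_]*$")


def build_create_keyspace(name, options):
    """Assemble CREATE KEYSPACE <name> WITH <opt> = <val> (AND <opt> = <val>)*;

    options is a list of (option_name, value) pairs (hypothetical input shape).
    """
    clauses = []
    for opt, val in options:
        # Treat an option name as one or more identifiers joined by ':'.
        for part in opt.split(":"):
            if not _IDENT.match(part):
                raise ValueError(f"invalid identifier in option name: {part!r}")
        # Integers are written bare, everything else as a quoted string.
        rendered = str(val) if isinstance(val, int) else f"'{val}'"
        clauses.append(f"{opt} = {rendered}")
    if not clauses:
        raise ValueError("at least one option is required")
    return f"CREATE KEYSPACE {name} WITH " + " AND ".join(clauses) + ";"
```

With this sketch, an option name such as `strategy_options:us-east` raises a `ValueError`, while `strategy_options:DC1` is accepted.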

Re: Using EC2 ephemeral 4disk raid0 cause high iowait trouble

2012-05-22 Thread Tamar Fraenkel
Did you upgrade DataStax AMIs? Did you add a node to an existing ring? Thanks *Tamar Fraenkel* Senior Software Engineer, TOK Media ta...@tok-media.com Tel: +972 2 6409736 Mob: +972 54 8356490 Fax: +972 2 5612956 On Wed, May 23, 2012 at 2:00 AM, Deno Vichas wrot

how to get list of snapshots

2012-05-22 Thread Илья Шипицин
Hello! I'm about to schedule backups in the following way: a) snapshots are taken daily, b) incremental backups are enabled. So the backup will be consistent; very old snapshots must be removed (I guess a depth of one week should be enough). A couple of questions: 1) is there any good guide for scheduling back
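
The retention rule described above (keep roughly a week of daily snapshots) can be sketched as a small selection function. The snapshot names and the idea of pairing each name with its creation time are assumptions for illustration; the sketch does not call nodetool or touch the filesystem.

```python
from datetime import datetime, timedelta


def snapshots_to_remove(snapshots, now, keep_days=7):
    """Return the names of snapshots older than keep_days, oldest first.

    snapshots: iterable of (name, created_at) pairs (hypothetical inventory).
    """
    cutoff = now - timedelta(days=keep_days)
    expired = [(created, name) for name, created in snapshots if created < cutoff]
    return [name for _, name in sorted(expired)]
```

A snapshot taken exactly `keep_days` ago is kept, since the comparison is strict.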

Re: supercolumns with TTL columns not being compacted correctly

2012-05-22 Thread samal
Thanks, I didn't know about the two-stage removal process. On 23-May-2012 2:20 AM, "Jonathan Ellis" wrote: > Correction: the first compaction after expiration + gcgs can remove > it, even if it hasn't been turned into a tombstone previously. > > On Tue, May 22, 2012 at 9:37 AM, Jonathan Ellis wrote: > > A

Re: Cassandra 0.8.5: Column name mystery in create column family command

2012-05-22 Thread samal
I am not able to reproduce this in the cli. On 22-May-2012 8:12 PM, "Roshan Dawrani" wrote: > Can you please let me know why? Because I have created very similar column > families many times with comparator = BytesType, and never run into this > issue before. > > Here is an example: > > --

Confusion regarding the terms "replica" and "replication factor"

2012-05-22 Thread java jalwa
Hi all, I am a bit confused regarding the terms "replica" and "replication factor". Assume that I am using RandomPartitioner and NetworkTopologyStrategy for replica placement. From what I understand, with a RandomPartitioner, a row key will always be hashed and be stored on the node
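
To make the two terms concrete, here is a deliberately simplified sketch: the key is hashed with MD5 to a token, the first node whose token is at or past the key's token holds the first replica, and the walk continues around the ring until "replication factor" nodes have been chosen. Each chosen node is one "replica". This ring walk ignores racks and data centers, so it is not how NetworkTopologyStrategy places replicas; the node names and tokens are made up, and the token computation is only an approximation of what RandomPartitioner does.

```python
import hashlib
from bisect import bisect_left


def md5_token(key: bytes) -> int:
    """Approximate a RandomPartitioner-style token: absolute value of the MD5 digest."""
    return abs(int.from_bytes(hashlib.md5(key).digest(), "big", signed=True))


def replicas_for(token, ring, rf):
    """Pick rf nodes for a token by walking the ring clockwise.

    ring: list of (node_token, node_name) pairs sorted by token (hypothetical).
    """
    tokens = [t for t, _ in ring]
    # First node whose token is >= the key token, wrapping to the start of the ring.
    start = bisect_left(tokens, token) % len(ring)
    count = min(rf, len(ring))
    return [ring[(start + i) % len(ring)][1] for i in range(count)]
```

With a replication factor of 3 on a four-node ring, every row has three replicas, so three of the four nodes hold a copy of any given row.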

Re: Number of keyspaces

2012-05-22 Thread Franc Carter
On Wed, May 23, 2012 at 7:42 AM, aaron morton wrote: > 1 KS with 24 CF's will use roughly the same resources as 24 KS's with 1 > CF. Each CF: > > * loads the bloom filter for each SSTable > * samples the index for each sstable > * uses row and key cache > * has a current memtable and potentially m

Re: exception when cleaning up...

2012-05-22 Thread Boris Yen
Hi Aaron, Rob, Thanks for the information, I will try it. Regards, Boris On Tue, May 22, 2012 at 11:47 PM, Rob Coli wrote: > On Tue, May 22, 2012 at 3:00 AM, aaron morton > wrote: > > 1) Isolating the node from the cluster to stop write activity. You can > > either start the node with the -Dc

Re: Using EC2 ephemeral 4disk raid0 cause high iowait trouble

2012-05-22 Thread koji Lin
Hi Thanks for your information, we will try that. koji 2012/5/23 Deno Vichas > for what it's worth i've been having pretty good success using the > Datastax AMIs. > > > > On 5/17/2012 6:59 PM, koji Lin wrote: > > Hi > > We use amazon ami 3.2.12-3.2.4.amzn1.x86_64 > > and some of our data file

Re: Using EC2 ephemeral 4disk raid0 cause high iowait trouble

2012-05-22 Thread koji Lin
Hi I think amazon ami is based on RHEL. thank you 2012/5/21 aaron morton > Are you using the Ubuntu operating system ? > > Cheers > > - > Aaron Morton > Freelance Developer > @aaronmorton > http://www.thelastpickle.com > > On 18/05/2012, at 1:59 PM, koji Lin wrote: > > Hi > >

Re: schema fail to load on some nodes

2012-05-22 Thread Yiming Sun
It indeed looks almost the same, except in our case, we are only using UTF8Type. Hopefully when they release 1.1.1, all will be fixed. Thanks for making me aware of this issue, Tyler. -- Y. On Tue, May 22, 2012 at 7:28 PM, Tyler Hobbs wrote: > Looks like this: https://issues.apache.org/jira/b

Re: schema fail to load on some nodes

2012-05-22 Thread Tyler Hobbs
Looks like this: https://issues.apache.org/jira/browse/CASSANDRA-4269 On Tue, May 22, 2012 at 4:10 PM, Yiming Sun wrote: > Hi, > > We are setting up a 6-node cassandra cluster within one data center. 3 in > rack1 and the other 3 in rack2. The tokens are assigned alternating > between rack 1 an

Re: Using EC2 ephemeral 4disk raid0 cause high iowait trouble

2012-05-22 Thread Deno Vichas
for what it's worth i've been having pretty good success using the Datastax AMIs. On 5/17/2012 6:59 PM, koji Lin wrote: Hi We use amazon ami 3.2.12-3.2.4.amzn1.x86_64 and some of our data files are more than 10G thanks koji 2012-5-16 6:00 PM, "aaron morton"

Re: Number of keyspaces

2012-05-22 Thread aaron morton
1 KS with 24 CF's will use roughly the same resources as 24 KS's with 1 CF. Each CF: * loads the bloom filter for each SSTable * samples the index for each sstable * uses row and key cache * has a current memtable and potentially memtables waiting to flush. * has secondary index CF's I would gen

Replication factor

2012-05-22 Thread Daning Wang
Hello, What are the pros and cons of choosing a different replication factor in terms of performance, if space is not a concern? For example, if I have a 4-node cluster in one data center, how does RF=2 vs RF=4 affect read performance? If consistency level is ONE, it looks like reading does not need to

schema fail to load on some nodes

2012-05-22 Thread Yiming Sun
Hi, We are setting up a 6-node cassandra cluster within one data center. 3 in rack1 and the other 3 in rack2. The tokens are assigned alternating between rack 1 and rack 2. There is one seed node in each rack. Below is the ring: r1-node1DC1 r1 0 (seed) r2-node1DC1
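
A layout like the one described, with tokens assigned alternately to the two racks, can be sketched as follows. The evenly spaced tokens use the usual RandomPartitioner formula i * 2^127 / N; the node naming scheme is an assumption modelled loosely on the names in the message, not taken from the actual cluster.

```python
def balanced_tokens(node_count, ring_size=2**127):
    """Evenly spaced initial tokens for node_count nodes."""
    return [i * ring_size // node_count for i in range(node_count)]


def alternate_racks(tokens, racks=("r1", "r2")):
    """Assign consecutive tokens to racks in rotation.

    Returns (node_name, rack, token) tuples; the node names are hypothetical.
    """
    layout = []
    for i, token in enumerate(tokens):
        rack = racks[i % len(racks)]
        node = f"{rack}-node{i // len(racks) + 1}"
        layout.append((node, rack, token))
    return layout
```

Calling `alternate_racks(balanced_tokens(6))` yields six entries whose racks alternate r1, r2, r1, r2, r1, r2 as the token increases.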

Re: supercolumns with TTL columns not being compacted correctly

2012-05-22 Thread Jonathan Ellis
Correction: the first compaction after expiration + gcgs can remove it, even if it hasn't been turned into a tombstone previously. On Tue, May 22, 2012 at 9:37 AM, Jonathan Ellis wrote: > Additionally, it will always take at least two compaction passes to > purge an expired column: one to turn it
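
The timing in this correction can be worked through with plain arithmetic: a column becomes eligible for removal once its write time plus its TTL plus gc_grace_seconds (gcgs) has passed, and the first compaction after that point can drop it. The sketch below only models that time check. The function names are hypothetical, and other conditions, such as which sstables take part in the compaction, are not modelled.

```python
def earliest_purge_time(write_time_s, ttl_s, gc_grace_s):
    """Earliest time (in seconds) at which a compaction could remove the column."""
    return write_time_s + ttl_s + gc_grace_s


def can_purge(now_s, write_time_s, ttl_s, gc_grace_s):
    """True once expiration + gcgs has passed (time check only)."""
    return now_s >= earliest_purge_time(write_time_s, ttl_s, gc_grace_s)
```

For example, with a TTL of 2678400 seconds (31 days) and a gc_grace_seconds of 864000 (10 days, used here as an assumed value), a column written at time 0 becomes eligible at 3542400 seconds, which is 41 days.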

RE: 1.1 not removing commit log files?

2012-05-22 Thread Bryce Godfrey
The nodes appear to be holding steady at the 8G that I set it to in the config file now. I'll keep an eye on them. From: aaron morton [mailto:aa...@thelastpickle.com] Sent: Tuesday, May 22, 2012 4:08 AM To: user@cassandra.apache.org Subject: Re: 1.1 not removing commit log files? 4096 is also t

Re: Astyanax Error

2012-05-22 Thread Eran Landau
This is normally the result of not having the client properly set up to talk to Cassandra. Can you send a code snippet of how you are initializing Astyanax? - Eran From: Abhijit Chanda <abhijit.chan...@gmail.com> Reply-To: <user@cassandra.apache.org> Date: Tue, 22 May 2012 16:33:

unknown exception with hector

2012-05-22 Thread Deno Vichas
could somebody clue me in to the cause of this exception? i see these randomly. AnalyzerService-2 2012-05-22 13:28:00,385 :: WARN cassandra.connection.HConnectionManager - Exception: me.prettyprint.hector.api.exceptions.HectorTransportException: org.apache.thrift.transport.TTransportExcepti

Re: Welcome committers Dave Brosius and Yuki Morishita!

2012-05-22 Thread Sylvain Lebresne
Thanks for all the work and congratulations to both of you. -- Sylvain On Tue, May 22, 2012 at 5:06 PM, Edward Capriolo wrote: > Congrats! > > On Tue, May 22, 2012 at 10:43 AM, Jonathan Ellis wrote: >> Thanks to both of you for your help! >> >> -- >> Jonathan Ellis >> Project Chair, Apache Cass

Re: exception when cleaning up...

2012-05-22 Thread Rob Coli
On Tue, May 22, 2012 at 3:00 AM, aaron morton wrote: > 1) Isolating the node from the cluster to stop write activity. You can > either start the node with the -Dcassandra.join_ring=false  JVM option or > use nodetool disablethrift and disablegossip to stop writes. Note that this > will not stop ex

Re: Number of keyspaces

2012-05-22 Thread Luís Ferreira
I have 24 keyspaces, each with a column family, and am considering changing it to 1 keyspace with 24 CFs. Would this be beneficial? On May 22, 2012, at 12:56 PM, samal wrote: > Not ideally, now cass has global memtable tuning. Each cf correspond to > memory in ram. Year wise cf means it will be

Re: Correct way to set strategy options in cqlsh?

2012-05-22 Thread Damick, Jeffrey
Thanks, but that would be for the cli, not cqlsh CREATE KEYSPACE something ... WITH strategy_class = 'NetworkTopologyStrategy' AND strategy_options={us-east:1}; Invalid syntax at line 2, char 72 WITH strategy_class = 'NetworkTopologyStrategy' AND strategy_options={us-eas

Re: Correct way to set strategy options in cqlsh?

2012-05-22 Thread Yiming Sun
AND strategy_options={us-east:1, us-west:1}; On Tue, May 22, 2012 at 11:10 AM, Damick, Jeffrey < jeffrey.dam...@neustar.biz> wrote: > What’s the correct way to set the strategy options for the > networktopologystrategy with cqlsh? > > I’ve tried several variations, but what’s expected way to e

Correct way to set strategy options in cqlsh?

2012-05-22 Thread Damick, Jeffrey
What's the correct way to set the strategy options for the NetworkTopologyStrategy with cqlsh? I've tried several variations, but what's the expected way to escape the hyphen in "us-east"? Thanks, -jeff CREATE KEYSPACE something ... WITH strategy_class = 'NetworkTopologyStrategy' ... AN

Re: Welcome committers Dave Brosius and Yuki Morishita!

2012-05-22 Thread Edward Capriolo
Congrats! On Tue, May 22, 2012 at 10:43 AM, Jonathan Ellis wrote: > Thanks to both of you for your help! > > -- > Jonathan Ellis > Project Chair, Apache Cassandra > co-founder of DataStax, the source for professional Cassandra support > http://www.datastax.com

Re: Cassandra 0.8.5: Column name mystery in create column family command

2012-05-22 Thread Roshan Dawrani
Can you please let me know why? Because I have created very similar column families many times with comparator = BytesType, and never ran into this issue before. Here is an example: ColumnFamily: Client Key Validation Class: or

Re: supercolumns with TTL columns not being compacted correctly

2012-05-22 Thread Jonathan Ellis
Additionally, it will always take at least two compaction passes to purge an expired column: one to turn it into a tombstone, and a second (after gcgs) to remove it. On Tue, May 22, 2012 at 9:21 AM, Yuki Morishita wrote: > Data will not be deleted when those keys appear in other sstables outside o

Re: supercolumns with TTL columns not being compacted correctly

2012-05-22 Thread Yuki Morishita
Data will not be deleted when those keys appear in other sstables outside of the compaction. This is to prevent obsolete data from appearing again. yuki On Tuesday, May 22, 2012 at 7:37 AM, Pieter Callewaert wrote: > > Hi Samal, > > > > > > Thanks for your time looking into this. > >

Re: Cassandra 0.8.5: Column name mystery in create column family command

2012-05-22 Thread samal
Change your comparator to UTF8Type. On 22-May-2012 4:32 PM, "Roshan Dawrani" wrote: > Hi, > > I use Cassandra 0.8.5 and am suddenly noticing some strange behavior. I > run a "create column family" command with some column meta-data and it runs > fine, but when I do "describe keyspace", it shows m

Re: Tuning cassandra (compactions overall)

2012-05-22 Thread Alain RODRIGUEZ
"not sure what you mean by And after restarting the second one I have lost all the consistency of my data. All my statistics since September are totally false now in production Can you give some examples?" After restarting my 2 nodes (one after the other), All my counters have become wrong. The c

Re: Astyanax Error

2012-05-22 Thread samal
Are you able to connect through cli? Can you share your client code? On 22-May-2012 5:59 PM, "Abhijit Chanda" wrote: > Samal, > > > But I am setting up the Host. > > On Tue, May 22, 2012 at 5:30 PM, samal wrote: > >> Host not found in client. >> On 22-May-2012 4:34 PM, "Abhijit Chanda" >> wrote

RE: supercolumns with TTL columns not being compacted correctly

2012-05-22 Thread Pieter Callewaert
Hi Samal, Thanks for your time looking into this. I force the compaction by using forceUserDefinedCompaction on only that particular sstable. This guarantees that the new sstable being written only contains the data from the old sstable. The data in the sstable is more than 31 days old and gc_grac

Re: supercolumns with TTL columns not being compacted correctly

2012-05-22 Thread samal
Data will remain until the next compaction but won't be available. Compaction will delete the old sstable and create a new one. On 22-May-2012 5:47 PM, "Pieter Callewaert" wrote: > Hi, > > I’ve had my suspicions some months, but I think I am sure about it. > > Data is being written by the SST

Re: Astyanax Error

2012-05-22 Thread Abhijit Chanda
Samal, But I am setting up the Host. On Tue, May 22, 2012 at 5:30 PM, samal wrote: > Host not found in client. > On 22-May-2012 4:34 PM, "Abhijit Chanda" > wrote: > >> Hi All, >> >> Can any one suggest me why i am getting this error in Astyanax >> NoAvailableHostsException: [host=None(0.0.0.0

Re: RE Ordering counters in Cassandra

2012-05-22 Thread samal
Secondary indexes are not supported for counters. In addition, you must know the column name to use a secondary index on a regular column. On 22-May-2012 5:34 PM, "Filippo Diotalevi" wrote: > Thanks for all the answers, they definitely helped. > > Just out of curiosity, is there any underlying architectural rea

supercolumns with TTL columns not being compacted correctly

2012-05-22 Thread Pieter Callewaert
Hi, I've had my suspicions for some months, but now I think I am sure about it. Data is being written by the SSTableSimpleUnsortedWriter and loaded by the sstableloader. The data should be alive for 31 days, so I use the following logic: int ttl = 2678400; long timestamp = System.currentTimeMillis() * 1
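
The TTL value quoted above is simply 31 days expressed in seconds. The short sketch below reproduces that arithmetic and shows when a column written at a given instant would expire; the helper names are hypothetical, and the timestamp handling from the original snippet is not reproduced because it is cut off in the preview.

```python
from datetime import datetime, timedelta, timezone

SECONDS_PER_DAY = 24 * 60 * 60


def ttl_for_days(days):
    """Convert a retention period in days to a TTL in seconds."""
    return days * SECONDS_PER_DAY


def expires_at(written_at, ttl_seconds):
    """Instant at which a column written at written_at stops being live."""
    return written_at + timedelta(seconds=ttl_seconds)
```

Here `ttl_for_days(31)` gives 2678400, matching the value in the message.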

Re: RE Ordering counters in Cassandra

2012-05-22 Thread Filippo Diotalevi
Thanks for all the answers, they definitely helped. Just out of curiosity, is there any underlying architectural reason why it's not possible to order a row based on its counter values? Or is it something that might be on the roadmap in the future? -- Filippo Diotalevi On Tuesday, 22 M

Re: Astyanax Error

2012-05-22 Thread samal
Host not found in client. On 22-May-2012 4:34 PM, "Abhijit Chanda" wrote: > Hi All, > > Can any one suggest me why i am getting this error in Astyanax > NoAvailableHostsException: [host=None(0.0.0.0):0, latency=0(0), > attempts=0] No hosts to borrow from > > > Thanks In Advance > Abhijit >

Re: Number of keyspaces

2012-05-22 Thread samal
Not ideally. Cassandra now has global memtable tuning. Each CF corresponds to memory in RAM. A year-wise CF means it will be in a read-only state for the next year, but its memtable will still consume RAM. On 22-May-2012 5:01 PM, "Franc Carter" wrote: > On Tue, May 22, 2012 at 9:19 PM, aaron morton wrote: > >> It's

Re: Number of keyspaces

2012-05-22 Thread Franc Carter
On Tue, May 22, 2012 at 9:19 PM, aaron morton wrote: > It's more the number of CF's than keyspaces. > Oh - does increasing the number of Column Families affect performance ? The design we are working on at the moment is considering using a Column Family per year. We were thinking this would isol

Re: Number of keyspaces

2012-05-22 Thread R. Verlangen
Hmm, you got me on that. I assumed (~ wrong) that more keyspaces would mean more CF's. 2012/5/22 aaron morton > It's more the number of CF's than keyspaces. > > Cheers > > - > Aaron Morton > Freelance Developer > @aaronmorton > http://www.thelastpickle.com > > On 22/05/2012, at 6

Re: Number of keyspaces

2012-05-22 Thread aaron morton
It's more the number of CF's than keyspaces. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 22/05/2012, at 6:58 PM, R. Verlangen wrote: > Yes, it does. However there's no real answer what's the limit: it depends on > your hardware and clu

Re: how can we get (a lot) more performance from cassandra

2012-05-22 Thread aaron morton
> I would look into the problems you are having with GC... When ParNew runs the jvm pauses https://blogs.oracle.com/jonthecollector/entry/our_collectors . If it's pausing for 4 seconds it's not processing queries. > Then check the throughput on the san and the steal on the VM's. Check to se

Cassandra 0.8.5: Column name mystery in create column family command

2012-05-22 Thread Roshan Dawrani
Hi, I use Cassandra 0.8.5 and am suddenly noticing some strange behavior. I run a "create column family" command with some column meta-data and it runs fine, but when I do "describe keyspace", it shows me different column names for those index columns. a) Here is what I run: "create column family

Re: 1.1 not removing commit log files?

2012-05-22 Thread aaron morton
4096 is also the internal hard-coded default for commitlog_total_space_in_mb. If you are seeing more than 4GB of commit log files let us know. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 22/05/2012, at 6:35 AM, Bryce Godfrey wrote: > T

Re: Ordering counters in Cassandra

2012-05-22 Thread aaron morton
> What's the best approach to perform this task? Get the columns in slices of 100 or so and order them on the client. Then write a new row that is a pivot, so the column name is the aggregate count and the column value is the old column name. To slice the row, make the first call with no start_col
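
The approach described here (read the row in slices, order on the client, then build a pivot keyed by count) can be sketched against an in-memory dictionary. `get_slice` below is a hypothetical stand-in for a real slice call, and the paging convention, where each later page starts at the last column already seen and drops the repeated column, is an assumption about how the truncated instructions continue.

```python
def get_slice(row, start, count):
    """Hypothetical stand-in for a slice call.

    Returns up to count (name, value) pairs in column-name order, beginning at
    start (inclusive); an empty start means "from the beginning of the row".
    """
    names = sorted(n for n in row if n >= start)
    return [(n, row[n]) for n in names[:count]]


def read_all_columns(row, page_size=100):
    """Page through a row, one slice at a time."""
    if page_size < 2:
        raise ValueError("page_size must be at least 2 when the start is inclusive")
    columns, start = [], ""
    while True:
        page = get_slice(row, start, page_size)
        # After the first page, the first column repeats the previous page's last one.
        if columns and page and page[0][0] == columns[-1][0]:
            page = page[1:]
        if not page:
            return columns
        columns.extend(page)
        start = columns[-1][0]


def pivot_by_count(columns):
    """Order on the client: (count, original column name), highest count first."""
    return sorted(((value, name) for name, value in columns), reverse=True)
```

The pairs returned by `pivot_by_count` are what would be written to the pivot row, with the count as the column name and the old column name as the value.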

Re: Tuning cassandra (compactions overall)

2012-05-22 Thread aaron morton
not sure what you mean by > And after restarting the second one I have lost all the consistency of > my data. All my statistics since September are totally false now in > production Can you give some examples? Counters are not idempotent, so if the client app retries TimedOut requests you can get
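
The point about retries can be shown with a toy model. In the sketch below the increment is applied on the server side but the first reply is lost, so the client sees a timeout and retries, and the counter ends up counting the same logical event twice. `FlakyCounter` and `increment_with_retry` are invented for illustration and do not correspond to any client library.

```python
class TimedOut(Exception):
    """Raised when the client does not receive a reply in time."""


class FlakyCounter:
    """Toy counter: the write is applied, but the first reply is lost."""

    def __init__(self):
        self.value = 0
        self._fail_next = True

    def increment(self, delta):
        self.value += delta  # the write is applied either way
        if self._fail_next:
            self._fail_next = False
            raise TimedOut("reply lost after the write was applied")


def increment_with_retry(counter, delta, retries=1):
    """Naive client behaviour: retry the same increment after a timeout."""
    for attempt in range(retries + 1):
        try:
            counter.increment(delta)
            return
        except TimedOut:
            if attempt == retries:
                raise
```

After one call to `increment_with_retry(counter, 1)` on a fresh `FlakyCounter`, the value is 2 rather than 1, which is the overcount that a non-idempotent operation produces when it is retried.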

Re: endless hinted handoff with 1.1

2012-05-22 Thread aaron morton
Kind of like https://issues.apache.org/jira/browse/CASSANDRA-3733 but maybe different. Have you recently dropped a CF? It looks like the hints CF is only compacted if hints are replayed. If they are dropped because the CF no longer exists compaction is not forced ( https://github.com/apac

Re: Exception when truncate

2012-05-22 Thread aaron morton
The first part of the name is the current system time in milliseconds. If you run it twice do you get log messages about failing to create the same directory twice ? Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 21/05/2012, at 5:09 AM,

Re: exception when cleaning up...

2012-05-22 Thread aaron morton
CASSANDRA-3712 has not been applied to 0.8.X. If I understand the problem correctly the issue is 0.8.10. You may be able to avoid the race condition by: 1) Isolating the node from the cluster to stop write activity. You can either start the node with the -Dcassandra.join_ring=false JVM opti

Re: Does Cassandra support parallel query processing?

2012-05-22 Thread aaron morton
In general read queries run on multiple nodes. But each node computes the complete result to the query. There is no support for aggregate queries. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 20/05/2012, at 6:49 PM, Majid Azimi wrote:

Re: cassandra read latency help

2012-05-22 Thread aaron morton
With > heap size = 4 gigs I would check for GC activity in the logs and consider setting it to 8 given you have 16 GB. You can also check if the IO system is saturated (http://spyced.blogspot.co.nz/2010/01/linux-performance-basics.html) Also take a look at nodetool cfhistograms perhaps to see

Re: unable to nodetool to remote EC2

2012-05-22 Thread ramesh
On 05/22/2012 12:45 AM, Tamar Fraenkel wrote: Thanks for the response. But it still does not work. I am running the script from a git bash on my windows 7. adding some debug prints, this is what I am running

Re: Repair Process Taking too long

2012-05-22 Thread aaron morton
It repairs the ranges they have in common. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 20/05/2012, at 4:05 PM, Raj N wrote: > Can I infer from this that if I have 3 replicas, then running repair without > -pr on 1 node will repair th

Re: Data aggregation - averages, sums, etc.

2012-05-22 Thread aaron morton
Continuous computation is the sort of thing Storm (https://github.com/nathanmarz/storm) can help with. And good news everybody, storing the output from Storm is the sort of thing Cassandra can help with http://www.youtube.com/watch?v=cF8a_FZwULI Cheers - Aaron Morton Freelance

Re: nodetool repair taking forever

2012-05-22 Thread aaron morton
> I also dont understand if all these nodes are replicas of each other why is > that the first node has almost double the data. Have you performed any token moves ? Old data is not deleted unless you run nodetool cleanup. Another possibility is things like a lot of hints. Admittedly it would hav

Re: RE Ordering counters in Cassandra

2012-05-22 Thread Romain HARDOUIN
I mean iterate over each column -- more precisely: *bunches of columns* using slices -- and write new columns in the inverted index. Tamar's data model is made for real time analysis. It's maybe overdesigned for a daily ranking. I agree with Samal, you should split your data across the space of to

Re: RE Ordering counters in Cassandra

2012-05-22 Thread samal
In some cases Cassandra is really good and in some cases it is not. The way I see your approach, you are recording all of your events in a single "key", is that right? Not recommended. It can get really big. Also, if you have a cluster of servers, "It will hit only one server all the time make it overwhelm, a
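
One common way to avoid sending every write to a single row key is to spread the data over several keys. The sketch below derives a bucket from an item identifier and appends it to a base key, so writes are distributed over a fixed set of rows that may then land on different nodes. The key format, the bucket count, and the use of CRC32 are all assumptions for illustration, not something proposed in the thread.

```python
import zlib


def bucketed_key(base_key, item_id, buckets=16):
    """Map an item to one of a fixed number of row keys."""
    bucket = zlib.crc32(item_id.encode("utf-8")) % buckets
    return f"{base_key}:{bucket:02d}"


def all_bucket_keys(base_key, buckets=16):
    """Every row key a reader has to visit to see the full data set."""
    return [f"{base_key}:{b:02d}" for b in range(buckets)]
```

The trade-off is on the read side: a reader has to fetch every key from `all_bucket_keys` and merge the results on the client.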