Re: Exception in Hadoop Word Count sample

2011-09-14 Thread Tharindu Mathew
Found it. 'ant artifacts' On Thu, Sep 15, 2011 at 12:02 PM, Tharindu Mathew wrote: > Yes. That's the problem. Thanks Jonathan. > > I'm actually using trunk against a 0.7. How can I generate the distro in > trunk? > > Forgive my ignorance, I'm more used to maven. > > > On Thu, Sep 15, 2011 at 1:08

Re: Exception in Hadoop Word Count sample

2011-09-14 Thread Tharindu Mathew
Yes. That's the problem. Thanks Jonathan. I'm actually using trunk against a 0.7. How can I generate the distro in trunk? Forgive my ignorance, I'm more used to maven. On Thu, Sep 15, 2011 at 1:08 AM, Jonathan Ellis wrote: > You're using a 0.8 wordcount against a 0.7 Cassandra? > > On Wed, Sep

Re: Configuring the keyspace correctly - NTS

2011-09-14 Thread Anthony Ikeda
Aaron, when using the RackInferringSnitch, is the octet taken from the rpc_address or the listen_address? I just noticed that when I tried to configure this locally on my laptop I had to use "0" (127.0.0.1) instead of "160" (192.160.202.235). Anthony On Wed, Sep 14, 2011 at 3:15 PM, aaron morton wro
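
The "0" versus "160" difference above follows from how RackInferringSnitch names things: it takes the data centre from the second octet of a node's IPv4 address and the rack from the third. A minimal Python sketch of that rule follows. The function name infer_dc_and_rack is hypothetical, and the sketch takes the address as an argument, so it does not settle whether that address comes from rpc_address or listen_address.

```python
import ipaddress
from typing import Tuple


def infer_dc_and_rack(address: str) -> Tuple[str, str]:
    """Return (data centre, rack) names the way RackInferringSnitch derives them.

    Hypothetical helper for illustration: the data centre is the second
    octet of the IPv4 address and the rack is the third octet.
    """
    # IPv4Address validates the input and normalises it to dotted-quad form.
    octets = str(ipaddress.IPv4Address(address)).split(".")
    return octets[1], octets[2]


if __name__ == "__main__":
    for addr in ("127.0.0.1", "192.160.202.235"):
        dc, rack = infer_dc_and_rack(addr)
        print(f"{addr}: data centre {dc}, rack {rack}")
```

With the two addresses from the message, the loopback address maps to data centre "0" and the LAN address to data centre "160", which matches what Anthony observed.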

Re: Nodetool removetoken taking days to run.

2011-09-14 Thread Brandon Williams
On Wed, Sep 14, 2011 at 4:25 PM, Ryan Hadley wrote: > Hi Brandon, > > Thanks for the reply. Quick question though: > > 1. We write all data to this ring with a TTL of 30 days > 2. This node hasn't been in the ring for at least 90 days, more like 120 days > since it's been in the ring. > > So, if

Re: Configuring the keyspace correctly - NTS

2011-09-14 Thread Anthony Ikeda
Great, that makes perfect sense. I apologise for not getting this right; it seems I'm doing someone else's job here. Anthony On Wed, Sep 14, 2011 at 3:15 PM, aaron morton wrote: > The strategy_options for NTS accept the data centre name and the rf, > [{<dc_name> : <rf>}] > > Where the DC name comes from the s

Re: Get CL ONE / NTS

2011-09-14 Thread aaron morton
> Are you advising CL.ONE is not worth the game when considering > read performance? Consistency is not performance, it's a whole new thing to tune in your application. If you have performance issues, deal with those as performance issues: better code / data model / hardware. > By the way, I

Re: Configuring the keyspace correctly - NTS

2011-09-14 Thread aaron morton
The strategy_options for NTS accept the data centre name and the rf, [{<dc_name> : <rf>}] Where the DC name comes from the snitch, so… SimpleSnitch (gotta love this guy, in there day in day out putting in the hard yards) puts all the nodes in "datacenter1", which is why that's in the defaults. RackInferring

RE: Get CL ONE / NTS

2011-09-14 Thread Pierre Chalamet
Thanks Aaron, I didn't see your answer before mine. I do agree on 2/: I might get a read error. Good suggestion to use EACH_QUORUM - it could be a good trade-off to read at this level if ONE fails. Maybe using LOCAL_QUORUM would be a good answer and would avoid headaches after all. Are you advising

RE: Get CL ONE / NTS

2011-09-14 Thread Pierre Chalamet
After reading the Cassandra source code, I will try to answer myself. It's a good exercise :) >1/ Will I have an error because DC2 does not have any copy of the data ? I've not been able to find how endpoints are determined for the read request, but I guess endpoints are just coming from the cu

Re: Get CL ONE / NTS

2011-09-14 Thread aaron morton
Your current approach to Consistency opens the door to some inconsistent behavior. > 1/ Will I have an error because DC2 does not have any copy of the data ? If you read from DC2 at CL ONE and the data is not replicated it will not be returned. > 2/ Will Cassandra try to get the data from DC

Configuring the keyspace correctly - NTS

2011-09-14 Thread Anthony Ikeda
Okay, in a previous post, it was stated that I could use a NetworkTopologyStrategy in a single data centre by setting up my keyspace with: create keyspace KeyspaceDEV with placement_strategy = 'org.apache.cassandra.locator.NetworkTopologyStrategy' and strategy_options=[{datacenter1:3}];
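
To make the shape of that statement easier to see, here is a small Python sketch that renders it from a keyspace name and a mapping of data centre name to replication factor. The helper name create_keyspace_statement is hypothetical, and the comma-separated form for more than one data centre is an assumption; only the single-data-centre form appears in the message above.

```python
from typing import Dict


def create_keyspace_statement(name: str, replicas_per_dc: Dict[str, int]) -> str:
    """Render a cassandra-cli 'create keyspace' statement for NTS.

    Hypothetical helper for illustration. The multi-data-centre form
    (entries joined by commas) is an assumption, not taken from the thread.
    """
    if not replicas_per_dc:
        raise ValueError("at least one data centre is required")
    for dc, rf in replicas_per_dc.items():
        if rf < 1:
            raise ValueError(f"replication factor for {dc} must be >= 1")
    # Each entry is written as <dc_name>:<rf>, e.g. datacenter1:3.
    options = ",".join(f"{dc}:{rf}" for dc, rf in replicas_per_dc.items())
    return (
        f"create keyspace {name} with placement_strategy = "
        "'org.apache.cassandra.locator.NetworkTopologyStrategy' "
        f"and strategy_options=[{{{options}}}];"
    )


if __name__ == "__main__":
    print(create_keyspace_statement("KeyspaceDEV", {"datacenter1": 3}))
```

The data centre names passed in must match whatever the configured snitch reports, otherwise no replicas will be placed for that entry.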

Re: Nodetool removetoken taking days to run.

2011-09-14 Thread Ryan Hadley
On Sep 14, 2011, at 2:08 PM, Brandon Williams wrote: > On Wed, Sep 14, 2011 at 8:54 AM, Ryan Hadley wrote: >> Hi, >> >> So, here's the backstory: >> >> We were running Cassandra 0.7.4 and at one point in time had a node in the >> ring at 10.84.73.18. We removed this node from the ring success

Re: Index search in provided list of rows (list of rowKeys).

2011-09-14 Thread aaron morton
The way to specify more restrictions to the query is to specify them in the index_clause. The index clause is applied to the set of all rows in the database, not a subset; applying them to a subset is implicitly supporting a subquery. Currently it's doing "select then project", this would be "s

Re: what's the difference between repair CF separately and repair the entire node?

2011-09-14 Thread Peter Schuller
>> It is a serious issue if you really need to repair one CF at a time. > > Why is it serious to repair one CF at a time? If I cannot do that at a > CF level, then does it mean that I cannot use more than 50% disk space? Is > this specific to this problem or is that a general statement? I a

Re: Nodetool removetoken taking days to run.

2011-09-14 Thread Brandon Williams
On Wed, Sep 14, 2011 at 8:54 AM, Ryan Hadley wrote: > Hi, > > So, here's the backstory: > > We were running Cassandra 0.7.4 and at one point in time had a node in the > ring at 10.84.73.18. We removed this node from the ring successfully in > 0.7.4. It stopped showing in the nodetool ring comman

Re: selective replication

2011-09-14 Thread Adrian Cockcroft
This has been proposed a few times, there are some good use cases for it, and there is no current mechanism for it, but it's been discussed as a possible enhancement. Adrian On Wed, Sep 14, 2011 at 11:06 AM, Todd Burruss wrote: > Has anyone done any work on what I'll call "selective replication"

Re: Exception in Hadoop Word Count sample

2011-09-14 Thread Jonathan Ellis
You're using a 0.8 wordcount against a 0.7 Cassandra? On Wed, Sep 14, 2011 at 2:19 PM, Tharindu Mathew wrote: > I see $subject. Can anyone help me to rectify this? > Stacktrace: > Exception in thread "main" org.apache.thrift.TApplicationException: Required > field 'replication_factor' was not fou

selective replication

2011-09-14 Thread Todd Burruss
Has anyone done any work on what I'll call "selective replication" between DCs? I want to use Cassandra to replicate data to another virtual DC (for analytical purposes), but only "inserts", not "deletes". Picture having two data centers, DC1 for OLTP of short lived data (say 90 day window) an

Re: what's the difference between repair CF separately and repair the entire node?

2011-09-14 Thread Anand Somani
On Tue, Sep 13, 2011 at 3:57 PM, Peter Schuller wrote: > > I think it is a serious problem since I can not "repair". I am > > using cassandra on production servers. Is there some way to fix it > > without upgrading? I heard that 0.8.x is still not quite ready for > > production environments.

Re: Error in upgrading cassandra to 0.8.5

2011-09-14 Thread Jonathan Ellis
Added to NEWS: - After upgrading, run nodetool scrub against each node before running repair, moving nodes, or adding new ones. 2011/9/14 Jonas Borgström : > On 09/13/2011 05:21 PM, Jonathan Ellis wrote: >> More or less.  NEWS.txt explains upgrade procedure in more detail. > > When mov

Nodetool removetoken taking days to run.

2011-09-14 Thread Ryan Hadley
Hi, So, here's the backstory: We were running Cassandra 0.7.4 and at one point in time had a node in the ring at 10.84.73.18. We removed this node from the ring successfully in 0.7.4. It stopped showing in the nodetool ring command. But occasionally we'd still get weird log entries about faili

Get CL ONE / NTS

2011-09-14 Thread Pierre Chalamet
Hello, I have 2 datacenters. Cassandra is configured as follows: - RackInferringSnitch - NetworkTopologyStrategy for CF - strategy_options: DC1:3 DC2:3 Data are written using CL LOCAL_QUORUM so data written from one datacenter will eventually be replicated to the other datacenter. Data is always
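
For the configuration above, it may help to spell out how many replicas each quorum-style consistency level has to hear from. A quorum over N replicas is floor(N / 2) + 1. The Python sketch below applies that to DC1:3 DC2:3; the function names are hypothetical and the sketch only does the arithmetic, it does not talk to a cluster.

```python
from typing import Dict


def quorum(replicas: int) -> int:
    """Smallest majority of the given number of replicas."""
    return replicas // 2 + 1


def required_acks(replicas_per_dc: Dict[str, int], local_dc: str) -> Dict[str, object]:
    """Replica responses needed per consistency level (illustrative only)."""
    return {
        "ONE": 1,
        # LOCAL_QUORUM counts only replicas in the coordinator's data centre.
        "LOCAL_QUORUM": quorum(replicas_per_dc[local_dc]),
        # QUORUM counts replicas across all data centres together.
        "QUORUM": quorum(sum(replicas_per_dc.values())),
        # EACH_QUORUM needs a quorum in every data centre.
        "EACH_QUORUM": {dc: quorum(rf) for dc, rf in replicas_per_dc.items()},
    }


if __name__ == "__main__":
    for level, acks in required_acks({"DC1": 3, "DC2": 3}, "DC1").items():
        print(level, acks)
```

With three replicas per data centre, LOCAL_QUORUM needs two local replicas, so a write at LOCAL_QUORUM says nothing about how many replicas in the other data centre have the data yet.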

Re: Index search in provided list of rows (list of rowKeys).

2011-09-14 Thread Evgeniy Ryabitskiy
Why is it radical? It will be the same get_indexed_slices search but in a specified set of rows. So mostly it will be one more search expression over row IDs, not only column values. Usually the more restrictions you can specify in a search query, the faster the search can be (not slower at least). About

Re: Error in upgrading cassandra to 0.8.5

2011-09-14 Thread Jonas Borgström
On 09/13/2011 05:21 PM, Jonathan Ellis wrote: > More or less. NEWS.txt explains upgrade procedure in more detail. When moving from 0.7.x to 0.8.5 do I need to scrub all sstables post upgrade? NEWS.txt doesn't mention anything about that but your comment here seems to indicate so: https://issues

Re: segment fault with 0.8.5

2011-09-14 Thread Jonathan Ellis
That's a pretty old JDK. You should upgrade. On Wed, Sep 14, 2011 at 5:13 AM, Yan Chunlu wrote: > just tried cassandra 0.8.5 binary version, and got Segment fault > > I am using Sun JDK so this is not CASSANDRA-2441 > > OS is Debian 5.0 > > java -version > > java version "1.6.0_04" > > Java(

Re: Cassandra cluster on ec2 and ebs volumes

2011-09-14 Thread Jonathan Ellis
[moving to user@] On Wed, Sep 14, 2011 at 6:22 AM, Giannis Neokleous wrote: > Hello, > > We currently have a cluster running on ec2 and all of the data are on > the instance disks. We also have some old data which are now constant > that we want to serve off from a different cluster still running

Re: segment fault with 0.8.5

2011-09-14 Thread Roshan Dawrani
On Wed, Sep 14, 2011 at 3:43 PM, Yan Chunlu wrote: > I also found that the format of configuration file "cassandra.yaml" is > different, are they compatible? > Format of 0.8.5 cassandra.yaml is different from what? You didn't mention what you are comparing it to. I recently did migration of a simple

segment fault with 0.8.5

2011-09-14 Thread Yan Chunlu
just tried cassandra 0.8.5 binary version, and got Segment fault I am using Sun JDK so this is not CASSANDRA-2441 OS is Debian 5.0 java -version java version "1.6.0_04" Java(TM) SE Runtime Environment (build 1.6.0_04-b12) Java HotSpot(TM) Server VM (build 10.0-b19, mixed mode) uname -

Re: what's the difference between repair CF separately and repair the entire node?

2011-09-14 Thread Yan Chunlu
Thanks a lot for the help! I have read the post and think 0.8 might be good enough for me, especially 0.8.5. Also, changing gc_grace_seconds is an acceptable solution. On Wed, Sep 14, 2011 at 4:03 PM, Sylvain Lebresne wrote: > On Wed, Sep 14, 2011 at 9:27 AM, Yan Chunlu wrote: > > is 0.8 ready

Re: what's the difference between repair CF separately and repair the entire node?

2011-09-14 Thread Sasha Dolgy
It was mentioned in another thread that Twitter uses 0.8 in production; for me that was a fairly strong testimonial... On Sep 14, 2011 9:28 AM, "Yan Chunlu" wrote: > is 0.8 ready for production use? as I know currently many companies > including reddit.com are using 0.7, how do they get rid of

Re: what's the difference between repair CF separately and repair the entire node?

2011-09-14 Thread Sylvain Lebresne
On Wed, Sep 14, 2011 at 9:27 AM, Yan Chunlu wrote: > is 0.8 ready for production use? some related discussion here: http://www.mail-archive.com/user@cassandra.apache.org/msg17055.html but my personal answer is yes. > as I know currently many companies including reddit.com are using 0.7, how > d

Re: what's the difference between repair CF separately and repair the entire node?

2011-09-14 Thread Yan Chunlu
Is 0.8 ready for production use? As far as I know, many companies including reddit.com are currently using 0.7; how do they get rid of the repair problem? On Wed, Sep 14, 2011 at 2:47 PM, Sylvain Lebresne wrote: > On Wed, Sep 14, 2011 at 2:38 AM, Yan Chunlu wrote: > > me neither don't want to repair