Re: Repair Process Taking too long

2012-04-11 Thread Frank Ng
Thank you for confirming that the per node data size is most likely causing the long repair process. I have tried a repair on smaller column families and it was significantly faster. On Wed, Apr 11, 2012 at 9:55 PM, aaron morton wrote: > If you have 1TB of data it will take a long time to repair

Re: Repair Process Taking too long

2012-04-11 Thread aaron morton
If you have 1TB of data it will take a long time to repair. Every bit of data has to be read and a hash generated. This is one of the reasons we often suggest that around 300 to 400GB per node is a good load in the general case. Look at nodetool compactionstats. Is there a validation compaction

Re: Why so many SSTables?

2012-04-11 Thread Watanabe Maki
If you increase sstable_size_in_mb to 200MB, you will need more IO for each compaction. For example, if your memtable is flushed and LCS needs to compact it with 10 overlapping L1 sstables, you will need almost 2GB of reads and 2GB of writes for that single compaction. From iPhone On 2012/04/11,
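The arithmetic in the message above can be sketched as follows. This is only a rough estimate: it assumes every overlapping L1 sstable is about sstable_size_in_mb in size and ignores the size of the flushed memtable itself. The function name is hypothetical and not part of Cassandra.

```python
def lcs_compaction_io_mb(sstable_size_mb, overlapping_sstables):
    """Rough data volume (MB) read, and again written, by one compaction.

    Assumption: each overlapping sstable is roughly sstable_size_mb in size,
    and the newly flushed data is small enough to ignore.
    """
    return sstable_size_mb * overlapping_sstables


# Figures from the message: 200MB sstables, 10 overlapping L1 sstables.
io_mb = lcs_compaction_io_mb(200, 10)
print(f"about {io_mb} MB read and about {io_mb} MB written")
```

With the figures from the message this gives 2000 MB in each direction, which matches the "almost 2GB" estimate.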

Re: Initial token - newbie question (version 1.0.8)

2012-04-11 Thread Watanabe Maki
auto_bootstrap parameter has been removed and always enabled since 1.0. maki On 2012/04/12, at 6:10, Paolo Bernardi wrote: > I think that setting auto_bootstrap = true or false into cassandra.yaml is > enough (if it isn't there already just add it, for example, after > initial_token) > > Pa

Re: Why so many SSTables?

2012-04-11 Thread Ben Coverston
> In general I would limit the data load per node to 300 to 400GB. Otherwise > things can be painful when it comes time to run compaction / repair / move. +1 on more nodes of moderate size

RE: Initial token - newbie question (version 1.0.8)

2012-04-11 Thread Paolo Bernardi
I think that setting auto_bootstrap = true or false into cassandra.yaml is enough (if it isn't there already just add it, for example, after initial_token) Paolo On Apr 11, 2012 10:34 PM, "Jay Parashar" wrote: > Thanks a lot Jeremiah. > Also would you be able to tell me where to configure the a

RE: Initial token - newbie question (version 1.0.8)

2012-04-11 Thread Jay Parashar
Thanks a lot Jeremiah. Also would you be able to tell me where to configure the auto_bootstrap parameter in version 1.0.8? Thanks Jay -Original Message- From: Jeremiah Jordan [mailto:jeremiah.jor...@morningstar.com] Sent: Wednesday, April 11, 2012 3:03 PM To: user@cassandra.apache.org S

RE: Initial token - newbie question (version 1.0.8)

2012-04-11 Thread Jeremiah Jordan
You have to use nodetool move to change the token after the node has started the first time. The value in the config file is only used on first startup. Unless you were using RF=3 on your 3 node ring, you can't just start with a new token without using nodetool. You have to do move so that the
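As an illustration of the balanced tokens this thread is about, here is a minimal sketch that computes evenly spaced tokens. It assumes the RandomPartitioner, whose token space runs from 0 to 2**127; the function and constant names are hypothetical.

```python
# Assumption: RandomPartitioner, whose tokens range from 0 to 2**127.
RING_SIZE = 2 ** 127


def balanced_tokens(node_count):
    """Return evenly spaced tokens, one per node, starting at 0."""
    return [i * RING_SIZE // node_count for i in range(node_count)]


# Tokens for the 3 node ring described in the thread.
for index, token in enumerate(balanced_tokens(3)):
    print(f"node {index}: {token}")
```

On a node that has already started once, each value would be applied with nodetool move rather than by editing initial_token, as the message above explains.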

Re: Why so many SSTables?

2012-04-11 Thread aaron morton
In general I would limit the data load per node to 300 to 400GB. Otherwise things can be painful when it comes time to run compaction / repair / move. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 12/04/2012, at 1:00 AM, Dave Brosius wrote

Re: need of regular nodetool repair

2012-04-11 Thread aaron morton
HH in 1.X+ is very good, but it is still an optimisation for achieving consistency. >> So I expect that even if I lose some HH then some other replica will reply >> with data. Is it correct? Run a repair and see. Cheers - Aaron Morton Freelance Developer @aaronmorton http:/

Re: Materialized Views or Index CF - data model question

2012-04-11 Thread aaron morton
> a) "These queries are not easily supported on standard Cassandra" > select * from book where price < 992 order by price descending limit 30; > > This is a typical (time series data)timeline query well supported by > Cassandra, from my understanding. Queries that use a secondary index (on pric

Re: insert in cql

2012-04-11 Thread puneet loya
thank you :) On Wed, Apr 11, 2012 at 8:55 PM, Eric Evans wrote: > On Wed, Apr 11, 2012 at 5:20 AM, puneet loya wrote: > > insert into users (KEY) values (512313); > > > > users is my column family and key is its only attribute.. > > > > It is giving an error > > bad request : line 1:24 required

Re: Issue with SStable loader.

2012-04-11 Thread aaron morton
See this post for info on how the table loader is configured http://www.datastax.com/dev/blog/bulk-loading Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 11/04/2012, at 6:07 AM, aaron morton wrote: > Did you update the config for sstablel

Initial token - newbie question (version 1.0.8)

2012-04-11 Thread Jay Parashar
I created a 3 node ring with the initial_token blank. As expected, Cassandra generated its own tokens on startup (e.g. tokens X, Y and Z). The nodes of course were not properly balanced, so I did the following steps 1) stopped all the 3 nodes 2) assigned initial_tokens (A,

Re: Trouble with wrong data

2012-04-11 Thread aaron morton
> However after recovering from this issue (freeing some space and fixing the > value of "commitlog_total_space_in_mb" in cassandra.yaml) Did the commit log grow larger than commitlog_total_space_in_mb ? > I realized that all statistics were all destroyed. I have bad values on every > single c

Re: json2sstable error: Can not write to the Standard columns Super Column Family

2012-04-11 Thread aaron morton
Your json is for a standard CF but you are trying to load it into a super CF. There is a dedicated bulk loader interface you may find useful http://www.datastax.com/dev/blog/bulk-loading Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 11/

Re: Cassandra running out of memory?

2012-04-11 Thread aaron morton
> 'system_memory_in_mb' (3760) and the 'system_cpu_cores' (1) according to our > nodes' specification. We also changed the 'MAX_HEAP_SIZE' to 2G and the > 'HEAP_NEWSIZE' to 200M (we think the second is related to the Garbage > Collection). It's best to leave the default settings unless you know

Re: cassandra 0.8.7 + hector 0.8.3: All Quorum reads result in writes?

2012-04-11 Thread Thibaut Britz
I will disable read repair for slice requests fully (we can handle those on the application side) until we upgrade to 1.0.8. Thanks, Thibaut On Wed, Apr 11, 2012 at 7:04 PM, Jeremy Hanna wrote: > I backported this to 0.8.4 and it didn't fix the problem we were seeing > (as I outlined in my para

Re: cassandra 0.8.7 + hector 0.8.3: All Quorum reads result in writes?

2012-04-11 Thread Jeremy Hanna
I backported this to 0.8.4 and it didn't fix the problem we were seeing (as I outlined in my parallel post) but if it fixes it for you, then beautiful. Just wanted to let you know our experience with similar symptoms. On Apr 11, 2012, at 11:56 AM, Thibaut Britz wrote: > Fixed in https://issue

Re: cassandra 0.8.7 + hector 0.8.3: All Quorum reads result in writes?

2012-04-11 Thread Thibaut Britz
Fixed in https://issues.apache.org/jira/browse/CASSANDRA-3843 On Wed, Apr 11, 2012 at 5:58 PM, Thibaut Britz < thibaut.br...@trendiction.com> wrote: > We have read repair disabled (0.0). > > Even if this would be the case, this also doesn't explain why the writes > are executed again and again

Re: cassandra 0.8.7 + hector 0.8.3: All Quorum reads result in writes?

2012-04-11 Thread Jeremy Hanna
fwiw - we had a similar problem reading at quorum with 0.8.4 when reading with hadoop. The symptom we see is when reading a column family with hadoop using quorum using 0.8.4, we have lots of minor compactions as a result of heavy writes. When we read at CL.ONE or move to 1.0.8 the problem is

RE: INserting data in Cassandra

2012-04-11 Thread Aliou SOW
Thanks :) But finally i used Hector and it works fine :D Date: Wed, 11 Apr 2012 17:19:15 +0200 From: berna...@gmail.com To: user@cassandra.apache.org Subject: Re: INserting data in Cassandra On 04/11/12 11:42, Aliou SOW wrote: And I u

Re: cassandra 0.8.7 + hector 0.8.3: All Quorum reads result in writes?

2012-04-11 Thread Thibaut Britz
We have read repair disabled (0.0). Even if this were the case, it also doesn't explain why the writes are executed again and again when going over the same range again and again. The keyspace is new, it doesn't contain any tombstones and only 1 keys. On Wed, Apr 11, 2012 at 5:52 PM

Re: cassandra 0.8.7 + hector 0.8.3: All Quorum reads result in writes?

2012-04-11 Thread R. Verlangen
Are you sure this isn't read-repair? http://wiki.apache.org/cassandra/ReadRepair 2012/4/11 Thibaut Britz > Also executing the same multiget rangeslice query over the same range > again will trigger the same writes again and again. > > On Wed, Apr 11, 2012 at 5:41 PM, Thibaut Britz < > thibaut.br

Re: cassandra 0.8.7 + hector 0.8.3: All Quorum reads result in writes?

2012-04-11 Thread Thibaut Britz
Also executing the same multiget rangeslice query over the same range again will trigger the same writes again and again. On Wed, Apr 11, 2012 at 5:41 PM, Thibaut Britz < thibaut.br...@trendiction.com> wrote: > Hi, > > I just diagnosed this strange behavior: > > When I fetch a rangeslice through

cassandra 0.8.7 + hector 0.8.3: All Quorum reads result in writes?

2012-04-11 Thread Thibaut Britz
Hi, I just diagnosed this strange behavior: When I fetch a rangeslice through hector and set the consistency level to quorum, according to cfstats (and also to the output files on the hd), cassandra seems to execute a write request for each read I execute. The write count in cfstats is increased

Re: insert in cql

2012-04-11 Thread Eric Evans
On Wed, Apr 11, 2012 at 5:20 AM, puneet loya wrote: > insert into users (KEY) values (512313); > > users is my column family and key is its only attribute.. > > It is giving an error > bad request : line 1:24 required (...)+ loop did not match anything at input > ')' > > do you find any error here

Re: INserting data in Cassandra

2012-04-11 Thread Paolo Bernardi
On 04/11/12 11:42, Aliou SOW wrote: And I used the tool json2sstable, but that does not work, I always have an error: java.lang.RuntimeException: Can't write Super columns to the Standard Column Family. So I have two questions: 1) What I did wrong, must I define the complete structure of m

Re: Repair Process Taking too long

2012-04-11 Thread Frank Ng
Can you expand further on your issue? Were you using RandomPartitioner? thanks On Tue, Apr 10, 2012 at 5:35 PM, David Leimbach wrote: > I had this happen when I had really poorly generated tokens for the ring. > Cassandra seems to accept numbers that are too big. You get hot spots > when you

RE: INserting data in Cassandra

2012-04-11 Thread Aliou SOW
Hello, Any help Or idea? Thanks From: aliouji...@hotmail.com To: user@cassandra.apache.org Subject: INserting data in Cassandra Date: Wed, 11 Apr 2012 09:42:52 + Hello all, We would like to adopt Cassandra solution for storing our biological data which are essentially microarray d

Re: Why so many SSTables?

2012-04-11 Thread Dave Brosius
It's easy to spend other people's money, but handling 1TB of data with a 1.5GB heap? Memory is cheap, and just a little more will solve many problems. On 04/11/2012 08:43 AM, Romain HARDOUIN wrote: Thank you for your answers. I originally posted this question because we encountered an OOM Excep

Re: Why so many SSTables?

2012-04-11 Thread Sylvain Lebresne
On Wed, Apr 11, 2012 at 2:43 PM, Romain HARDOUIN wrote: > > Thank you for your answers. > > I originally posted this question because we encountered an OOM Exception on 2 > nodes during a repair session. > Memory analysis shows a hotspot: an ArrayList of SSTableBoundedScanner > which contains as many

Re: Why so many SSTables?

2012-04-11 Thread Romain HARDOUIN
Thank you for your answers. I originally posted this question because we encountered an OOM Exception on 2 nodes during a repair session. Memory analysis shows a hotspot: an ArrayList of SSTableBoundedScanner which contains as many objects as there are SSTables on disk (7747 objects at the time). Thi

insert in cql

2012-04-11 Thread puneet loya
insert into users (KEY) values (512313); users is my column family and key is its only attribute.. It is giving an error bad request : line 1:24 required (...)+ loop did not match anything at input ')' do you find any error here?

Re: need of regular nodetool repair

2012-04-11 Thread Igor
On 04/11/2012 12:04 PM, ruslan usifov wrote: HH - this is hinted handoff? Yes 2012/4/11 Igor <i...@4friends.od.ua> On 04/11/2012 11:49 AM, R. Verlangen wrote: Not everything, just HH :) I hope this works for me for the next reasons: I have quite large RF (6 datacente

Re: need of regular nodetool repair

2012-04-11 Thread ruslan usifov
HH - this is hinted handoff? 2012/4/11 Igor > On 04/11/2012 11:49 AM, R. Verlangen wrote: > > Not everything, just HH :) > > I hope this works for me for the next reasons: I have quite large RF (6 > datacenters, each carry one replica of all dataset), read and write at CL > ONE, relatively smal

Re: need of regular nodetool repair

2012-04-11 Thread Igor
On 04/11/2012 11:49 AM, R. Verlangen wrote: Not everything, just HH :) I hope this works for me for the next reasons: I have quite large RF (6 datacenters, each carry one replica of all dataset), read and write at CL ONE, relatively small TTL - 10 days, I have no deletes, servers almost never

Re: need of regular nodetool repair

2012-04-11 Thread R. Verlangen
Well, if everything works 100% at any time there should be nothing to repair, however with a distributed cluster it would be pretty rare for that to occur. At least that is how I interpret this. 2012/4/11 Igor > BTW, I heard that we don't need to run repair if all your data have TTL, > all HH w

Re: need of regular nodetool repair

2012-04-11 Thread Igor
BTW, I heard that we don't need to run repair if all your data have TTL, all HH works, and you never delete your data. On 04/11/2012 11:34 AM, ruslan usifov wrote: Sorry fo my bad english, so QUORUM allow doesn't make repair regularity? But form your anser it does not follow 2012/4/11 R. Ve

Re: need of regular nodetool repair

2012-04-11 Thread ruslan usifov
Sorry for my bad English. So does QUORUM allow us to skip regular repairs? That does not follow from your answer. 2012/4/11 R. Verlangen > Yes, I personally have configured it to perform a repair once a week, as > the GCGraceSeconds is at 10 days. > > This is also what's in the manual > http://wi

Re: need of regular nodetool repair

2012-04-11 Thread R. Verlangen
Yes, I personally have configured it to perform a repair once a week, as the GCGraceSeconds is at 10 days. This is also what's in the manual http://wiki.apache.org/cassandra/Operations#Repairing_missing_or_inconsistent_data (point 2) 2012/4/11 ruslan usifov > Hello > > I have follow question, i
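The scheduling rule described above (repair more often than GCGraceSeconds) can be checked with a small sketch. The function name is hypothetical; the 7 day and 10 day figures are the ones quoted in the message.

```python
from datetime import timedelta


def repair_interval_is_safe(repair_interval, gc_grace):
    """True if repairs run more often than the GCGraceSeconds window."""
    return repair_interval < gc_grace


weekly_repair = timedelta(days=7)
gc_grace = timedelta(seconds=864000)  # 10 days, as in the message above

print(repair_interval_is_safe(weekly_repair, gc_grace))
```

A weekly repair leaves a three day margin inside a 10 day GCGraceSeconds, which gives some room for a repair run that starts late or takes a long time.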

need of regular nodetool repair

2012-04-11 Thread ruslan usifov
Hello, I have the following question: if we read and write to a Cassandra cluster with QUORUM consistency level, does this allow us to skip running nodetool repair regularly (i.e. every GCGraceSeconds)?