Thank you for confirming that the per node data size is most likely causing
the long repair process. I have tried a repair on smaller column families
and it was significantly faster.
On Wed, Apr 11, 2012 at 9:55 PM, aaron morton wrote:
> If you have 1TB of data it will take a long time to repair
If you have 1TB of data it will take a long time to repair. Every bit of data
has to be read and a hash generated. This is one of the reasons we often
suggest that around 300 to 400GB per node is a good load in the general case.
Look at nodetool compactionstats. Is there a validation compaction
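For example, a quick way to check (the host is a placeholder) is:

    nodetool -h localhost compactionstats

While a repair is running you should see a validation compaction listed there, along with its progress.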
If you increase sstable_size_in_mb to 200MB, you will need more IO for each
compaction. For example, when your memtable is flushed and LCS needs to
compact it with 10 overlapping L1 sstables, you will need almost 2GB of reads
and 2GB of writes for that single compaction.
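For reference, a sketch of how sstable_size_in_mb is set per column family from cassandra-cli (the column family name and size are placeholders, and the exact option syntax may vary slightly by version):

    update column family MyCF
      with compaction_strategy = 'LeveledCompactionStrategy'
      and compaction_strategy_options = {sstable_size_in_mb: 200};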
From iPhone
On 2012/04/11,
The auto_bootstrap parameter has been removed and is always enabled since 1.0.
maki
On 2012/04/12, at 6:10, Paolo Bernardi wrote:
> I think that setting auto_bootstrap = true or false into cassandra.yaml is
> enough (if it isn't there already just add it, for example, after
> initial_token)
>
> Pa
>In general I would limit the data load per node to 300 to 400GB. Otherwise
> things can get painful when it comes time to run compaction / repair / move.
+1 on more nodes of moderate size
I think that setting auto_bootstrap = true or false into cassandra.yaml is
enough (if it isn't there already just add it, for example, after
initial_token)
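A minimal sketch of the relevant cassandra.yaml lines (though, as noted elsewhere in this thread, from 1.0 the option was removed and bootstrapping is always enabled):

    initial_token:
    auto_bootstrap: true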
Paolo
On Apr 11, 2012 10:34 PM, "Jay Parashar" wrote:
> Thanks a lot Jeremiah.
> Also would you be able to tell me where to configure the a
Thanks a lot Jeremiah.
Also would you be able to tell me where to configure the auto_bootstrap
parameter in version 1.0.8?
Thanks
Jay
-Original Message-
From: Jeremiah Jordan [mailto:jeremiah.jor...@morningstar.com]
Sent: Wednesday, April 11, 2012 3:03 PM
To: user@cassandra.apache.org
S
You have to use nodetool move to change the token after the node has started
the first time. The value in the config file is only used on first startup.
Unless you were using RF=3 on your 3-node ring, you can't just start with a new
token without using nodetool. You have to do a move so that the
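As a rough sketch (host and token value are placeholders), the move is done per node with:

    nodetool -h <host> move <new_token>
    nodetool -h <host> cleanup   # afterwards, to remove data the node no longer owns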
In general I would limit the data load per node to 300 to 400GB. Otherwise
things can get painful when it comes time to run compaction / repair / move.
Cheers
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
On 12/04/2012, at 1:00 AM, Dave Brosius wrote
HH in 1.X+ is very good, but it is still an optimisation for achieving
consistency.
>> So I expect that even if I loose some HH then some other replica will reply
>> with data. Is it correct?
Run a repair and see.
Cheers
-
Aaron Morton
Freelance Developer
@aaronmorton
http:/
> a) "These queries are not easily supported on standard Cassandra"
> select * from book where price < 992 order by price descending limit 30;
>
> This is a typical (time series data) timeline query, well supported by
> Cassandra, from my understanding.
Queries that use a secondary index (on pric
thank you :)
On Wed, Apr 11, 2012 at 8:55 PM, Eric Evans wrote:
> On Wed, Apr 11, 2012 at 5:20 AM, puneet loya wrote:
> > insert into users (KEY) values (512313);
> >
> > users is my column family and key is its only attribute..
> >
> > It is giving an error
> > bad request : line 1:24 required
See this post for info on how the table loader is configured
http://www.datastax.com/dev/blog/bulk-loading
Cheers
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
On 11/04/2012, at 6:07 AM, aaron morton wrote:
> Did you update the config for sstablel
I created a 3-node ring with the initial_token left blank. Of course, as expected,
Cassandra generated its own tokens on startup (e.g. tokens X, Y and Z).
The nodes of course were not properly balanced, so I did the following steps:
1) stopped all the 3 nodes
2) assigned initial_tokens (A,
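For reference, a common sketch for computing evenly spaced RandomPartitioner tokens (here assuming a 3-node ring; the token space runs from 0 to 2**127 - 1):

    # print one initial_token per node, evenly spaced around the ring
    num_nodes = 3
    for i in range(num_nodes):
        print(i * (2**127 // num_nodes))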
> However after recovering from this issue (freeing some space and fixing the
> value of "commitlog_total_space_in_mb" in cassandra.yaml)
Did the commit log grow larger than commitlog_total_space_in_mb?
> I realized that all statistics were destroyed. I have bad values on every
> single c
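For reference, that setting lives in cassandra.yaml; a sketch with an illustrative (not necessarily recommended) value:

    commitlog_total_space_in_mb: 4096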
Your json is for a standard CF but you are trying to load it into a super CF.
There is a dedicated bulk loader interface you may find useful
http://www.datastax.com/dev/blog/bulk-loading
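For reference, the loader described there is sstableloader; a sketch of the invocation, where the directory is named after the keyspace and contains the sstables to stream (the path is a placeholder):

    bin/sstableloader /path/to/MyKeyspace/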
Cheers
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
On 11/
> 'system_memory_in_mb' (3760) and the 'system_cpu_cores' (1) according to our
> nodes' specification. We also changed the 'MAX_HEAP_SIZE' to 2G and the
> 'HEAP_NEWSIZE' to 200M (we think the second is related to the Garbage
> Collection).
It's best to leave the default settings unless you know
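For reference only (the advice above is to keep the defaults): the quoted values correspond to these lines in conf/cassandra-env.sh:

    MAX_HEAP_SIZE="2G"
    HEAP_NEWSIZE="200M"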
I will fully disable read repair for slice requests (we can handle those on
the application side) until we upgrade to 1.0.8.
Thanks,
Thibaut
On Wed, Apr 11, 2012 at 7:04 PM, Jeremy Hanna wrote:
> I backported this to 0.8.4 and it didn't fix the problem we were seeing
> (as I outlined in my para
I backported this to 0.8.4 and it didn't fix the problem we were seeing (as I
outlined in my parallel post) but if it fixes it for you, then beautiful. Just
wanted to let you know our experience with similar symptoms.
On Apr 11, 2012, at 11:56 AM, Thibaut Britz wrote:
> Fixed in https://issue
Fixed in https://issues.apache.org/jira/browse/CASSANDRA-3843
On Wed, Apr 11, 2012 at 5:58 PM, Thibaut Britz <
thibaut.br...@trendiction.com> wrote:
> We have read repair disabled (0.0).
>
> Even if this would be the case, this also doesn't explain why the writes
> are executed again and again
FWIW, we had a similar problem reading at quorum with 0.8.4 when reading with
Hadoop. The symptom we see is that when reading a column family with Hadoop at
quorum on 0.8.4, we have lots of minor compactions as a result of heavy
writes. When we read at CL.ONE or move to 1.0.8 the problem is
Thanks :)
But finally I used Hector and it works fine :D
Date: Wed, 11 Apr 2012 17:19:15 +0200
From: berna...@gmail.com
To: user@cassandra.apache.org
Subject: Re: INserting data in Cassandra
On 04/11/12 11:42, Aliou SOW wrote:
And I u
We have read repair disabled (0.0).
Even if this were the case, it still wouldn't explain why the writes are
executed again and again when going over the same range repeatedly.
The keyspace is new; it doesn't contain any tombstones and has only 1 key.
On Wed, Apr 11, 2012 at 5:52 PM
Are you sure this isn't read-repair?
http://wiki.apache.org/cassandra/ReadRepair
2012/4/11 Thibaut Britz
> Also executing the same multiget rangeslice query over the same range
> again will trigger the same writes again and again.
>
> On Wed, Apr 11, 2012 at 5:41 PM, Thibaut Britz <
> thibaut.br
Also executing the same multiget rangeslice query over the same range again
will trigger the same writes again and again.
On Wed, Apr 11, 2012 at 5:41 PM, Thibaut Britz <
thibaut.br...@trendiction.com> wrote:
> Hi,
>
> I just diagnosed this strange behavior:
>
> When I fetch a rangeslice through
Hi,
I just diagnosed this strange behavior:
When I fetch a rangeslice through hector and set the consistency level to
quorum, according to cfstats (and also to the output files on the hd),
cassandra seems to execute a write request for each read I execute. The
write count in cfstats is increased
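For reference, a quick way to observe this (the column family name is a placeholder) is to compare the Write Count reported by cfstats before and after running the range query:

    nodetool -h localhost cfstats | grep -A 20 'Column Family: MyCF'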
On Wed, Apr 11, 2012 at 5:20 AM, puneet loya wrote:
> insert into users (KEY) values (512313);
>
> users is my column family and key is its only attribute..
>
> It is giving an error
> bad request : line 1:24 required (...)+ loop did not match anything at input
> ')'
>
> do you find any error here
On 04/11/12 11:42, Aliou SOW wrote:
And I used the tool json2sstable, but that does not work; I always
get an error:
java.lang.RuntimeException: Can't write Super columns to the Standard
Column Family.
So I have two questions:
1) What did I do wrong? Must I define the complete structure of m
Can you expand further on your issue? Were you using RandomPartitioner?
thanks
On Tue, Apr 10, 2012 at 5:35 PM, David Leimbach wrote:
> I had this happen when I had really poorly generated tokens for the ring.
> Cassandra seems to accept numbers that are too big. You get hot spots
> when you
Hello,
Any help or idea?
Thanks
From: aliouji...@hotmail.com
To: user@cassandra.apache.org
Subject: INserting data in Cassandra
Date: Wed, 11 Apr 2012 09:42:52 +
Hello all,
We would like to adopt Cassandra solution for storing our biological data
which are essentially microarray d
It's easy to spend other people's money, but handling 1TB of data with a
1.5 GB heap? Memory is cheap, and just a little more will solve many
problems.
On 04/11/2012 08:43 AM, Romain HARDOUIN wrote:
Thank you for your answers.
I originally posted this question because we encountered an OOM Excep
On Wed, Apr 11, 2012 at 2:43 PM, Romain HARDOUIN
wrote:
>
> Thank you for your answers.
>
> I originally posted this question because we encountered an OOM Exception on 2
> nodes during a repair session.
> Memory analysis shows a hotspot: an ArrayList of SSTableBoundedScanner
> which contains as many
Thank you for your answers.
I originally posted this question because we encountered an OOM Exception on
2 nodes during a repair session.
Memory analysis shows a hotspot: an ArrayList of SSTableBoundedScanner
which contains as many objects as there are SSTables on disk (7747 objects at
the time).
Thi
insert into users (KEY) values (512313);
users is my column family and key is its only attribute..
It is giving an error
bad request : line 1:24 required (...)+ loop did not match anything at
input ')'
do you find any error here?
On 04/11/2012 12:04 PM, ruslan usifov wrote:
HH - this is hinted handoff?
Yes
2012/4/11 Igor <i...@4friends.od.ua>
On 04/11/2012 11:49 AM, R. Verlangen wrote:
Not everything, just HH :)
I hope this works for me for the following reasons: I have quite a large
RF (6 datacente
HH - this is hinted handoff?
2012/4/11 Igor
> On 04/11/2012 11:49 AM, R. Verlangen wrote:
>
> Not everything, just HH :)
>
> I hope this works for me for the following reasons: I have quite a large RF (6
> datacenters, each carrying one replica of the whole dataset), read and write at CL
> ONE, relatively smal
On 04/11/2012 11:49 AM, R. Verlangen wrote:
Not everything, just HH :)
I hope this works for me for the following reasons: I have quite a large RF (6
datacenters, each carrying one replica of the whole dataset), read and write at
CL ONE, a relatively small TTL of 10 days, I have no deletes, and the servers
almost never
Well, if everything works 100% all the time there should be nothing to
repair; however, with a distributed cluster it would be pretty rare for that
to occur. At least that is how I interpret this.
2012/4/11 Igor
> BTW, I heard that we don't need to run repair if all your data have TTL,
> all HH w
BTW, I heard that you don't need to run repair if all your data has a TTL,
all HH works, and you never delete your data.
On 04/11/2012 11:34 AM, ruslan usifov wrote:
Sorry for my bad English. So does QUORUM allow us to skip regular
repair? But from your answer it does not follow.
2012/4/11 R. Ve
Sorry for my bad English. So does QUORUM allow us to skip regular repair?
But from your answer it does not follow.
2012/4/11 R. Verlangen
> Yes, I personally have configured it to perform a repair once a week, as
> the GCGraceSeconds is at 10 days.
>
> This is also what's in the manual
> http://wi
Yes, I personally have configured it to perform a repair once a week, as
the GCGraceSeconds is at 10 days.
This is also what's in the manual:
http://wiki.apache.org/cassandra/Operations#Repairing_missing_or_inconsistent_data
(point 2)
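A minimal sketch of such a weekly schedule via cron (paths and timing are placeholders; in practice the start times are usually staggered per node):

    # run a repair on this node every Sunday at 02:00
    0 2 * * 0  /opt/cassandra/bin/nodetool -h localhost repair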
2012/4/11 ruslan usifov
> Hello
>
> I have follow question, i
Hello
I have the following question: if we read and write to a Cassandra cluster with
QUORUM consistency level, does this allow us to not run nodetool repair
regularly (i.e. every GCGraceSeconds)?