Hi All,
Sorry for the wide distribution.
Our Cassandra cluster is running 1.0.10. Recently we have been facing a weird
situation. We have a column family containing wide rows (each row might
have a few million columns). We delete the columns on a daily basis and
we also run a major compaction on it every day
I found this bug; it seems it is fixed. But in my situation I can still see the
decommissioned node in the LoadMap attribute in the JMX console.
Might this be the reason why Hector says there are not enough replicas?
Experts, any thoughts??
Thanks.
Not sure I understand you correctly, but if you are dealing with ghost
nodes that you want to remove, I have never seen a node that could resist an
"unsafeAssassinateEndpoint".
http://grokbase.com/t/cassandra/user/12b9eaaqq4/remove-crashed-node
http://grokbase.com/t/cassandra/user/133nmsm3hd/removin
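For anyone who wants to script the same thing rather than click through a JMX
console, here is a minimal Java sketch. It assumes a Cassandra version whose
Gossiper MBean exposes unsafeAssassinateEndpoint, JMX on the default port 7199,
and placeholder IP addresses:

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class AssassinateGhostNode {
    public static void main(String[] args) throws Exception {
        // Connect to the JMX port of any live node (7199 by default).
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://10.0.0.1:7199/jmxrmi");
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection mbs = connector.getMBeanServerConnection();
            ObjectName gossiper =
                    new ObjectName("org.apache.cassandra.net:type=Gossiper");
            // Tell gossip to forget the ghost node's IP for good.
            mbs.invoke(gossiper, "unsafeAssassinateEndpoint",
                    new Object[] { "10.0.0.99" },
                    new String[] { "java.lang.String" });
        }
    }
}

This is the same operation a JMX console or jmxterm would invoke by hand.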
@Rob: Thanks for the feedback.
Yet I still have a weird, unexplained behavior with repair. Are counters
supposed to be "repaired" too? I mean, while reading at CL.ONE I can get
different values depending on which node answers, even after a read
repair or a full repair. Shouldn't a repair make them converge?
Hi,
Adding vnodes is a big improvement to Cassandra, specifically because we
have a fluctuating load on our Cassandra cluster depending on the week, and it is
quite annoying to add some nodes for one week or two, move tokens, and then
have to remove them and move tokens again. Even more if we could
From this list and the NYC* conference it seems that the consensus
configuration of C* on EC2 is to put the data on an ephemeral drive and
then periodically back the drive up to S3, relying on C*'s inherent fault
tolerance to deal with any data loss.
Fine, and we're doing this...but we find that
Hi all,
I currently have 2 clusters, one running on 1.1.10 using CQL2 and one
running on 1.2.4 using CQL3 and Vnodes. The machines in the 1.2.4 cluster are
expected to have better IO performance as we are going from 1 SSD data disk per
node in the 1.1 cluster to 3 SSD data disks per node
I am not sure if the new default is to use compression, but I do not
believe compression is a good default. I find compression is better for
larger column families that are sparsely read. For high-throughput CFs I
feel that decompressing larger blocks hurts performance more than the
compression helps.
The biggest reason I'm using compression here is that my data lends itself well
to it due to the composite columns. My current compression ratio is 30.5%.
Not sure it matters, but my BF false positive ratio is 0.048.
From: Edward Capriolo <edlinuxg...@gmail.com>
Reply-To: "user@cassandr
When you use compression you should play with your block size. I believe
the default may be 32K, but I had more success with 8K: nearly the same
compression ratio, less young-gen memory pressure.
On Thu, May 16, 2013 at 10:42 AM, Keith Wright wrote:
> The biggest reason I'm using compression here is
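For what it's worth, on the 1.2/CQL3 cluster the chunk size can be changed with
a plain ALTER TABLE. A minimal sketch using the DataStax java-driver follows;
the keyspace and table names are placeholders, and LZ4Compressor assumes 1.2.2
or later:

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class TuneCompressionChunkSize {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect();
        // Shrink the compression block size from the 64 KB default to 8 KB.
        session.execute("ALTER TABLE my_ks.my_table WITH compression = "
                + "{'sstable_compression': 'LZ4Compressor', 'chunk_length_kb': 8}");
        cluster.shutdown();
    }
}

Existing SSTables keep their old chunk size until they are rewritten (via
compaction, scrub, or upgradesstables), which is why the thread below mentions
running an sstable upgrade after the change.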
Does Cassandra need to load the entire SSTable into memory to uncompress it, or
does it only load the relevant block? I ask because if it's the latter, that
would not explain why I'm seeing so much higher read MB/s in the 1.2 cluster, as
the block sizes are the same in both.
From: Edward Capriolo
Might you be experiencing this?
https://issues.apache.org/jira/browse/CASSANDRA-4417
/Janne
On May 16, 2013, at 14:49 , Alain RODRIGUEZ wrote:
> @Rob: Thanks for the feedback.
>
> Yet I still have a weird, unexplained behavior with repair. Are counters
> supposed to be "repaired" too?
On May 16, 2013, at 17:05 , Brian Tarbox wrote:
> An alternative that we had explored for a while was to do a two stage backup:
> 1) copy a C* snapshot from the ephemeral drive to an EBS drive
> 2) do an EBS snapshot to S3.
>
> The idea being that EBS is quite reliable, S3 is still the emergency
I indeed had some of those in the past. But my point is not so much to
understand how I can get different counts depending on the node (I consider
this a weakness of counters and I am aware of it); my question is rather
why those inconsistent, distinct counter values never converge, even after a
repair.
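One way to take per-replica divergence out of the picture while investigating
is to read the counters at QUORUM instead of ONE. A rough Hector sketch, with
cluster, keyspace, column family, and key names made up for the example:

import me.prettyprint.cassandra.model.ConfigurableConsistencyLevel;
import me.prettyprint.cassandra.serializers.StringSerializer;
import me.prettyprint.cassandra.service.CassandraHostConfigurator;
import me.prettyprint.hector.api.Cluster;
import me.prettyprint.hector.api.HConsistencyLevel;
import me.prettyprint.hector.api.Keyspace;
import me.prettyprint.hector.api.beans.HCounterColumn;
import me.prettyprint.hector.api.factory.HFactory;
import me.prettyprint.hector.api.query.CounterQuery;

public class QuorumCounterRead {
    public static void main(String[] args) {
        Cluster cluster = HFactory.getOrCreateCluster("test-cluster",
                new CassandraHostConfigurator("10.0.0.1:9160"));

        // Read at QUORUM so a single stale replica cannot change the answer.
        ConfigurableConsistencyLevel ccl = new ConfigurableConsistencyLevel();
        ccl.setDefaultReadConsistencyLevel(HConsistencyLevel.QUORUM);
        Keyspace keyspace = HFactory.createKeyspace("my_ks", cluster, ccl);

        CounterQuery<String, String> query = HFactory.createCounterColumnQuery(
                keyspace, StringSerializer.get(), StringSerializer.get());
        query.setColumnFamily("page_views");
        query.setKey("row1");
        query.setName("hits");

        HCounterColumn<String> column = query.execute().get();
        System.out.println(column == null ? "no value" : column.getValue());
    }
}

This does not fix the underlying divergence, but it makes it easy to see
whether the replicas ever agree at all.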
Boris,
We hit exactly the same issue, and you are correct: the newly created SSTables
are the reason most of the column tombstones are not being purged.
There is an improvement in the 1.2 train where both the minimum and maximum
timestamp for a row are now stored and used during the compaction to de
My 5 cents: I'd check blockdev --getra for the data drives - readahead values
that are too high (the default is 256 on Debian) can hurt read performance.
On 05/16/2013 05:14 PM, Keith Wright wrote:
Hi all,
I currently have 2 clusters, one running on 1.1.10 using CQL2 and
one running on 1.2.4 using CQ
We actually have it set to 512. I have tried decreasing my SSTable size to 5
MB and changing the chunk size to 8 KB (and ran an sstableupgrade to ensure
they took effect) but am still seeing similar performance. Is anyone running
lz4 compression in production? I'm thinking of reverting back t
512 sectors for read-ahead. Are your new fancy SSD drives using large
sectors? If your read-ahead is really reading 512 x 4KB per random IO,
then that 2 MB per read seems like a lot of extra overhead.
-Bryan
On Thu, May 16, 2013 at 12:35 PM, Keith Wright wrote:
> We actually have it set to
I was going to say something similar. I feel like the SSD drives read much
"more" than the standard drives. Read-ahead / large sectors could, and probably
does, explain it.
On Thu, May 16, 2013 at 3:43 PM, Bryan Talbot wrote:
> 512 sectors for read-ahead. Are your new fancy SSD drives using large
> se
just in case it will be useful to somebody - here is my checklist for
better read performance from SSDs:
1. limit read-ahead to 16 or 32
2. enable 'trickle_fsync' (available starting from Cassandra 1.1.x)
3. use the 'deadline' I/O scheduler (much more important for rotational
drives than for SSDs)
4.
This makes sense. Unless you are running major compaction, a delete could
only be purged if the bloom filters confirmed the row was not in the sstables
not being compacted. If your rows are wide, the odds are that they are in
most/all sstables, and then finally removing them would be tricky.
On Thu, Ma
Thank you for that. I did not have trickle_fsync enabled and will give it a
try. I just noticed that when running a describe on my table, I do not see the
sstable size parameter (compaction_strategy_options = {'sstable_size_in_mb': 5})
included. Is that expected? Does it mean it's using the de
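If the option really did not take, it can be set explicitly; in CQL3 on 1.2 the
size lives inside the compaction map. A small sketch with the DataStax
java-driver (keyspace and table names are placeholders). Note that 5 MB is also
the 1.2 default for LeveledCompactionStrategy, so behavior would look the same
either way:

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class SetLeveledSSTableSize {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("my_ks");
        // In CQL3 the sstable size is a sub-option of the compaction map.
        session.execute("ALTER TABLE my_table WITH compaction = "
                + "{'class': 'LeveledCompactionStrategy', 'sstable_size_in_mb': 5}");
        cluster.shutdown();
    }
}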
lz4 is supposed to achieve similar compression while using fewer resources
than snappy. It is easy to test: just change it, then run a 'nodetool rebuild'.
Not sure when lz4 was introduced, but given that it is new to Cassandra
there may not be many large deployments running it yet.
On Thu, May 16, 201
But the problem is that I would like to use Cassandra embedded. Is this not
possible any more?
2013/5/15 Edward Capriolo
>
> You are doing something wrong. What I was suggesting is only a hack for
> unit tests. You're not supposed to interact with CassandraServer directly
> like that as a client.
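Embedding is still possible without touching CassandraServer directly: start an
in-process node with EmbeddedCassandraService and then talk to it through a
normal client. A minimal sketch, where the yaml path is a placeholder for a
test configuration:

import java.io.IOException;

import org.apache.cassandra.service.EmbeddedCassandraService;

public class EmbeddedCassandraExample {
    public static void main(String[] args) throws IOException {
        // Point Cassandra at a test configuration before starting the service.
        System.setProperty("cassandra.config",
                "file:///tmp/cassandra-test/cassandra.yaml");

        EmbeddedCassandraService cassandra = new EmbeddedCassandraService();
        cassandra.start();

        // The node now listens on the ports defined in that yaml; connect to it
        // with Hector, Thrift, or the java-driver like any other cluster.
    }
}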
Your nodes are overloaded.
I'd recommend using m1.xlarge instead.
Cheers
-
Aaron Morton
Freelance Cassandra Consultant
New Zealand
@aaronmorton
http://www.thelastpickle.com
On 15/05/2013, at 1:59 PM, Rodrigo Felix
wrote:
> Hi,
>
>I'm executing a workload on YCSB (50
Try the IRC room for the java driver or submit a ticket on the JIRA system, see
the links here https://github.com/datastax/java-driver
Cheers
-
Aaron Morton
Freelance Cassandra Consultant
New Zealand
@aaronmorton
http://www.thelastpickle.com
On 15/05/2013, at 5:50 PM, bjbylh
Are you using a multi get or a range slice ?
Read Repair does not run for range slice queries.
Cheers
-
Aaron Morton
Freelance Cassandra Consultant
New Zealand
@aaronmorton
http://www.thelastpickle.com
On 15/05/2013, at 6:51 PM, Sergey Naumov wrote:
>> see that RR works, bu
> When "drop column family" is executed irrespective of the existence of
> generation of Snapshot, $KS/$CF/ directory certainly remains.
I don't think there is any code there to delete the empty directories. We only
care about the files in there.
Cheers
-
Aaron Morton
Freelan
You should configure the seeds as recommended regardless of the snitch used.
You need to update the yaml file to start using the GossipingPropertyFileSnitch,
but after that it reads the cassandra-rackdc.properties file to get information
about the node. It uses the information in gossip to
We don't have cursors in the RDBMS sense of things.
If you are using thrift, the recommendation is to use connection pooling and
re-use connections for different requests. Note that you cannot multiplex
queries over the same thrift connection; you must wait for the response before
issuing anoth
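A rough Hector sketch of that pooling advice (host addresses, pool size, and
keyspace name are placeholders):

import me.prettyprint.cassandra.service.CassandraHostConfigurator;
import me.prettyprint.hector.api.Cluster;
import me.prettyprint.hector.api.Keyspace;
import me.prettyprint.hector.api.factory.HFactory;

public class PooledHectorClient {
    public static void main(String[] args) {
        // One configurator and cluster per application; Hector keeps a pool of
        // Thrift connections behind them and each request borrows one.
        CassandraHostConfigurator conf =
                new CassandraHostConfigurator("10.0.0.1:9160,10.0.0.2:9160");
        conf.setMaxActive(50);           // connections per host in the pool
        conf.setAutoDiscoverHosts(true); // pick up the rest of the ring

        Cluster cluster = HFactory.getOrCreateCluster("app-cluster", conf);
        Keyspace keyspace = HFactory.createKeyspace("my_ks", cluster);

        // Reuse 'keyspace' for every request; the pool ensures a connection is
        // never handed to a second request before the first response is back.
    }
}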
(Assuming you have enabled tcp_nodelay on the client socket)
Check the server side latency, using nodetool cfstats or nodetool cfhistograms.
Check the logs for messages from the GCInspector about ParNew pauses.
Cheers
-
Aaron Morton
Freelance Cassandra Consultant
New Zealand
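For the first point, a tiny Java illustration of what enabling tcp_nodelay on
the client socket means; host and port are placeholders, and most client
libraries expose this as a configuration flag rather than a raw socket call:

import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class NoDelaySocket {
    public static void main(String[] args) throws IOException {
        Socket socket = new Socket();
        // Disable Nagle's algorithm so small request frames are sent immediately
        // instead of being buffered, which otherwise shows up as extra latency.
        socket.setTcpNoDelay(true);
        socket.connect(new InetSocketAddress("10.0.0.1", 9160));
        socket.close();
    }
}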
what version of netty is on your classpath?
On 05/16/2013 07:33 PM, aaron morton wrote:
Try the IRC room for the java driver or submit a ticket on the JIRA
system, see the links here https://github.com/datastax/java-driver
Cheers
-
Aaron Morton
Freelance Cassandra Consultant
Thanks. This is just the kind of expert advice I needed.
On Tue, 14 May 2013, aaron morton wrote:
After several cycles, pycassa starts getting connection failures.
Do you have the error stack? Are they TimedOutExceptions, socket timeouts, or
something else?
I figured out the problem here and made this ticket in jira:
https://issues.apa
Please give an example of the code you are trying to execute.
On Thu, May 16, 2013 at 6:26 PM, Everton Lima wrote:
> But the problem is that I would like to use Cassandra embedded. Is this
> not possible any more?
>
>
> 2013/5/15 Edward Capriolo
>
>>
>> You are doing something wrong. What I was
Mutagen Cassandra is a framework providing schema versioning and mutation
for Apache Cassandra. It is similar to Flyway for SQL databases.
https://github.com/toddfast/mutagen-cassandra
Mutagen is a lightweight framework for applying versioned changes (known as
mutations) to a resource, in this ca
On 5/16/13 10:22 PM, Todd Fast wrote:
Mutagen Cassandra is a framework providing schema versioning and
mutation for Apache Cassandra. It is similar to Flyway for SQL databases.
https://github.com/toddfast/mutagen-cassandra
Mutagen is a lightweight framework for applying versioned changes (known