I did a presentation on diagnosing performance problems in production at
the US & Euro summits, in which I covered quite a few tools & preventative
measures you should know when running a production cluster. You may find
it useful:
http://rustyrazorblade.com/2014/09/cassandra-summit-recap-diagnosi
Yes. It is, in general, a best practice to upgrade to the latest bug fix
release before doing an upgrade to the next point release.
On Tue Dec 09 2014 at 6:58:24 PM wyang wrote:
> I looked at some upgrade documentation and am a little puzzled.
>
>
> According to
> https://github.com/apache/cassan
Hi, Everyone:
I'm importing a CSV file into Cassandra using sstableloader, and
I'm following the example here:
https://github.com/yukim/cassandra-bulkload-example/
When I try to run sstableloader, it fails with an OOM error. I also
changed the sstableloader.sh script (that runs t
Hi,
We have a two-node cluster configuration in production with RF=2, which
means the data is written to both nodes. It has been running for about a
month now and has a good amount of data.
Questions:
1. What are the best practices for maintenance?
2. Is OPScenter required to be installed or
I looked at some upgrade documentation and am a little puzzled.
According to https://github.com/apache/cassandra/blob/cassandra-2.1/NEWS.txt,
“Rolling upgrades from anything pre-2.0.7 is not supported”. Does that mean we
should upgrade to 2.0.7 or later first? Can we do a rolling upgrade to 2.0.7? Do we need
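The version check implied by that NEWS.txt statement can be sketched in a few lines. This is just an illustration; the helper names (`parse_version`, `needs_intermediate_upgrade`) are made up for this sketch, and 2.0.7 is the floor quoted above.

```python
def parse_version(v):
    """Parse a dotted Cassandra version string into a comparable tuple."""
    return tuple(int(p) for p in v.split("."))

def needs_intermediate_upgrade(current, target, floor="2.0.7"):
    """Per the NEWS.txt note quoted above, rolling upgrades to 2.1 from
    anything pre-2.0.7 are unsupported, so such clusters must first do a
    rolling upgrade to the floor version before moving on to the target."""
    return parse_version(current) < parse_version(floor) <= parse_version(target)

print(needs_intermediate_upgrade("2.0.5", "2.1.2"))  # True: hop to 2.0.7 first
print(needs_intermediate_upgrade("2.0.9", "2.1.2"))  # False: upgrade directly
```

Tuple comparison gives the usual semantic-version ordering for plain numeric versions, which is all Cassandra release numbers use here.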
Thanks Rob. Definitely good advice that I wish I had come across a couple
of months ago... That said, it still definitely points me in the right
direction as to what to do now.
--
*Nathanael Yoder*
Principal Engineer & Data Scientist, Whistle
415-944-7344 // n...@whistle.com
On Tue, Dec 9, 2014
On Mon, Dec 8, 2014 at 5:12 PM, Nate Yoder wrote:
> I am currently running a 6 node Cassandra 2.1.1 cluster on EC2 using
> C3.2XLarge nodes which overall is working very well for us. However, after
> letting it run for a while I seem to get into a situation where the amount
> of disk space used
Hi all,
I'd like to write some tests for my code that uses the Cassandra Java
driver to see how it behaves if there is a read timeout while accessing
Cassandra. Is there a best-practice for getting this done? I was thinking
about adjusting the settings in the cluster builder to adjust the timeou
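One common pattern for the question above is to put the timeout handling behind a small wrapper and unit-test the wrapper against a stub session that raises a timeout. The sketch below is pure Python with invented names (`ReadTimeout`, `execute_with_retry`, `FlakySession` are placeholders, not the driver's actual API), just to show the shape of such a test.

```python
class ReadTimeout(Exception):
    """Stand-in for the driver's read-timeout exception (hypothetical name)."""

def execute_with_retry(session, query, retries=2):
    """Retry a read a few times on timeout, then re-raise the last timeout."""
    for attempt in range(retries + 1):
        try:
            return session.execute(query)
        except ReadTimeout:
            if attempt == retries:
                raise

class FlakySession:
    """Stub session that times out a fixed number of times, then succeeds."""
    def __init__(self, failures):
        self.failures = failures
        self.calls = 0

    def execute(self, query):
        self.calls += 1
        if self.calls <= self.failures:
            raise ReadTimeout(query)
        return ["row1"]

session = FlakySession(failures=2)
print(execute_with_retry(session, "SELECT ..."))  # ['row1'], on the third attempt
```

Testing against a stub like this avoids needing to induce real timeouts in a live cluster, though lowering the read timeout in the cluster builder (as suggested above) is a reasonable way to integration-test the same path.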
Hi All,
Thanks for the help but after yet another day of investigation I think I
might be running into this
https://issues.apache.org/jira/browse/CASSANDRA-8061 issue where tmplink
files aren't removed until Cassandra is restarted.
Thanks again for all the suggestions!
Nate
--
*Nathanael Yoder*
Hi Reynald,
Good idea but I have incremental backups turned off and other than *.db
files nothing else appears to be in the data directory for that table.
Is there any other output that would be helpful in helping you all help me?
Thanks,
Nate
--
*Nathanael Yoder*
Principal Engineer & Data Scie
I have spent a lot of time working with single-node, RF=1 clusters in my
development. Before I deploy a cluster to our live environment, I have spent
some time learning how to work with a multi-node cluster with RF=3. There were
some surprises. I’m wondering if people here can enlighten me. I do
Hi Nate,
Are you using incremental backups?
Extract from the documentation (
http://www.datastax.com/documentation/cassandra/2.1/cassandra/operations/ops_backup_incremental_t.html
):
/When incremental backups are enabled (disabled by default), Cassandra
hard-links each flushed SSTable to a
Thanks for the advice. Totally makes sense. Once I figure out how to stop
my data from taking up more than 2x the space it needs, I'll definitely
make the change :)
Nate
--
*Nathanael Yoder*
Principal Engineer & Data Scientist, Whistle
415-944-7344 // n...@whistle.com
On Tue, Dec
Well, I personally don't like RF=2. It means if you're using CL=QUORUM and
a node goes down, you're going to have a bad time (downtime). If you're
using CL=ONE then you'd be OK. However, I am not wild about losing a node
and having only one copy of my data available in prod.
On Tue Dec 09 2014 at
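The arithmetic behind that RF=2 warning is worth spelling out: QUORUM requires a strict majority of replicas, floor(RF/2) + 1. A minimal sketch:

```python
def quorum(rf):
    """QUORUM is a strict majority of the replicas: floor(rf / 2) + 1."""
    return rf // 2 + 1

def quorum_survives_one_node_loss(rf):
    """With one replica down, rf - 1 replicas remain; QUORUM still needs
    quorum(rf) of them to answer."""
    return (rf - 1) >= quorum(rf)

print(quorum(2), quorum_survives_one_node_loss(2))  # 2 False: RF=2 loses QUORUM
print(quorum(3), quorum_survives_one_node_loss(3))  # 2 True: RF=3 tolerates it
```

So with RF=2, QUORUM is 2 of 2 — one node down means quorum reads and writes fail, which is exactly the downtime described above. RF=3 keeps the same quorum size (2) while tolerating a node loss.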
Thanks Jonathan. So there is nothing too idiotic about my current set-up
with 6 boxes, 256 vnodes each, and an RF of 2?
I appreciate the help,
Nate
--
*Nathanael Yoder*
Principal Engineer & Data Scientist, Whistle
415-944-7344 // n...@whistle.com
On Tue, Dec 9, 2014 at 8:31 AM, Jonatha
You don't need a prime number of nodes in your ring, but it's not a bad
idea for it to be a multiple of your RF when your cluster is small.
On Tue Dec 09 2014 at 8:29:35 AM Nate Yoder wrote:
> Hi Ian,
>
> Thanks for the suggestion but I had actually already done that prior to
> the scenario I descr
Hi Ian,
Thanks for the suggestion but I had actually already done that prior to the
scenario I described (to get myself some free space) and when I ran
nodetool cfstats it listed 0 snapshots as expected, so unfortunately I
don't think that is where my space went.
One additional piece of informati
Try `nodetool clearsnapshot`, which will delete any snapshots you have. I
have never taken a snapshot with nodetool, yet I recently found several
snapshots on my disk (and they can take a lot of space). So perhaps they
are automatically generated by some operation? No idea. Regardless, nuking
those
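To see how much space snapshots are holding before nuking them, a quick filesystem scan works. This sketch assumes the usual on-disk layout (`<data_dir>/<keyspace>/<table>/snapshots/<tag>/...`); the function name is made up, and the demo builds a throwaway directory rather than touching a real data directory.

```python
import os
import tempfile

def snapshot_bytes(data_dir):
    """Total bytes held by files under any 'snapshots' directory."""
    total = 0
    for root, _dirs, files in os.walk(data_dir):
        if "snapshots" in root.split(os.sep):
            total += sum(os.path.getsize(os.path.join(root, f)) for f in files)
    return total

# Demonstrate on a throwaway directory mimicking the usual layout.
data = tempfile.mkdtemp()
snap = os.path.join(data, "ks1", "table1", "snapshots", "tag1")
os.makedirs(snap)
with open(os.path.join(snap, "old-Data.db"), "wb") as f:
    f.write(b"x" * 1024)            # 1 KiB trapped in a snapshot
with open(os.path.join(data, "ks1", "table1", "live-Data.db"), "wb") as f:
    f.write(b"y" * 4096)            # live sstable, not counted

print(snapshot_bytes(data))  # 1024
```

Snapshots are hard links, so on a real data directory this counts bytes that are only reclaimable once both the snapshot and the compacted-away original are gone; `nodetool cfstats` (as mentioned above) reports snapshot counts per table as a cross-check.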
Some of the sequences grow so fast that sub-partitioning is inevitable. I
may need to try different bucket sizes to get the optimal throughput.
Thank you all for the advice.
On Mon, Dec 8, 2014 at 9:55 AM, Eric Stevens wrote:
> The upper bound for the data size of a single column is 2GB, and the up
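The bucket-size experiment described above can be sketched by adding a time-bucket component to the partition key so a fast-growing series splits across partitions. Everything here is an assumption for illustration: the key layout, the `bucket_key` helper, and the 6-hour default are arbitrary starting points to benchmark, not a recommendation.

```python
from datetime import datetime, timezone

def bucket_key(sensor_id, ts, bucket_hours=6):
    """Partition key: (series id, timestamp truncated to a bucket window).

    Smaller buckets mean smaller partitions but more partitions per query
    range; the 6-hour default is an arbitrary starting point to tune.
    """
    bucket_start = ts.replace(minute=0, second=0, microsecond=0,
                              hour=(ts.hour // bucket_hours) * bucket_hours)
    return (sensor_id, bucket_start.strftime("%Y%m%d%H"))

ts = datetime(2014, 12, 9, 14, 37, tzinfo=timezone.utc)
print(bucket_key("sensor-42", ts))  # ('sensor-42', '2014120912')
```

Rows within a bucket then use the full timestamp as the clustering column; querying a time range means fanning out over the buckets the range spans, which is the throughput trade-off being tuned.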