On 19.12.10 03:05, Wayne wrote:
> Rereading through everything again I am starting to wonder if the page
> cache is being affected by compaction.
Oh yes ...
http://chbits.blogspot.com/2010/06/lucene-and-fadvisemadvise.html
https://issues.apache.org/jira/browse/CASSANDRA-1470
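The trick those two links describe, dropping a file's pages from the page cache after a bulk sequential pass so compaction output doesn't evict the hot read set, can be sketched in a few lines. This is a hypothetical illustration of the idea, not Cassandra's actual code; it is Linux-only, and the helper name is made up:

```python
# Sketch of the fadvise idea: after sequentially writing or reading a
# large file (e.g. a freshly compacted sstable), advise the kernel that
# its pages are not needed, so they stop competing with the hot read set.
# os.posix_fadvise is in the stdlib from Python 3.3 on Unix.
import os

def drop_from_page_cache(path):
    """Advise the kernel to evict this file's pages from the page cache."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # offset=0, length=0 means "the whole file"
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)
```

Note this is advisory only: the kernel may keep dirty or still-referenced pages, which is why it is usually paired with a flush of the data first.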
> We have been heavily
You can disable compaction and re-enable it later: use nodetool
setcompactionthreshold with the min and max thresholds set to 0 0.
-Chris
On Dec 18, 2010, at 6:05 PM, Wayne wrote:
Rereading through everything again I am starting to wonder if the page cache
is being affected by compaction. We have been heavily loading data for weeks
and compaction is basically running non-stop. The manual compaction should
be done some time tomorrow, so when totally caught up I will try again.
You are absolutely back to my main concern. Initially we were consistently
seeing < 10ms read latency and now we see 25ms (30GB sstable file), 50ms
(100GB sstable file) and 65ms (330GB sstable file) read times for a single
read with nothing else going on in the cluster. Concurrency is not our
problem.
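For what it's worth, those three data points roughly track the logarithm of the sstable size rather than the size itself, which would point at per-read seek/index overhead rather than a linear scan cost. A quick back-of-envelope check (only the numbers come from the message; the log model is my assumption):

```python
import math

# Read latencies reported in the thread for a single uncontended read,
# against the size of the sstable being read: GB -> ms.
observed = {30: 25.0, 100: 50.0, 330: 65.0}

# Hypothetical model: latency ~ a + b * log(size), i.e. cost grows with
# the depth of index/seek work, not linearly with data volume.
# Fit a and b from the two endpoints, then check the middle point.
lo, hi = 30, 330
b = (observed[hi] - observed[lo]) / (math.log(hi) - math.log(lo))
a = observed[lo] - b * math.log(lo)

predicted_100 = a + b * math.log(100)
print(round(predicted_100, 1))  # ~45 ms, in the ballpark of the observed 50 ms
```

A linear-in-size model would instead predict well over 100 ms at 330 GB, so the observed 65 ms is more consistent with logarithmic growth.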
We are using XFS for the data volume. We are load testing now, and
compaction is way behind, but weekly manual compaction should help catch
things up.
Smaller nodes just seem to fit the Cassandra architecture a lot better. We
can not use cloud instances, so the cost for us to go to <500gb nodes is
prohibitive. Cassandra lumps all processes on the node together into one
bucket, and that almost then requires a smaller node data set. There a
On Sat, Dec 18, 2010 at 11:31 AM, Peter Schuller wrote:
I started a page on the wiki that still needs improvement,
specifically for concerns relating to running large nodes:
http://wiki.apache.org/cassandra/LargeDataSetConsiderations
I haven't linked to it from anywhere yet, pending adding various JIRA
ticket references + give people a chance to ob
> +1 on each of Peter's points except one.
>
> For example, if the hot set is very small and slowly changing, you may
> be able to have 100 TB per node and take the traffic without any
> difficulties.
So that statement was probably not the best. I should have been more
careful. I meant it purely i
On Sat, Dec 18, 2010 at 5:27 AM, Peter Schuller wrote:
And I forgot:
(6) It is fully expected that sstable counts spike during large
compactions that take a lot of time simply because smaller compactions
never get a chance to run. (There was just recently JIRA traffic that
added support for parallel compaction, but I'm not sure whether it
fully addresses this.)
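Point (6) is easy to see with a toy model (entirely hypothetical numbers: one compaction slot, one memtable flush per tick, small compactions merging 4 files into 1):

```python
# Toy model of point (6): while a big compaction occupies the single
# compaction slot, small compactions can't run, so the sstable count
# climbs steadily; once the slot frees up, it drains back down.
def simulate(big_compaction_ticks, total_ticks, merge_per_tick=4):
    sstables = 0
    history = []
    for t in range(total_ticks):
        sstables += 1  # a memtable flush lands every tick
        # small compactions only run after the big one finishes
        if t >= big_compaction_ticks and sstables >= merge_per_tick:
            sstables -= merge_per_tick - 1  # merge N files into 1
        history.append(sstables)
    return history

h = simulate(big_compaction_ticks=20, total_ticks=40)
print(max(h[:20]), h[-1])  # spike while blocked, near-baseline after
```

Parallel compaction changes the model by adding slots, so small compactions keep draining the backlog even while a big one runs.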
> How many nodes? 10, with 16 cores each (2 x quad HT CPUs)
> How much ram per node? 24gb
> What disks and how many? SATA 7200rpm 1x1tb for commit log, 4x1tb (raid0)
> for data
> Is your ring balanced? yes, random partitioned very evenly
> How many column families? 4 CFs x 3 Keyspaces
> How much ram is dedicated to cassandra? 12gb heap (probably too high?)
On Fri, Dec 17, 2010 at 12:26 PM, Daniel Doubleday wrote:
> How much ram is dedicated to cassandra? 12gb heap (probably too high?)
> What is the hit rate of caches? high, 90%+
If your heap allows it I would definitely try to give more ram for fs cache.
You're not using row cache, so I don't see what cassandra would gain from so much
memory.
A question ab
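In concrete terms, for the node described in this thread (24gb RAM, 12gb heap, roughly 1tb of data per node), the memory budget looks something like this. The OS overhead figure and the smaller heap size are my assumptions, not from the thread:

```python
# Rough memory budget for the node described in the thread.
ram_gb = 24
heap_gb = 12          # current JVM heap ("probably too high?")
data_per_node_tb = 1.0

os_overhead_gb = 2    # assumption: OS + other processes
page_cache_gb = ram_gb - heap_gb - os_overhead_gb
coverage = page_cache_gb / (data_per_node_tb * 1024)
print(f"page cache: {page_cache_gb} GB -> covers {coverage:.1%} of on-disk data")

# With no row cache, the JVM gains little from a big heap; shrinking it
# (hypothetical 6 GB) substantially grows the cache for hot reads:
smaller_heap_gb = 6
page_cache_gb2 = ram_gb - smaller_heap_gb - os_overhead_gb
print(f"with a {smaller_heap_gb} GB heap: {page_cache_gb2} GB of page cache")
```

Either way the page cache covers only ~1% of the data, which is why everything hinges on whether the hot set fits in that slice.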
Below are some answers to your questions. We have wide rows (what we like
about Cassandra) and I wonder if that plays into this? We have been loading
1 keyspace in our cluster heavily in the last week so it is behind in
compaction for that keyspace. I am not even looking at those read latency
times.
On Fri, Dec 17, 2010 at 8:21 AM, Wayne wrote:
We have been testing Cassandra for 6+ months and now have 10TB in 10 nodes
with rf=3. It is 100% real data generated by real code in an almost
production level mode. We have gotten past all our stability issues,
java/cmf issues, etc. etc. now to find the one thing we "assumed" may not be
true. Our
On Dec 16, 2010, at 11:35 PM, Wayne wrote:
> I have read that read latency goes up with the total data size, but to what
> degree should we expect a degradation in performance? What is the "normal"
> read latency range if there is such a thing for a small slice of scol/cols?
> Can we really pu
On Thu, Dec 16, 2010 at 7:15 PM, Robert Coli wrote:
On Thu, Dec 16, 2010 at 2:35 PM, Wayne wrote:
> I have read that read latency goes up with the total data size, but to what
> degree should we expect a degradation in performance?
I'm not sure this is generally answerable because of data modelling
and workload variability, but there are some kno
We are running 0.6.8 and are reaching 1tb/node in a 10 node cluster, rf=3. Our
read times seem to be getting worse as we load data into the cluster, and I
am worried there is a scale problem in terms of large column families. All
benchmarks/times come from cfstats reporting so no client code or time