Multitenant support with keyspaces
How many keyspaces can you reasonably have? We have around 500 customers and expect that to double by the end of the year. We're looking into C* and wondering if it makes sense to have a separate KS per customer. If we grow to 1000 customers, one KS per customer means 1000 keyspaces. Is that something C* can handle efficiently? Each customer has about 10 GB of data (not taking replication into account). Or is this symptomatic of a bad design?

I guess the same question applies to our notion of breaking up the column families into time ranges. We're naively trying to avoid having a few large CFs/KSs. Is/should that be a concern? What are the tradeoffs of a smaller number of heavyweight KSs/CFs vs. manually sharding the data into more granular KSs/CFs?

Thanks for any info.
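One common alternative to a keyspace per customer is a single shared keyspace whose partition keys carry the customer id, so tenancy lives in the data model rather than in the schema. A minimal sketch of that idea (the helper names and ids below are made up for illustration, not from any Cassandra API):

```python
# Sketch: multitenancy via a composite partition key in one shared
# keyspace, instead of one keyspace per customer. Names and ids are
# illustrative placeholders.

def make_partition_key(customer_id: str, entity_id: str) -> str:
    """Compose a shared-table partition key of (customer, entity)."""
    return f"{customer_id}:{entity_id}"

def keyspaces_to_manage(customers: int, per_customer_keyspaces: bool) -> int:
    """How many keyspaces the cluster must track under each design."""
    return customers if per_customer_keyspaces else 1

print(make_partition_key("cust042", "order9001"))  # cust042:order9001
print(keyspaces_to_manage(1000, True))   # 1000
print(keyspaces_to_manage(1000, False))  # 1
```

The shared design keeps schema count constant as customers grow, at the cost of per-tenant isolation (dropping one customer's data becomes a range delete rather than a `DROP KEYSPACE`).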
1.2 tuning
Lots of possible "issues" with high write load, and we're not sure if it means we need more nodes or if the nodes aren't tuned correctly. We're using 4 EC2 xlarge instances to support 4 medium instances. We're getting about 10k inserts/sec, but after about 10 minutes it drops to about 7k/sec, which seems to coincide with these messages:

WARN [MemoryMeter:1] 2013-05-29 21:04:05,462 Memtable.java (line 222) setting live ratio to minimum of 1.0 instead of 0.08283890630659889

WARN [ScheduledTasks:1] 2013-05-29 21:24:07,732 GCInspector.java (line 142) Heap is 0.7554059480798656 full. You may need to reduce memtable and/or cache sizes. Cassandra will now flush up to the two largest memtables to free up memory. Adjust flush_largest_memtables_at threshold in cassandra.yaml if you don't want Cassandra to do this automatically

We've ensured we're using JNA and have left most settings at their defaults: no row cache, default key/bloom caches. Also, we start to get timeouts on the clients after about 15 minutes of hammering. We're using the latest JNA and separate ephemeral drives for the commit log and data directories. We're on 1.2.5.

Also, we see compactions backing up:

pending tasks: 6
  compaction type   keyspace   column family   completed    total        unit    progress
  Compaction        svbks      ttevents        691259241    3621003064   bytes   19.09%
  Compaction        svbks      ttevents        135890464    505047776    bytes   26.91%
  Compaction        svbks      ttevents        225229105    2531271538   bytes    8.90%
  Compaction        svbks      ttevents        1312041410   5409348928   bytes   24.26%
Active compaction remaining time : 0h09m38s

OpsCenter says we're only using 20% of the disk, yet the overall compaction remaining time keeps going up. :{

Advice welcome/TIA

--Darren
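For context on the GCInspector line, here is a minimal sketch of the arithmetic it describes, assuming the 1.2 default of flush_largest_memtables_at = 0.75 (the warning itself says the threshold is adjustable in cassandra.yaml; the actual trigger logic lives in GCInspector.java):

```python
# Sketch of the heap check behind the GCInspector warning quoted above.
# The 0.75 default is per the warning's pointer to flush_largest_memtables_at
# in cassandra.yaml; heap fractions are taken from the log line.

def should_emergency_flush(heap_fraction_used: float,
                           flush_largest_memtables_at: float = 0.75) -> bool:
    """True when heap occupancy crosses the emergency-flush threshold."""
    return heap_fraction_used > flush_largest_memtables_at

# The log reported "Heap is 0.7554... full", just over the threshold:
print(should_emergency_flush(0.7554059480798656))  # True
print(should_emergency_flush(0.70))                # False
```

Crossing the threshold only slightly, as here, suggests memtable/cache sizing that leaves the heap hovering near the limit rather than a one-off spike.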
Re: 1.2 tuning
If the amount of remaining time for compaction keeps going up, does this point to an overloaded node or an un-tuned node?

On Thu, May 30, 2013 at 3:10 PM, Robert Coli wrote:
> On Wed, May 29, 2013 at 2:38 PM, Darren Smythe wrote:
> > Were using the latest JNA and separate ephemeral drives for commit log and
> > data directories.
>
> (as a note..)
>
> Per nickmbailey, testing shows that there is little/no benefit to
> separating commit log and data dirs on virtualized disk (or SSD),
> because the win from this practice comes when the head doesn't move
> between appends to the commit log. Because the head must be assumed
> to always be moving on shared disk (and because there is no head to
> move on SSD), you'd be better off with a one-disk-larger ephemeral
> stripe for both data and commit log.
>
> =Rob
Re: 1.2 tuning
Is "setting live ratio to minimum of 1.0 instead of X" supposed to be rare? Because we're getting it fairly consistently.

On Sat, Jun 1, 2013 at 8:58 PM, Darren Smythe wrote:
> If the amount of remaining time for compaction keeps going up, does this
> point to an overloaded node or an un-tuned node?
>
> On Thu, May 30, 2013 at 3:10 PM, Robert Coli wrote:
>> On Wed, May 29, 2013 at 2:38 PM, Darren Smythe wrote:
>> > Were using the latest JNA and separate ephemeral drives for commit log and
>> > data directories.
>>
>> (as a note..)
>>
>> Per nickmbailey, testing shows that there is little/no benefit to
>> separating commit log and data dirs on virtualized disk (or SSD),
>> because the win from this practice comes when the head doesn't move
>> between appends to the commit log. Because the head must be assumed
>> to always be moving on shared disk (and because there is no head to
>> move on SSD), you'd be better off with a one-disk-larger ephemeral
>> stripe for both data and commit log.
>>
>> =Rob
Re: 1.2 tuning
Hi-

On Mon, Jun 3, 2013 at 10:36 AM, Robert Coli wrote:
> On Mon, Jun 3, 2013 at 8:54 AM, Darren Smythe wrote:
> > Is "setting live ratio to minimum of 1.0 instead of X" supposed to be rare?
> > Because we're getting it fairly consistently.
>
> Do you have working JNA? If so, my understanding is that message
> should be relatively rare..

We recently tried an 8-node cluster of xlarge instances instead of just 4 xlarge instances, keeping the number of clients the same as before. But we're still seeing a lot of "setting live ratio" and "flush up to the two largest memtables" messages. The compaction queues still seem congested as well.

There are also a lot of dropped MUTATION messages: 13,000,000 completed in the Mutation Stage, 100,000 dropped.

The inserts are now about 11k/second, up slightly from 10k/second with 4 nodes.

Another missing piece may be that each column has a 1-2k blob as a value. The value is binary now, but we're moving to JSON later.

TIA

--Darren

> ./src/java/org/apache/cassandra/db/Memtable.java
> "
>     double newRatio = (double) deepSize / currentSize.get();
>
>     if (newRatio < MIN_SANE_LIVE_RATIO)
>     {
>         logger.warn("setting live ratio to minimum of {} instead of {}",
>                     MIN_SANE_LIVE_RATIO, newRatio);
>         newRatio = MIN_SANE_LIVE_RATIO;
>     }
> "
>
> =Rob
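A Python paraphrase of the Memtable.java snippet Rob quoted, just to make the clamp explicit (the byte counts in the example are invented for illustration):

```python
# Paraphrase of the live-ratio clamp from Memtable.java quoted above.
# The live ratio is measured heap bytes per serialized byte; a value
# below 1.0 would mean rows occupy less heap than their serialized
# size, which the meter treats as implausible and clamps, emitting the
# "setting live ratio to minimum" warning.

MIN_SANE_LIVE_RATIO = 1.0  # mirrors the constant in Memtable.java

def clamp_live_ratio(deep_size: int, current_size: int) -> float:
    """Return the measured live ratio, floored at the sane minimum."""
    ratio = deep_size / current_size
    return max(ratio, MIN_SANE_LIVE_RATIO)

print(clamp_live_ratio(828, 10_000))     # 1.0 (measured 0.0828, clamped)
print(clamp_live_ratio(30_000, 10_000))  # 3.0 (within the sane range)
```

A consistently clamped ratio, as in the thread, means the meter keeps measuring rows as cheaper on-heap than their serialized size, which can happen with large opaque blob values like the 1-2k columns described above.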
Billions of counters
We want to precalculate counts for some common usage metrics. We have events, locations, products, etc. The problem is we have millions of events/day, thousands of locations, and millions of products.

We're trying to precalculate counts for some common queries like 'how many times was product X purchased in location Y last week'. It seems like we'll end up with trillions of counters for even these basic permutations. Is this a cause for concern?

TIA

-- Darren
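To make the estimate concrete, a quick sketch using the order-of-magnitude figures from the question (the exact numbers are placeholders, not real data):

```python
# Back-of-envelope cardinality of the precomputed counters described
# above: one counter per (product, location, time-bucket) combination.

def counter_cardinality(products: int, locations: int, time_buckets: int) -> int:
    """Distinct counters needed for one product x location x time grid."""
    return products * locations * time_buckets

# ~1M products x ~1K locations x 52 weekly buckets per year:
print(counter_cardinality(1_000_000, 1_000, 52))  # 52000000000
```

Even the weekly grid alone is tens of billions; adding daily buckets or further dimensions (event type, etc.) multiplies that toward the trillions mentioned above, which is why the cross-product of dimensions, not the raw event volume, dominates the counter count.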
memtable overhead
Hi,

How much overhead (in heap MB) does an empty memtable use? If I have many column families that aren't written to often, how much memory do these take up?

TIA

-- Darren
Re: memtable overhead
The way we've gone about our data models has resulted in lots of column families, and we're just looking for guidelines on how much heap space each column family adds.

TIA

On Sun, Jul 21, 2013 at 11:19 PM, Darren Smythe wrote:
> Hi,
>
> How much overhead (in heap MB) does an empty memtable use? If I have many
> column families that aren't written to often, how much memory do these take
> up?
>
> TIA
>
> -- Darren
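As a rough way to reason about it, a sketch with a placeholder per-CF figure — the 1 MB number below is purely an assumption for illustration, not a documented Cassandra constant; measure on your own cluster (e.g. with a heap dump) to get a real value:

```python
# Back-of-envelope heap cost of many column families, each keeping a
# memtable resident. mb_per_memtable is a made-up placeholder, not a
# figure from the Cassandra docs.

def estimated_memtable_heap_mb(column_families: int,
                               mb_per_memtable: float = 1.0) -> float:
    """Rough total heap if every CF holds an (even mostly empty) memtable."""
    return column_families * mb_per_memtable

# e.g. 1000 tenants x 10 CFs each, at the placeholder 1 MB per CF:
print(estimated_memtable_heap_mb(10_000))  # 10000.0
```

The point of the sketch is that per-CF overhead is fixed regardless of write traffic, so a schema with thousands of rarely written CFs pays the full per-CF cost on every node, which ties back to the keyspace-per-customer question earlier in this thread.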