Cassandra 2.0.10 and DataStax Java Driver 2.1.1

On Dec 16, 2014, at 4:48 PM, Ryan Svihla <rsvi...@datastax.com> wrote:
> What version of Cassandra?
>
> On Dec 16, 2014 6:36 PM, "Arne Claassen" <a...@emotient.com> wrote:
> That's just the thing. There is nothing in the logs except the constant
> ParNew collections like
>
> DEBUG [ScheduledTasks:1] 2014-12-16 19:03:35,042 GCInspector.java (line 118) GC for ParNew: 166 ms for 10 collections, 4400928736 used; max is 8000634888
>
> But the load is staying continuously high.
>
> There's always some compaction on just that one table, media_tracks_raw,
> going on, and those values rarely changed (certainly the remaining time is
> meaningless):
>
> pending tasks: 17
>   compaction type   keyspace              table    completed         total   unit   progress
>        Compaction      media   media_tracks_raw    444294932    1310653468  bytes     33.90%
>        Compaction      media   media_tracks_raw    131931354    3411631999  bytes      3.87%
>        Compaction      media   media_tracks_raw     30308970   23097672194  bytes      0.13%
>        Compaction      media   media_tracks_raw    899216961    1815591081  bytes     49.53%
> Active compaction remaining time : 0h27m56s
>
> Here's a sample of a query trace:
>
>  activity | timestamp | source | source_elapsed
> ----------+-----------+--------+----------------
>  execute_cql3_query | 00:11:46,612 | 10.140.22.236 | 0
>  Parsing select * from media_tracks_raw where id = 74fe9449-8ac4-accb-a723-4bad024101e3 limit 100; | 00:11:46,612 | 10.140.22.236 | 47
>  Preparing statement | 00:11:46,612 | 10.140.22.236 | 234
>  Sending message to /10.140.21.54 | 00:11:46,619 | 10.140.22.236 | 7190
>  Message received from /10.140.22.236 | 00:11:46,622 | 10.140.21.54 | 12
>  Executing single-partition query on media_tracks_raw | 00:11:46,644 | 10.140.21.54 | 21971
>  Acquiring sstable references | 00:11:46,644 | 10.140.21.54 | 22029
>  Merging memtable tombstones | 00:11:46,644 | 10.140.21.54 | 22131
>  Bloom filter allows skipping sstable 1395 | 00:11:46,644 | 10.140.21.54 | 22245
>  Bloom filter allows skipping sstable 1394 | 00:11:46,644 | 10.140.21.54 | 22279
>  Bloom filter allows skipping sstable 1391 | 00:11:46,644 | 10.140.21.54 | 22293
>  Bloom filter allows skipping sstable 1381 | 00:11:46,644 | 10.140.21.54 | 22304
>  Bloom filter allows skipping sstable 1376 | 00:11:46,644 | 10.140.21.54 | 22317
>  Bloom filter allows skipping sstable 1368 | 00:11:46,644 | 10.140.21.54 | 22328
>  Bloom filter allows skipping sstable 1365 | 00:11:46,644 | 10.140.21.54 | 22340
>  Bloom filter allows skipping sstable 1351 | 00:11:46,644 | 10.140.21.54 | 22352
>  Bloom filter allows skipping sstable 1367 | 00:11:46,644 | 10.140.21.54 | 22363
>  Bloom filter allows skipping sstable 1380 | 00:11:46,644 | 10.140.21.54 | 22374
>  Bloom filter allows skipping sstable 1343 | 00:11:46,644 | 10.140.21.54 | 22386
>  Bloom filter allows skipping sstable 1342 | 00:11:46,644 | 10.140.21.54 | 22397
>  Bloom filter allows skipping sstable 1334 | 00:11:46,644 | 10.140.21.54 | 22408
>  Bloom filter allows skipping sstable 1377 | 00:11:46,644 | 10.140.21.54 | 22429
>  Bloom filter allows skipping sstable 1330 | 00:11:46,644 | 10.140.21.54 | 22441
>  Bloom filter allows skipping sstable 1329 | 00:11:46,644 | 10.140.21.54 | 22452
>  Bloom filter allows skipping sstable 1328 | 00:11:46,644 | 10.140.21.54 | 22463
>  Bloom filter allows skipping sstable 1327 | 00:11:46,644 | 10.140.21.54 | 22475
>  Bloom filter allows skipping sstable 1326 | 00:11:46,644 | 10.140.21.54 | 22488
>  Bloom filter allows skipping sstable 1320 | 00:11:46,644 | 10.140.21.54 | 22506
>  Bloom filter allows skipping sstable 1319 | 00:11:46,644 | 10.140.21.54 | 22518
>  Bloom filter allows skipping sstable 1318 | 00:11:46,644 | 10.140.21.54 | 22528
>  Bloom filter allows skipping sstable 1317 | 00:11:46,644 | 10.140.21.54 | 22540
>  Bloom filter allows skipping sstable 1316 | 00:11:46,644 | 10.140.21.54 | 22552
>  Bloom filter allows skipping sstable 1315 | 00:11:46,644 | 10.140.21.54 | 22563
>  Bloom filter allows skipping sstable 1314 | 00:11:46,644 | 10.140.21.54 | 22572
>  Bloom filter allows skipping sstable 1313 | 00:11:46,644 | 10.140.21.54 | 22583
>  Bloom filter allows skipping sstable 1312 | 00:11:46,644 | 10.140.21.54 | 22594
>  Bloom filter allows skipping sstable 1311 | 00:11:46,644 | 10.140.21.54 | 22605
>  Bloom filter allows skipping sstable 1310 | 00:11:46,644 | 10.140.21.54 | 22616
>  Bloom filter allows skipping sstable 1309 | 00:11:46,644 | 10.140.21.54 | 22628
>  Bloom filter allows skipping sstable 1308 | 00:11:46,644 | 10.140.21.54 | 22640
>  Bloom filter allows skipping sstable 1307 | 00:11:46,644 | 10.140.21.54 | 22651
>  Bloom filter allows skipping sstable 1306 | 00:11:46,644 | 10.140.21.54 | 22663
>  Bloom filter allows skipping sstable 1305 | 00:11:46,644 | 10.140.21.54 | 22674
>  Bloom filter allows skipping sstable 1304 | 00:11:46,644 | 10.140.21.54 | 22684
>  Bloom filter allows skipping sstable 1303 | 00:11:46,644 | 10.140.21.54 | 22696
>  Bloom filter allows skipping sstable 1302 | 00:11:46,644 | 10.140.21.54 | 22707
>  Bloom filter allows skipping sstable 1301 | 00:11:46,644 | 10.140.21.54 | 22718
>  Bloom filter allows skipping sstable 1300 | 00:11:46,644 | 10.140.21.54 | 22729
>  Bloom filter allows skipping sstable 1299 | 00:11:46,644 | 10.140.21.54 | 22740
>  Bloom filter allows skipping sstable 1298 | 00:11:46,644 | 10.140.21.54 | 22752
>  Bloom filter allows skipping sstable 1297 | 00:11:46,644 | 10.140.21.54 | 22763
>  Bloom filter allows skipping sstable 1296 | 00:11:46,644 | 10.140.21.54 | 22774
>  Key cache hit for sstable 1295 | 00:11:46,644 | 10.140.21.54 | 22817
>  Seeking to partition beginning in data file | 00:11:46,644 | 10.140.21.54 | 22842
>  Skipped 0/89 non-slice-intersecting sstables, included 0 due to tombstones | 00:11:46,646 | 10.140.21.54 | 24109
>  Merging data from memtables and 1 sstables | 00:11:46,646 | 10.140.21.54 | 24238
>  Read 101 live and 0 tombstoned cells | 00:11:46,663 | 10.140.21.54 | 41389
>  Enqueuing response to /10.140.22.236 | 00:11:46,663 | 10.140.21.54 | 41831
>  Sending message to /10.140.22.236 | 00:11:46,664 | 10.140.21.54 | 41972
>  Message received from /10.140.21.54 | 00:11:46,671 | 10.140.22.236 | 59498
>  Processing response from /10.140.21.54 | 00:11:46,672 | 10.140.22.236 | 59563
>  Request complete | 00:11:46,704 | 10.140.22.236 | 92781
>
> Every query I did always had just three mentions of tombstones:
>   Merging memtable tombstones
>   Skipped 0/89 non-slice-intersecting sstables, included 0 due to tombstones
>   Read 101 live and 0 tombstoned cells
> And unless I misread those, none of them claims that there are any tombstones.
>
>
> On Dec 16, 2014, at 4:26 PM, Ryan Svihla <rsvi...@datastax.com> wrote:
>
>> Manual forced compactions create more problems than they solve. If you have
>> no evidence of tombstones in your selects (which seems odd; can you share
>> some of the tracing output?), then I'm not sure what it would solve for you.
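For anyone who wants to reproduce this kind of trace, a minimal cqlsh sketch follows; it assumes the media keyspace and reuses the partition key from the trace above, so substitute your own id:

    -- run inside cqlsh; '--' starts a CQL comment
    USE media;
    TRACING ON;

    SELECT * FROM media_tracks_raw
    WHERE id = 74fe9449-8ac4-accb-a723-4bad024101e3
    LIMIT 100;

    -- cqlsh prints the trace after the result set; the lines to watch are
    -- "Read N live and M tombstoned cells" and the sstables-skipped summary.
    TRACING OFF;

If the tombstoned-cell counts stay at or near zero, as they do above, tombstone scanning is unlikely to be the source of the load.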
>>
>> Compaction running could explain a high load; log messages with ERROR,
>> WARN, or GCInspector are all meaningful there. I suggest searching JIRA for
>> your version to see if there are any interesting bugs.
>>
>>
>> On Tue, Dec 16, 2014 at 6:14 PM, Arne Claassen <a...@emotient.com> wrote:
>> I just did a wide set of selects and ran across no tombstones. But while on
>> the subject of gc_grace_seconds, is there any reason, on a small cluster,
>> not to set it to something low like a single day? It seems like 10 days is
>> only needed for large clusters undergoing long partition splits, or am I
>> misunderstanding gc_grace_seconds?
>>
>> Now, given all that, does any of this explain a high load when the cluster
>> is idle? Is it compaction catching up, and would manual forced compaction
>> alleviate that?
>>
>> thanks,
>> arne
>>
>>
>> On Dec 16, 2014, at 3:28 PM, Ryan Svihla <rsvi...@datastax.com> wrote:
>>
>>> So a delete is really another write for gc_grace_seconds (default 10 days);
>>> if you get enough tombstones it can make managing your cluster a challenge
>>> as is. Open up cqlsh, turn on tracing, and try a few queries: how many
>>> tombstones are scanned for a given query? It's possible the heap problems
>>> you're seeing are actually happening on the query side and not on the
>>> ingest side. The severity of this depends on driver and Cassandra version,
>>> but older drivers and versions of Cassandra could easily overload the heap
>>> with expensive selects; layered over tombstones, it certainly becomes a
>>> possibility that this is your root cause.
>>>
>>> Now this will primarily create more load on compaction, and depending on
>>> your Cassandra version there may be some other issue at work, but something
>>> I can tell you is that every time I see 1 dropped mutation I see a cluster
>>> that was overloaded enough that it had to shed load. If I see 200k, I see a
>>> cluster/configuration/hardware that is badly overloaded.
>>>
>>> I suggest the following:
>>> - trace some of the queries used in prod
>>> - monitor your ingest rate and see at what levels you run into issues
>>>   (GCInspector log messages, dropped mutations, etc.)
>>> - the heap configuration we mentioned earlier: go ahead and monitor heap
>>>   usage; if it hits 75% repeatedly, that is an indication of heavy load
>>> - monitor dropped mutations: any dropped mutation is evidence of an
>>>   overloaded server; again, the root cause can be many other problems that
>>>   are solvable with current hardware, and LOTS of people run with nodes
>>>   with a similar configuration
>>>
>>> On Tue, Dec 16, 2014 at 5:08 PM, Arne Claassen <a...@emotient.com> wrote:
>>> Not using any secondary indexes, and memtable_flush_queue_size is the
>>> default 4.
>>>
>>> But let me tell you how data is "mutated" right now; maybe that will give
>>> you an insight into how this is happening.
>>>
>>> Basically the frame data table has the following primary key:
>>> PRIMARY KEY ((id), trackid, "timestamp")
>>>
>>> Generally data is inserted once, so day-to-day writes are all new rows.
>>> However, when our process for generating analytics for these rows changes,
>>> we run the media back through again, causing overwrites.
>>>
>>> Up until last night, this was just a new insert because the PK never
>>> changed, so it was always a 1-to-1 overwrite of every row.
>>>
>>> Last night was the first time that a change went in where the PK could
>>> actually change, so now the process is always: DELETE by partition key,
>>> insert all rows for the partition key, repeat (sketched below).
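A minimal sketch of that delete-then-reinsert cycle, against a simplified, hypothetical version of media_tracks_raw; only the primary key comes from the thread, and the extra column, the types, and the gc_grace_seconds value are assumptions for illustration:

    -- Hypothetical, simplified table; only PRIMARY KEY ((id), trackid, "timestamp")
    -- is taken from the discussion above.
    CREATE TABLE media.media_tracks_raw (
        id          uuid,
        trackid     uuid,
        "timestamp" bigint,
        payload     blob,   -- stand-in for the per-frame analytics columns
        PRIMARY KEY ((id), trackid, "timestamp")
    );

    -- The small-cluster gc_grace_seconds question above, expressed as DDL
    -- (1 day; only sensible if repairs and hinted handoff finish well within it).
    ALTER TABLE media.media_tracks_raw WITH gc_grace_seconds = 86400;

    -- Reprocessing pattern: drop the whole partition, then re-insert its rows.
    DELETE FROM media.media_tracks_raw
    WHERE id = 74fe9449-8ac4-accb-a723-4bad024101e3;

    INSERT INTO media.media_tracks_raw (id, trackid, "timestamp", payload)
    VALUES (74fe9449-8ac4-accb-a723-4bad024101e3,
            11111111-1111-1111-1111-111111111111, 0, 0x00);
    -- ...repeated for every row of the partition.

Worth noting, if I have the mechanics right: a whole-partition DELETE like this leaves a single partition-level tombstone rather than one tombstone per cell, which would be consistent with the traces above reporting 0 tombstoned cells.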
>>> We have two tables that have similar frame data projections, and some
>>> other aggregates with a much smaller row count per partition key.
>>>
>>> hope that helps,
>>> arne
>>>
>>> On Dec 16, 2014, at 2:46 PM, Ryan Svihla <rsvi...@datastax.com> wrote:
>>>
>>>> So you've got some blocked flush writers, but you have an incredibly large
>>>> number of dropped mutations. Are you using secondary indexes, and if so,
>>>> how many? What is your flush queue set to?
>>>>
>>>> On Tue, Dec 16, 2014 at 4:43 PM, Arne Claassen <a...@emotient.com> wrote:
>>>> Of course QA decided to start a test batch (still relatively low traffic),
>>>> so I hope it doesn't throw the tpstats off too much.
>>>>
>>>> Node 1:
>>>> Pool Name                    Active   Pending   Completed   Blocked   All time blocked
>>>> MutationStage                     0         0    13804928         0                  0
>>>> ReadStage                         0         0       10975         0                  0
>>>> RequestResponseStage              0         0     7725378         0                  0
>>>> ReadRepairStage                   0         0        1247         0                  0
>>>> ReplicateOnWriteStage             0         0           0         0                  0
>>>> MiscStage                         0         0           0         0                  0
>>>> HintedHandoff                     1         1          50         0                  0
>>>> FlushWriter                       0         0         306         0                 31
>>>> MemoryMeter                       0         0         719         0                  0
>>>> GossipStage                       0         0      286505         0                  0
>>>> CacheCleanupExecutor              0         0           0         0                  0
>>>> InternalResponseStage             0         0           0         0                  0
>>>> CompactionExecutor                4        14         159         0                  0
>>>> ValidationExecutor                0         0           0         0                  0
>>>> MigrationStage                    0         0           0         0                  0
>>>> commitlog_archiver                0         0           0         0                  0
>>>> AntiEntropyStage                  0         0           0         0                  0
>>>> PendingRangeCalculator            0         0          11         0                  0
>>>> MemtablePostFlusher               0         0        1781         0                  0
>>>>
>>>> Message type           Dropped
>>>> READ                         0
>>>> RANGE_SLICE                  0
>>>> _TRACE                       0
>>>> MUTATION                391041
>>>> COUNTER_MUTATION             0
>>>> BINARY                       0
>>>> REQUEST_RESPONSE             0
>>>> PAGED_RANGE                  0
>>>> READ_REPAIR                  0
>>>>
>>>> Node 2:
>>>> Pool Name                    Active   Pending   Completed   Blocked   All time blocked
>>>> MutationStage                     0         0      997042         0                  0
>>>> ReadStage                         0         0        2623         0                  0
>>>> RequestResponseStage              0         0      706650         0                  0
>>>> ReadRepairStage                   0         0         275         0                  0
>>>> ReplicateOnWriteStage             0         0           0         0                  0
>>>> MiscStage                         0         0           0         0                  0
>>>> HintedHandoff                     2         2          12         0                  0
>>>> FlushWriter                       0         0          37         0                  4
>>>> MemoryMeter                       0         0          70         0                  0
>>>> GossipStage                       0         0       14927         0                  0
>>>> CacheCleanupExecutor              0         0           0         0                  0
>>>> InternalResponseStage             0         0           0         0                  0
>>>> CompactionExecutor                4         7          94         0                  0
>>>> ValidationExecutor                0         0           0         0                  0
>>>> MigrationStage                    0         0           0         0                  0
>>>> commitlog_archiver                0         0           0         0                  0
>>>> AntiEntropyStage                  0         0           0         0                  0
>>>> PendingRangeCalculator            0         0           3         0                  0
>>>> MemtablePostFlusher               0         0         114         0                  0
>>>>
>>>> Message type           Dropped
>>>> READ                         0
>>>> RANGE_SLICE                  0
>>>> _TRACE                       0
>>>> MUTATION                     0
>>>> COUNTER_MUTATION             0
>>>> BINARY                       0
>>>> REQUEST_RESPONSE             0
>>>> PAGED_RANGE                  0
>>>> READ_REPAIR                  0
>>>>
>>>> Node 3:
>>>> Pool Name                    Active   Pending   Completed   Blocked   All time blocked
>>>> MutationStage                     0         0     1539324         0                  0
>>>> ReadStage                         0         0        2571         0                  0
>>>> RequestResponseStage              0         0      373300         0                  0
>>>> ReadRepairStage                   0         0         325         0                  0
>>>> ReplicateOnWriteStage             0         0           0         0                  0
>>>> MiscStage                         0         0           0         0                  0
>>>> HintedHandoff                     1         1          21         0                  0
>>>> FlushWriter                       0         0          38         0                  5
>>>> MemoryMeter                       0         0          59         0                  0
>>>> GossipStage                       0         0       21491         0                  0
>>>> CacheCleanupExecutor              0         0           0         0                  0
>>>> InternalResponseStage             0         0           0         0                  0
>>>> CompactionExecutor                4         9          85         0                  0
>>>> ValidationExecutor                0         0           0         0                  0
>>>> MigrationStage                    0         0           0         0                  0
>>>> commitlog_archiver                0         0           0         0                  0
>>>> AntiEntropyStage                  0         0           0         0                  0
>>>> PendingRangeCalculator            0         0           6         0                  0
>>>> MemtablePostFlusher               0         0         164         0                  0
>>>>
>>>> Message type           Dropped
>>>> READ                         0
>>>> RANGE_SLICE                  0
>>>> _TRACE                       0
>>>> MUTATION                205259
>>>> COUNTER_MUTATION             0
>>>> BINARY                       0
>>>> REQUEST_RESPONSE             0
>>>> PAGED_RANGE                  0
>>>> READ_REPAIR                 18
>>>>
>>>> Compaction seems like the only thing consistently active and pending.
>>>>
>>>> On Tue, Dec 16, 2014 at 2:18 PM, Ryan Svihla <rsvi...@datastax.com> wrote:
>>>> Ok, based on those numbers I have a theory..
>>>>
>>>> can you show me nodetool tpstats for all 3 nodes?
>>>>
>>>> On Tue, Dec 16, 2014 at 4:04 PM, Arne Claassen <a...@emotient.com> wrote:
>>>> No problem with the follow-up questions. I'm on a crash course here trying
>>>> to understand what makes C* tick, so I appreciate all feedback.
>>>>
>>>> We reprocessed all media (1200 partition keys) last night, where partition
>>>> keys had somewhere between 4k and 200k "rows". After that completed, no
>>>> traffic went to the cluster at all for ~8 hours, and throughout today we
>>>> may get a couple (fewer than 10) queries per second and maybe 3-4 write
>>>> batches per hour.
>>>>
>>>> I assume the last value in the Partition Size histogram is the largest row:
>>>>
>>>> 20924300 bytes: 79
>>>> 25109160 bytes: 57
>>>>
>>>> The majority seems clustered around 200000 bytes.
>>>>
>>>> I will look at switching my inserts to unlogged batches since they are
>>>> always for one partition key (a sketch of that is below).
>>>>
>>>> On Tue, Dec 16, 2014 at 1:47 PM, Ryan Svihla <rsvi...@datastax.com> wrote:
>>>> Can you define what "virtually no traffic" is? Sorry to be repetitive
>>>> about that, but I've worked on a lot of clusters in the past year and
>>>> people have wildly different ideas of what that means.
>>>>
>>>> Unlogged batches for the same partition key are definitely a performance
>>>> optimization. Typically async is much faster and easier on the cluster
>>>> when you're using multi-partition-key batches.
>>>>
>>>> nodetool cfhistograms <keyspace> <tablename>
>>>>
>>>> On Tue, Dec 16, 2014 at 3:42 PM, Arne Claassen <a...@emotient.com> wrote:
>>>> Actually, I'm not sure why the machine was originally configured at 6GB,
>>>> since we even started it on an r3.large with 15GB.
>>>>
>>>> Re: Batches
>>>>
>>>> Not using batches. I actually have that as a separate question on the
>>>> list. Currently I fan out async single inserts, and I'm wondering if
>>>> batches are better since my data is inherently inserted in blocks of
>>>> ordered rows for a single partition key.
>>>>
>>>> Re: Traffic
>>>>
>>>> There isn't all that much traffic. Inserts come in as blocks per partition
>>>> key, but those can be 5k-200k rows for that partition key. Each of these
>>>> rows is less than 100k. It's small, lots of ordered rows: frame and
>>>> sub-frame information for media, and the rows for one piece of media (one
>>>> partition key) are inserted at once.
>>>>
>>>> For the last 12 hours, while the load on all these machines has been
>>>> stuck, there's been virtually no traffic at all. The nodes are basically
>>>> sitting idle, except that they have a load of 4 each.
>>>>
>>>> BTW, how do you determine the widest row or, for that matter, the number
>>>> of tombstones in a row?
>>>>
>>>> thanks,
>>>> arne
>>>>
>>>> On Tue, Dec 16, 2014 at 1:24 PM, Ryan Svihla <rsvi...@datastax.com> wrote:
>>>> So 1024 is still a good 2.5 times what I'm suggesting, and 6GB is hardly
>>>> enough to run Cassandra well, especially if you're going full bore on
>>>> loads. However, you may just flat out be CPU bound on your write
>>>> throughput. How many TPS and what size writes do you have? Also, what is
>>>> your widest row?
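Since single-partition batches come up a couple of times above, here is a minimal CQL sketch of what that could look like; it reuses the hypothetical columns from the earlier sketch, so everything beyond the primary key is an assumption:

    -- Every statement targets the same partition (same id), so an UNLOGGED
    -- batch is applied as one mutation for that partition instead of many
    -- separate round trips.
    BEGIN UNLOGGED BATCH
      INSERT INTO media.media_tracks_raw (id, trackid, "timestamp", payload)
      VALUES (74fe9449-8ac4-accb-a723-4bad024101e3,
              11111111-1111-1111-1111-111111111111, 1, 0x00);
      INSERT INTO media.media_tracks_raw (id, trackid, "timestamp", payload)
      VALUES (74fe9449-8ac4-accb-a723-4bad024101e3,
              11111111-1111-1111-1111-111111111111, 2, 0x01);
      -- ...more rows for the same id...
    APPLY BATCH;

As noted above, this only helps when every statement in the batch shares the partition key; rows spread across many partitions are usually better served by fanned-out async single inserts.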
>>>> Final question: what is compaction throughput at?
>>>>
>>>> On Tue, Dec 16, 2014 at 3:20 PM, Arne Claassen <a...@emotient.com> wrote:
>>>> The starting configuration I had, which is still running on two of the
>>>> nodes, was a 6GB heap with 1024MB ParNew, which is close to what you are
>>>> suggesting, and those have been pegged at load 4 for over 12 hours with
>>>> hardly any read or write traffic. I will set one to 8GB/400MB and see if
>>>> its load changes.
>>>>
>>>> On Tue, Dec 16, 2014 at 1:12 PM, Ryan Svihla <rsvi...@datastax.com> wrote:
>>>> So a heap of that size without some tuning will create a number of
>>>> problems (high CPU usage being one of them). I suggest either an 8GB heap
>>>> and 400MB ParNew (which I'd only set that low for that low a CPU count),
>>>> sketched below, or attempting the tunings indicated in
>>>> https://issues.apache.org/jira/browse/CASSANDRA-8150
>>>>
>>>> On Tue, Dec 16, 2014 at 3:06 PM, Arne Claassen <a...@emotient.com> wrote:
>>>> Changed the 15GB node to a 25GB heap and the nice CPU is down to ~20% now.
>>>> Checked my dev cluster to see if the ParNew log entries are just par for
>>>> the course, but I'm not seeing them there. However, both have the
>>>> following every 30 seconds:
>>>>
>>>> DEBUG [BatchlogTasks:1] 2014-12-16 21:00:44,898 BatchlogManager.java (line 165) Started replayAllFailedBatches
>>>> DEBUG [MemtablePostFlusher:1] 2014-12-16 21:00:44,899 ColumnFamilyStore.java (line 866) forceFlush requested but everything is clean in batchlog
>>>> DEBUG [BatchlogTasks:1] 2014-12-16 21:00:44,899 BatchlogManager.java (line 200) Finished replayAllFailedBatches
>>>>
>>>> Is that just routine scheduled house-keeping or a sign of something else?
>>>>
>>>> On Tue, Dec 16, 2014 at 12:52 PM, Arne Claassen <a...@emotient.com> wrote:
>>>> Sorry, I meant a 15GB heap on the one machine that has less nice CPU% now.
>>>> The others are 6GB.
>>>>
>>>> On Tue, Dec 16, 2014 at 12:50 PM, Arne Claassen <a...@emotient.com> wrote:
>>>> AWS r3.xlarge, 30GB RAM, but only using a heap of 10GB and new gen of 2GB,
>>>> because we might go c3.2xlarge instead if CPU is more important than RAM.
>>>> Storage is EBS-optimized SSD (but iostat shows no real I/O going on).
>>>> Each node only has about 10GB of data, with ownership of 67%, 64.7% & 68.3%.
>>>>
>>>> On the node where I raised the heap from 6GB to 10GB, the utilization has
>>>> dropped to 46% nice now, but the ParNew log messages still continue at the
>>>> same pace. I'm going to up the heap to 20GB for a bit and see if that
>>>> brings the nice CPU further down.
>>>>
>>>> No TombstoneOverflowingExceptions.
>>>>
>>>> On Tue, Dec 16, 2014 at 11:50 AM, Ryan Svihla <rsvi...@datastax.com> wrote:
>>>> What's CPU, RAM, storage layer, and data density per node? Exact heap
>>>> settings would be nice. In the logs, look for TombstoneOverflowingException.
>>>>
>>>> On Tue, Dec 16, 2014 at 1:36 PM, Arne Claassen <a...@emotient.com> wrote:
>>>> I'm running 2.0.10.
>>>>
>>>> The data is all time series data, and as we change our pipeline, we've
>>>> periodically been reprocessing the data sources, which causes each time
>>>> series to be overwritten, i.e. every row per partition key is deleted and
>>>> re-written, so I assume I've been collecting a bunch of tombstones.
>>>>
>>>> Also, I assumed the ever-present and never-completing compactions were an
>>>> artifact of tombstoning, but I fully admit that's conjecture based on the
>>>> ~20 blog posts and Stack Overflow questions I've surveyed.
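For reference, the 8GB heap / 400MB new-gen suggestion above maps onto two variables in conf/cassandra-env.sh; a minimal sketch, with values taken from the suggestion rather than from any tested configuration:

    # conf/cassandra-env.sh: override the auto-calculated sizes, then restart the node.
    # MAX_HEAP_SIZE is the total JVM heap (-Xms/-Xmx), HEAP_NEWSIZE the young gen (-Xmn).
    MAX_HEAP_SIZE="8G"
    HEAP_NEWSIZE="400M"

Note that the stock script expects these two to be set (or left unset) as a pair.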
>>>> I doubled the heap on one node and it changed nothing regarding the load
>>>> or the ParNew log statements. New generation usage is 50%; Eden itself is
>>>> 56%.
>>>>
>>>> Anything else I should look at and report, let me know.
>>>>
>>>> On Tue, Dec 16, 2014 at 11:14 AM, Jonathan Lacefield
>>>> <jlacefi...@datastax.com> wrote:
>>>> Hello,
>>>>
>>>> What version of Cassandra are you running?
>>>>
>>>> If it's 2.0, we recently experienced something similar with 8447 [1],
>>>> which 8485 [2] should hopefully resolve.
>>>>
>>>> Please note that 8447 is not related to tombstones. Tombstone processing
>>>> can put a lot of pressure on the heap as well. Why do you think you have a
>>>> lot of tombstones in that one particular table?
>>>>
>>>> [1] https://issues.apache.org/jira/browse/CASSANDRA-8447
>>>> [2] https://issues.apache.org/jira/browse/CASSANDRA-8485
>>>>
>>>> Jonathan
>>>>
>>>> Jonathan Lacefield
>>>> Solution Architect | (404) 822 3487 | jlacefi...@datastax.com
>>>>
>>>> On Tue, Dec 16, 2014 at 2:04 PM, Arne Claassen <a...@emotient.com> wrote:
>>>> I have a three node cluster that has been sitting at a load of 4 (for each
>>>> node) and 100% CPU utilization (although 92% nice) for the last 12 hours,
>>>> ever since some significant writes finished. I'm trying to determine what
>>>> tuning I should be doing to get it out of this state. The debug log is
>>>> just an endless series of:
>>>>
>>>> DEBUG [ScheduledTasks:1] 2014-12-16 19:03:35,042 GCInspector.java (line 118) GC for ParNew: 166 ms for 10 collections, 4400928736 used; max is 8000634880
>>>> DEBUG [ScheduledTasks:1] 2014-12-16 19:03:36,043 GCInspector.java (line 118) GC for ParNew: 165 ms for 10 collections, 4440011176 used; max is 8000634880
>>>> DEBUG [ScheduledTasks:1] 2014-12-16 19:03:37,043 GCInspector.java (line 118) GC for ParNew: 135 ms for 8 collections, 4402220568 used; max is 8000634880
>>>>
>>>> iostat shows virtually no I/O.
>>>>
>>>> Compaction may enter into this, but I don't really know what to make of
>>>> the compaction stats since they never change:
>>>>
>>>> [root@cassandra-37919c3a ~]# nodetool compactionstats
>>>> pending tasks: 10
>>>>   compaction type   keyspace              table     completed         total   unit   progress
>>>>        Compaction      media   media_tracks_raw     271651482     563615497  bytes     48.20%
>>>>        Compaction      media   media_tracks_raw      30308910   21676695677  bytes      0.14%
>>>>        Compaction      media   media_tracks_raw    1198384080    1815603161  bytes     66.00%
>>>> Active compaction remaining time : 0h22m24s
>>>>
>>>> 5 minutes later:
>>>>
>>>> [root@cassandra-37919c3a ~]# nodetool compactionstats
>>>> pending tasks: 9
>>>>   compaction type   keyspace              table     completed         total   unit   progress
>>>>        Compaction      media   media_tracks_raw     271651482     563615497  bytes     48.20%
>>>>        Compaction      media   media_tracks_raw      30308910   21676695677  bytes      0.14%
>>>>        Compaction      media   media_tracks_raw    1198384080    1815603161  bytes     66.00%
>>>> Active compaction remaining time : 0h22m24s
>>>>
>>>> Sure, the pending tasks went down by one, but the rest is identical.
>>>> media_tracks_raw likely has a bunch of tombstones (I can't figure out how
>>>> to get stats on that).
>>>>
>>>> Is this behavior something that indicates that I need a larger heap or a
>>>> larger new generation? Should I be manually running compaction on tables
>>>> with lots of tombstones?
>>>>
>>>> Any suggestions or places to educate myself better on performance tuning
>>>> would be appreciated.
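One knob that bears directly on the never-moving compactions above is the compaction throughput cap; a hedged sketch of checking and loosening it, with illustrative values only:

    # Watch the backlog; pending tasks should trend down once throughput is raised.
    nodetool compactionstats

    # Temporarily lift the throttle (0 = unthrottled) until the next restart;
    # pick a finite cap for steady-state production.
    nodetool setcompactionthroughput 0

    # Or persistently, in cassandra.yaml (16 MB/s is the usual default):
    # compaction_throughput_mb_per_sec: 32

Whether the throttle is actually the bottleneck here is an open question in the thread; with iostat showing almost no I/O, CPU and GC pressure are just as plausible.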
>>>> arne
>>
>> --
>> Ryan Svihla
>> Solution Architect, DataStax