You probably have a very large partition in that table. `nodetool cfstats` will show you the largest compacted partition now - I suspect it's much higher than before.
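For example, assuming the keyspace and table named in the quoted thread below (notification_system_v1.user_notification, taken from the compactionstats output), a quick check might look like this:

    # Print per-table stats and pull out the largest compacted partition
    # (reported in bytes)
    nodetool cfstats notification_system_v1.user_notification | grep -i 'Compacted partition maximum'

The cfstats output quoted below already shows "Compacted partition maximum bytes: 4966933177" - roughly 4.6 GB in a single partition against a mean of 1183 bytes, which fits one enormous partition dominating the compaction. Sketches of the stop-compaction and thread-dump commands discussed in the thread follow after the quoted messages.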
On Thu, Jul 5, 2018 at 9:50 PM, atul atri <atulatri2...@gmail.com> wrote:
> Hi Chris,
>
> The compaction process finally finished. It took a long time, though.
>
> Thank you very much for all your help.
>
> Please let me know if you have any guidelines for making future
> compaction processes faster.
>
> Thanks & Regards,
> Atul Atri.
>
> On 5 July 2018 at 22:05, atul atri <atulatri2...@gmail.com> wrote:
>> Hi Chris,
>>
>> Thank you for the reply.
>>
>> I have already tried "nodetool stop compaction" and it does not help.
>> I have restarted each node in the cluster one by one and compaction
>> starts again. It gets stuck on the same table.
>>
>> The following is the 'nodetool compactionstats' output. It has been
>> stuck at 1336035468 for more than 35 hours at least:
>>
>> pending tasks: 1
>>    compaction type   keyspace                 table               completed    total        unit    progress
>>    Compaction        notification_system_v1   user_notification   1336035468   1660997721   bytes   80.44%
>> Active compaction remaining time: 0h00m38s
>>
>> The following is the output of "nodetool cfstats":
>>
>> Table: user_notification
>> SSTable count: 18
>> Space used (live), bytes: 17247516201
>> Space used (total), bytes: 17316488652
>> SSTable Compression Ratio: 0.41922805938461566
>> Number of keys (estimate): 32556160
>> Memtable cell count: 44717
>> Memtable data size, bytes: 27705294
>> Memtable switch count: 5
>> Local read count: 0
>> Local read latency: 0.000 ms
>> Local write count: 236961
>> Local write latency: 0.047 ms
>> Pending tasks: 0
>> Bloom filter false positives: 0
>> Bloom filter false ratio: 0.00000
>> Bloom filter space used, bytes: 72414688
>> Compacted partition minimum bytes: 104
>> Compacted partition maximum bytes: 4966933177
>> Compacted partition mean bytes: 1183
>> Average live cells per slice (last five minutes): 0.0
>> Average tombstones per slice (last five minutes): 0.0
>>
>> Please let me know if you need any more information. I am really
>> thankful to you for spending time on this investigation.
>>
>> Thanks & Regards,
>> Atul Atri.
>>
>> On 5 July 2018 at 20:54, Chris Lohfink <clohf...@apple.com> wrote:
>>> That looks to me like it isn't stuck but is just a long-running
>>> compaction. Can you include the output of `nodetool compactionstats`
>>> and `nodetool cfstats`, with the schema for the table that's being
>>> compacted (redact names if necessary)?
>>>
>>> You can stop compaction with `nodetool stop COMPACTION` or by
>>> restarting the node.
>>>
>>> Chris
>>>
>>> On Jul 5, 2018, at 12:08 AM, atul atri <atulatri2...@gmail.com> wrote:
>>>
>>> Hi,
>>>
>>> We noticed that the compaction process is also hanging on a node in
>>> the backup ring. Please find attached thread dumps for both servers.
>>> Recently, we have made a few changes in the cluster topology:
>>>
>>> a. Added a new server in the backup data-center and decommissioned an
>>> old server. The backup ring only has 2 servers.
>>> b. Added a new node in the primary data-center. Now it has 4 nodes.
>>>
>>> Is there a way we can stop this compaction? We have added a new node
>>> to this cluster and are waiting to run cleanup on the node on which
>>> compaction is hanging. I am afraid that cleanup will not start until
>>> the compaction job finishes.
>>>
>>> Attachments:
>>> 1. cass-logg02.prod2.thread_dump.out: Thread dump from the old node in
>>> the primary datacenter
>>> 2. cass-logg03.prod1.thread_dump.out: Thread dump from the new node in
>>> the backup datacenter. This node was added recently.
>>>
>>> Your help is much appreciated.
>>>
>>> Thanks & Regards,
>>> Atul Atri.
>>> On 4 July 2018 at 21:15, atul atri <atulatri2...@gmail.com> wrote:
>>>> Hi Chris,
>>>> Thanks for the reply.
>>>>
>>>> Unfortunately, our servers do not have jstack installed.
>>>> I tried the "kill -3 <PID>" option, but that is also not generating
>>>> a thread dump.
>>>>
>>>> Is there any other way I can generate a thread dump?
>>>>
>>>> Thanks & Regards,
>>>> Atul Atri.
>>>>
>>>> On 4 July 2018 at 20:32, Chris Lohfink <clohf...@apple.com> wrote:
>>>>> Can you take a thread dump (jstack) and share the state of the
>>>>> compaction threads? Also check for "Exception" in the logs.
>>>>>
>>>>> Chris
>>>>>
>>>>> Sent from my iPhone
>>>>>
>>>>> On Jul 4, 2018, at 8:37 AM, atul atri <atulatri2...@gmail.com> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> On one of our servers, the compaction process is hanging. It's stuck
>>>>> at 80%, and has been for the last 3 days. Today we did a cluster
>>>>> restart (one host at a time), and again it is stuck at the same 80%.
>>>>> CPU usage is 100% and there seems to be no IO issue. We are seeing
>>>>> the following kind of WARNING in system.log:
>>>>>
>>>>> BatchStatement.java (line 226) Batch of prepared statements for
>>>>> [****, *****] is of size 7557, exceeding specified threshold of 5120
>>>>> by 2437.
>>>>>
>>>>> Other than this there seems to be no error. I have tried to stop the
>>>>> compaction process, but it does not stop. The Cassandra version is 2.1.
>>>>>
>>>>> Can someone please guide us in solving this issue?
>>>>>
>>>>> Thanks & Regards,
>>>>> Atul Atri.
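For reference, a sketch of the commands discussed in the thread for stopping a running compaction and for speeding up future ones on 2.1; treat the throughput setting as an illustrative assumption rather than a tested recommendation for this cluster:

    # Ask Cassandra to abort the in-flight compaction (2.1 syntax)
    nodetool stop COMPACTION

    # Compaction speed is capped by compaction_throughput_mb_per_sec;
    # check the current cap, then raise it (0 disables throttling)
    nodetool getcompactionthroughput
    nodetool setcompactionthroughput 0

As the thread shows, the stop request may not take effect while a single huge partition is mid-rewrite, and unthrottling trades compaction speed against read/write latency on the node.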
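On the thread-dump question: kill -3 sends SIGQUIT, and the JVM writes the dump to its own standard output rather than to a new file, which is the usual reason it appears to produce nothing - the dump lands in Cassandra's stdout/console log. A sketch, assuming the process can be found by its main class name:

    # SIGQUIT makes the JVM print a thread dump to its stdout; check
    # Cassandra's stdout/console log afterwards, not the filesystem
    kill -3 "$(pgrep -f CassandraDaemon)"

    # jstack ships with the JDK, not the JRE; if a JDK is present, it can
    # be run from the JDK's bin directory (the path here is an assumption):
    # /usr/lib/jvm/<your-jdk>/bin/jstack <PID> > thread_dump.out

Separately, the BatchStatement warning quoted above is governed by batch_size_warn_threshold_in_kb in cassandra.yaml (default 5 KB, hence the 5120-byte threshold); it is only a warning and is most likely unrelated to the stuck compaction.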