Thanks Zhao Yang,

> Could you try some JVM tool to find out which threads are allocating
> memory or triggering GC? Maybe the migration stage thread..
I used Cassandra Cluster Manager to reproduce the issue locally. I tried
VisualVM to find out which threads are allocating memory, but VisualVM does
not see the Cassandra processes and says "Cannot open application with pid".
Then I tried YourKit Java Profiler; it created a snapshot when the process
of one Cassandra node failed:

http://i.imgur.com/9jBcjcl.png - how CPU is used by threads.
http://i.imgur.com/ox5Sozy.png - how memory is used by threads, but the
biggest part of the memory belongs to objects without allocation information.
http://i.imgur.com/oqx9crX.png - which objects use the biggest part of the
memory.

Maybe you know some other good JVM tool that can show which threads are
using the biggest part of the memory?

> BTW, is your cluster under high load while dropping table?

The 5-minute load average (LA5) was <= 5 on all nodes almost all of the
time while dropping tables.

Thanks

2017-04-21 19:49 GMT+03:00 Jasonstack Zhao Yang <zhaoyangsingap...@gmail.com >:

> Hi Bohdan, Carlos,
>
> Could you try some JVM tool to find out which threads are allocating
> memory or triggering GC? Maybe the migration stage thread..
>
> BTW, is your cluster under high load while dropping table?
>
> As far as I remember, in older C* versions it applies the schema mutation
> in memory, i.e. the DROP, then flushes all schema info into an sstable,
> then reads all of the on-disk schema back into memory (5k tables' info +
> related column info)..
>
> > You also might need to increase the node count if you're resource
> > constrained.
>
> More nodes won't help and will most probably make it worse due to
> coordination.
>
> Zhao Yang
>
> On Fri, 21 Apr 2017 at 21:10 Bohdan Tantsiura <bohdan...@gmail.com> wrote:
>
>> Hi,
>>
>> The problem is still not solved. Does anybody have any idea what to do
>> with it?
>>
>> Thanks
>>
>> 2017-04-20 15:05 GMT+03:00 Bohdan Tantsiura <bohdan...@gmail.com>:
>>
>>> Thanks Carlos,
>>>
>>> In each keyspace we also have 11 MVs.
>>>
>>> It is impossible to reduce the number of tables now. Long GC pauses
>>> take about one minute.
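Zhao Yang's description of the schema path (every schema mutation flushes and then rereads the entire on-disk schema) would explain why each successive keyspace takes longer. A back-of-envelope sketch of the cumulative cost, using the counts from this thread (60 keyspaces, 80 tables + 11 MVs each, 5 objects dropped per keyspace); the reread-everything-per-DROP model is an assumption based on his summary, not Cassandra's actual code path:

```java
// Sketch: estimate how many schema objects get reread if every DROP
// triggers a full reload of the remaining schema, per the thread's
// description of older Cassandra versions. The model is an assumption;
// the input numbers come from the thread itself.
public class SchemaReloadCost {
    static long totalObjectsReread(int keyspaces, int tablesPerKs,
                                   int mvsPerKs, int dropsPerKs) {
        long remaining = (long) keyspaces * (tablesPerKs + mvsPerKs);
        long reread = 0;
        for (int ks = 0; ks < keyspaces; ks++) {
            for (int d = 0; d < dropsPerKs; d++) {
                remaining--;          // one DROP removes one schema object...
                reread += remaining;  // ...then the whole remaining schema is reread
            }
        }
        return reread;
    }

    public static void main(String[] args) {
        long n = totalObjectsReread(60, 80, 11, 5);
        System.out.println("schema objects reread across all drops: " + n);
    }
}
```

With these numbers that comes to roughly 1.6 million schema-object reloads across the 300 drops, which fits the observation that each keyspace's drops get progressively slower.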
>>> But why does it take so much time, and how can that be fixed?
>>>
>>> Each node in the cluster has 128GB RAM, so resources are not
>>> constrained now.
>>>
>>> Thanks
>>>
>>> 2017-04-20 13:18 GMT+03:00 Carlos Rolo <r...@pythian.com>:
>>>
>>>> You have 4800 tables in total? That is a lot of tables. Plus MVs, or
>>>> are the MVs already counted in the 60*80 total?
>>>>
>>>> I would recommend reducing the number of tables. Another thing: you
>>>> need to check your log file for GC pauses, and how long those pauses
>>>> take.
>>>>
>>>> You also might need to increase the node count if you're resource
>>>> constrained.
>>>>
>>>> Regards,
>>>>
>>>> Carlos Juzarte Rolo
>>>> Cassandra Consultant / Datastax Certified Architect / Cassandra MVP
>>>>
>>>> Pythian - Love your data
>>>>
>>>> rolo@pythian | Twitter: @cjrolo | Skype: cjr2k3 | Linkedin:
>>>> linkedin.com/in/carlosjuzarterolo
>>>> Mobile: +351 918 918 100
>>>> www.pythian.com
>>>>
>>>> On Thu, Apr 20, 2017 at 11:10 AM, Bohdan Tantsiura
>>>> <bohdan...@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> We are using Cassandra 3.10 in a 10-node cluster with replication
>>>>> factor 3. MAX_HEAP_SIZE=64GB on all nodes, and the G1 GC is used. We
>>>>> have about 60 keyspaces with about 80 tables in each keyspace. We had
>>>>> to delete three tables and two materialized views from each keyspace.
>>>>> It began to take more and more time for each next keyspace (for some
>>>>> keyspaces it took about 30 minutes) and then failed with "Cannot
>>>>> achieve consistency level ALL". After restarting, the same repeated.
>>>>> It seems that Cassandra hangs on GC. How can that be solved?
>>>>>
>>>>> Thanks
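On the recurring question in this thread of which threads are allocating memory: besides external profilers like VisualVM and YourKit, HotSpot exposes per-thread allocation counters directly through `com.sun.management.ThreadMXBean`, which works even when an attach-based profiler cannot see the process. A minimal self-contained sketch; the worker thread name is illustrative, not an actual Cassandra thread:

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.CountDownLatch;

public class AllocByThread {
    // Measures bytes allocated by a deliberately noisy worker thread using
    // HotSpot's per-thread allocation counters (com.sun.management.ThreadMXBean).
    static long sampleWorkerAllocation() throws InterruptedException {
        com.sun.management.ThreadMXBean mx =
                (com.sun.management.ThreadMXBean) ManagementFactory.getThreadMXBean();
        if (!mx.isThreadAllocatedMemorySupported()) return -1; // non-HotSpot JVMs
        mx.setThreadAllocatedMemoryEnabled(true);

        CountDownLatch allocated = new CountDownLatch(1);
        CountDownLatch sampled = new CountDownLatch(1);
        Thread worker = new Thread(() -> {
            byte[][] junk = new byte[512][];
            for (int i = 0; i < junk.length; i++) {
                junk[i] = new byte[64 * 1024]; // ~32 MB total
            }
            allocated.countDown();
            try { sampled.await(); } catch (InterruptedException ignored) { }
        }, "MigrationStage-sim"); // illustrative name only
        worker.start();
        allocated.await();

        // The counter is only reported for live threads, so sample before
        // letting the worker exit.
        long bytes = mx.getThreadAllocatedBytes(worker.getId());
        sampled.countDown();
        worker.join();
        return bytes;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.printf("worker allocated ~%d MB%n", sampleWorkerAllocation() >> 20);
    }
}
```

In a real cluster one would iterate `getAllThreadIds()` and diff `getThreadAllocatedBytes` over a sampling window to rank threads such as the migration stage; this is the same counter that tools like async-profiler's allocation mode build on.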