Hi Petrus,

It seems we've solved the problem, but it wasn't related to repairing the cluster or to disk latency. I increased the memory available to Cassandra from 16GB to 24GB and performance improved considerably! The main symptom we observed in OpsCenter was a significant decrease in the total compactions graph.
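For reference, the change itself was only the heap setting. A rough sketch, assuming the stock cassandra-env.sh that ships with the 2.1 packages (the exact path, values and surrounding GC flags are assumptions and will vary per install):

    # /etc/cassandra/cassandra-env.sh  -- path is an assumption, varies by packaging
    # Raise the JVM heap from 16G to 24G; takes effect after a (rolling) restart of each node.
    MAX_HEAP_SIZE="24G"
    # Note: the stock 2.1 script expects MAX_HEAP_SIZE and HEAP_NEWSIZE to be set
    # (or left unset) together, so keep them consistent with your GC configuration.

    # After the restart, the new heap can be confirmed per node with:
    nodetool info | grep -i heap

(Staying below roughly 32GB of heap also keeps the JVM on compressed object pointers, which is generally recommended.)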
Felipe Esteves
Tecnologia
felipe.este...@b2wdigital.com


2017-07-15 3:23 GMT-03:00 Petrus Gomes <petru...@gmail.com>:

> Hi Felipe,
>
> Yes, try it and let us know how it goes.
>
> Thanks,
> Petrus Silva.
>
> On Fri, Jul 14, 2017 at 11:37 AM, Felipe Esteves <felipe.este...@b2wdigital.com> wrote:
>
>> Hi Petrus, thanks for the feedback.
>>
>> I couldn't find the percent repaired in nodetool info; the C* version is 2.1.8, maybe that's from something newer?
>>
>> I'm analyzing that thread about num_token.
>>
>> Compaction is "compaction_throughput_mb_per_sec: 16", and I don't see pending compactions in OpsCenter.
>>
>> One thing I've noticed is that OpsCenter shows high "OS: Disk Latency" max values when the problem occurs, but that isn't reflected when monitoring the servers directly; in those tools the disk IO and latency look ok. It does seem to me that "read repair attempted" is a bit high, which might explain the read latency. I will try to run a repair on the cluster to see how it goes.
>>
>> Felipe Esteves
>> Tecnologia
>> felipe.este...@b2wdigital.com
>>
>> 2017-07-13 15:02 GMT-03:00 Petrus Gomes <petru...@gmail.com>:
>>
>>> What is your Percent Repaired when you run "nodetool info"?
>>>
>>> Search for the "reduced num_token = improved performance ??" topic; people were discussing that there.
>>>
>>> How is your compaction configured?
>>>
>>> Could you run the same process from the command line to get a measurement?
>>>
>>> Thanks,
>>> Petrus Silva
>>>
>>> On Thu, Jul 13, 2017 at 7:49 AM, Felipe Esteves <felipe.este...@b2wdigital.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I have a Cassandra 2.1 cluster running on AWS that receives high read loads, jumping from 100k requests to 400k requests, for example. Then it normalizes, and later comes another burst of high throughput.
>>>>
>>>> To the application, it appears that Cassandra is slow. However, CPU and disk use are ok on every instance, and the row cache is enabled with an almost 100% hit rate.
>>>>
>>>> The logs from the Cassandra instances don't have any errors, tombstone messages or anything like that. It's mostly compactions and G1GC operations.
>>>>
>>>> Any hints on where to investigate further?
>>>>
>>>> Felipe Esteves
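P.S. For anyone finding this thread in the archives: the command-line checks discussed above, which we used to cross-check what OpsCenter reported, were roughly the following (a sketch, assuming Cassandra 2.1 nodetool; <keyspace> is a placeholder for your own keyspace):

    # pending/active compactions on a node
    nodetool compactionstats

    # thread-pool backlogs and dropped reads per node
    nodetool tpstats

    # per-table read latency and row cache hit rate
    nodetool cfstats <keyspace>

    # the repair we considered, limited to each node's primary range
    nodetool repair -pr <keyspace>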