Hi Oskar,

Thanks for the response.

Yes, I can see a lot of compaction threads. We are loading around 400 GB
of data per node on a 3-node Cassandra cluster.
Throttling was set to write around 7k TPS per node. The job ran fine for
2 days, and then we started getting mutation drops, longer GC pauses, and
very high load on the system.

System log reports:
Enqueuing flush of compactions_in_progress: 1156 (0%) on-heap, 1132 (0%)
off-heap

The job was stopped 12 hours ago, but these failures can still be seen.
Could you please let me know how I should proceed? If possible, please
suggest some parameters for write-intensive jobs.
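
In case it helps to quantify the problem, here is a minimal Python sketch
that tallies dropped messages per type from system.log. It assumes each
MessagingService entry sits on a single line in the actual log file (the
lines quoted in this thread were wrapped by the mail client) and follows
the "N TYPE messages dropped in last Xms" pattern. The function name
tally_dropped and the sample list are purely illustrative.

```python
import re
from collections import Counter

# Matches the "N TYPE messages dropped in last Xms" part of a log entry.
# Assumption: the entry is on one line, as it would be in system.log.
DROPPED = re.compile(r"(\d+) ([A-Z_]+) messages dropped in last (\d+)ms")


def tally_dropped(lines):
    """Sum the dropped-message counts per message type."""
    totals = Counter()
    for line in lines:
        match = DROPPED.search(line)
        if match:
            totals[match.group(2)] += int(match.group(1))
    return totals


# Illustrative input, rebuilt from the log entries quoted in this thread.
sample = [
    "MessagingService.java:888 - 839 MUTATION messages dropped in last 5000ms",
    "MessagingService.java:888 - 2 READ messages dropped in last 5000ms",
    "MessagingService.java:888 - 1 REQUEST_RESPONSE messages dropped in last 5000ms",
]

print(dict(tally_dropped(sample)))
```

Running the same function over the whole log of each node should show
whether the drops are still mostly MUTATION and whether they are tapering
off now that the job is stopped.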


Regards,
Varun Saluja


On 11 May 2017 at 23:01, Oskar Kjellin <oskar.kjel...@gmail.com> wrote:

> Do you have a lot of compactions going on? It sounds like you might've
> built up a huge backlog. Is your throttling configured properly?
>
> > On 11 May 2017, at 18:50, varun saluja <saluj...@gmail.com> wrote:
> >
> > Hi Experts,
> >
> > Seeking your help on a production issue. We were running a highly
> > write-intensive job on our 3-node Cassandra cluster, version 2.1.7.
> >
> > TPS on the nodes was high. The job ran for more than 2 days, after which
> > the load average on one of the nodes rose very high, to around 29.
> >
> > System log reports:
> >
> > INFO  [ScheduledTasks:1] 2017-05-11 22:11:04,466
> MessagingService.java:888 - 839 MUTATION messages dropped in last 5000ms
> > INFO  [ScheduledTasks:1] 2017-05-11 22:11:04,466
> MessagingService.java:888 - 2 READ messages dropped in last 5000ms
> > INFO  [ScheduledTasks:1] 2017-05-11 22:11:04,466
> MessagingService.java:888 - 1 REQUEST_RESPONSE messages dropped in last
> 5000ms
> >
> > The job was stopped due to heavy load. But still, after 12 hours, we can
> > see mutation drop messages and sudden increases in load average.
> >
> > Are these hinted handoff mutations? Can we stop them?
> > Strangely, this behaviour is seen on only 2 nodes. Node 1 does not show
> > any load or any such activity.
> >
> > Due to heavy load and GC, there are intermittent gossip failures among
> > the nodes. Can someone please help?
> >
> > PS: The load job was stopped on the cluster. Everything ran fine for a
> > few hours, and later the issue started again, with mutation messages
> > being dropped.
> >
> > Thanks and Regards,
> > Varun Saluja
> >
>
