The upgrade to 2.1.3 seems to have helped so far. After ~12 hours, total memory consumption has grown from 10 GB to 10.5 GB.
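For anyone who wants to track the same trend, here is a minimal sketch of how the growth rate could be computed. It assumes Linux and that "total memory consumption" is measured as the resident set size (VmRSS) of the Cassandra process; the helper names are made up for illustration and are not part of any Cassandra tooling.

```python
import re
from pathlib import Path


def parse_vmrss_kb(status_text: str) -> int:
    """Return the VmRSS value (in kB) found in the text of /proc/<pid>/status."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    if match is None:
        raise ValueError("VmRSS line not found")
    return int(match.group(1))


def read_rss_gb(pid: int) -> float:
    """Read the current resident set size of a process in GB (Linux only)."""
    status_text = Path(f"/proc/{pid}/status").read_text()
    return parse_vmrss_kb(status_text) / (1024 ** 2)


def growth_gb_per_hour(start_gb: float, end_gb: float, hours: float) -> float:
    """Average memory growth in GB per hour between two samples."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return (end_gb - start_gb) / hours


if __name__ == "__main__":
    # Figures from this message: 10 GB -> 10.5 GB over roughly 12 hours.
    rate = growth_gb_per_hour(10.0, 10.5, 12.0)
    print(f"{rate * 1024:.1f} MB/hour")
```

With the numbers above this works out to roughly 43 MB per hour. Sampling `read_rss_gb(pid)` at a fixed interval would show whether that rate levels off or keeps climbing.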
On Thu, Feb 19, 2015 at 2:02 PM, Carlos Rolo <r...@pythian.com> wrote:

> Then you are probably hitting a bug... I am trying to find it in Jira. The
> bad news is that the fix will only be released in 2.1.4. Once I find it I
> will post it here.
>
> Regards,
>
> Carlos Juzarte Rolo
> Cassandra Consultant
>
> Pythian - Love your data
>
> rolo@pythian | Twitter: cjrolo | Linkedin: *linkedin.com/in/carlosjuzarterolo
> <http://linkedin.com/in/carlosjuzarterolo>*
> Tel: 1649
> www.pythian.com
>
> On Thu, Feb 19, 2015 at 12:16 PM, Michał Łowicki <mlowi...@gmail.com>
> wrote:
>
>> |trickle_fsync| has been enabled for a long time in our settings (just
>> noticed):
>>
>> trickle_fsync: true
>> trickle_fsync_interval_in_kb: 10240
>>
>> On Thu, Feb 19, 2015 at 12:12 PM, Michał Łowicki <mlowi...@gmail.com>
>> wrote:
>>
>>> On Thu, Feb 19, 2015 at 11:02 AM, Carlos Rolo <r...@pythian.com> wrote:
>>>
>>>> Do you have trickle_fsync enabled? Try enabling it and see if it
>>>> solves your problem, since you are running out of non-heap memory.
>>>>
>>>> Another question: is it always the same nodes that die? Or is it 2
>>>> out of 4 that die?
>>>
>>> Always the same nodes. We upgraded to 2.1.3 two hours ago, so we'll
>>> monitor whether the issue has been fixed there. If not, we will try to
>>> enable |trickle_fsync|.
>>>
>>>> On Thu, Feb 19, 2015 at 10:49 AM, Michał Łowicki <mlowi...@gmail.com>
>>>> wrote:
>>>>
>>>>> On Thu, Feb 19, 2015 at 10:41 AM, Carlos Rolo <r...@pythian.com>
>>>>> wrote:
>>>>>
>>>>>> So compaction doesn't seem to be your problem (you can check with
>>>>>> nodetool compactionstats just to be sure).
>>>>>
>>>>> pending tasks: 0
>>>>>
>>>>>> What is the write latency on your column families? I had an OOM
>>>>>> related to this before, and there was a tipping point around 70 ms.
>>>>>
>>>>> Write request latency is below 0.05 ms/op (avg). Checked with
>>>>> OpsCenter.

--
BR,
Michał Łowicki