Jens,

We haven't noticed any particularly large GC operations, or even persistently high GC times.
Mike

On Thu, Jun 30, 2016 at 3:20 AM, Jens Rantil <jens.ran...@tink.se> wrote:

> Hi,
>
> Could it be garbage collection occurring on nodes that are more heavily
> loaded?
>
> Cheers,
> Jens
>
> On Sun, 26 June 2016 at 05:22, Mike Heffner <m...@librato.com> wrote:
>
>> One thing to add, if we do a rolling restart of the ring the timeouts
>> disappear entirely for several hours and performance returns to normal.
>> It's as if something is leaking over time, but we haven't seen any
>> noticeable change in heap.
>>
>> On Thu, Jun 23, 2016 at 10:38 AM, Mike Heffner <m...@librato.com> wrote:
>>
>>> Hi,
>>>
>>> We have a 12 node 2.2.6 ring running in AWS, single DC with RF=3, that
>>> is sitting at <25% CPU, doing mostly writes, and not showing any particular
>>> long GC times/pauses. By all observed metrics the ring is healthy and
>>> performing well.
>>>
>>> However, we are noticing a pretty consistent number of connection
>>> timeouts coming from the messaging service between various pairs of nodes
>>> in the ring. The "Connection.TotalTimeouts" meter metric shows 100k's of
>>> timeouts per minute, usually between two pairs of nodes for several hours
>>> at a time. It seems to occur for several hours at a time, then may stop or
>>> move to other pairs of nodes in the ring. The metric
>>> "Connection.SmallMessageDroppedTasks.<ip>" will also grow for one pair of
>>> the nodes in the TotalTimeouts metric.
>>>
>>> Looking at the debug log typically shows a large number of messages like
>>> the following on one of the nodes:
>>>
>>> StorageProxy.java:1033 - Skipped writing hint for /172.26.33.177 (ttl 0)
>>>
>>> We have cross node timeouts enabled, but ntp is running on all nodes and
>>> no node appears to have time drift.
>>>
>>> The network appears to be fine between nodes, with iperf tests showing
>>> that we have a lot of headroom.
>>>
>>> Any thoughts on what to look for? Can we increase thread count/pool
>>> sizes for the messaging service?
>>>
>>> Thanks,
>>>
>>> Mike
>>>
>>> --
>>>
>>> Mike Heffner <m...@librato.com>
>>> Librato, Inc.
>>>
>>>
>>
>>
>> --
>>
>> Mike Heffner <m...@librato.com>
>> Librato, Inc.
>>
>> --
>
> Jens Rantil
> Backend Developer @ Tink
>
> Tink AB, Wallingatan 5, 111 60 Stockholm, Sweden
> For urgent matters you can reach me at +46-708-84 18 32.
>

--

  Mike Heffner <m...@librato.com>
  Librato, Inc.