A really wild guess: do you monitor I/O performance, and are you positive it has stayed the same over the past year (network a little busier, hard drives a bit slower, and so on)? Wild guess 2: was any new 'monitoring' software (a log-shipping agent, for instance) added to the boxes in the meantime?
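If nothing is already graphing disk latency, one rough way to eyeball per-device read/write await over a short window is to sample /proc/diskstats twice, something along these lines (a minimal, Linux-only sketch; the field offsets assume the standard diskstats layout):

#!/usr/bin/env python3
# Rough per-device I/O latency check from /proc/diskstats: sample the
# counters twice and report the average await (ms per I/O) over the window.
# A sanity check only, not a replacement for proper monitoring.
import time

INTERVAL_S = 10  # sampling window in seconds; adjust as needed

def read_diskstats():
    stats = {}
    with open("/proc/diskstats") as f:
        for line in f:
            p = line.split()
            # standard layout: p[3] reads completed, p[6] ms spent reading,
            # p[7] writes completed, p[10] ms spent writing
            stats[p[2]] = (int(p[3]), int(p[6]), int(p[7]), int(p[10]))
    return stats

def main():
    before = read_diskstats()
    time.sleep(INTERVAL_S)
    after = read_diskstats()
    for dev, (r1, rt1, w1, wt1) in sorted(after.items()):
        r0, rt0, w0, wt0 = before.get(dev, (0, 0, 0, 0))
        reads, writes = r1 - r0, w1 - w0
        r_await = (rt1 - rt0) / reads if reads else 0.0
        w_await = (wt1 - wt0) / writes if writes else 0.0
        print("%-12s r_await=%6.2f ms  w_await=%6.2f ms" % (dev, r_await, w_await))

if __name__ == "__main__":
    main()

Running it during peak hours and comparing against an off-peak run would show whether the disks themselves have slowed down.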
On 11 June 2018 at 16:56, Jeff Jirsa <jji...@gmail.com> wrote:

> No
>
> --
> Jeff Jirsa
>
>
> On Jun 11, 2018, at 7:49 AM, Fd Habash <fmhab...@gmail.com> wrote:
>
> I will check for both.
>
> On a different subject, I have read some user testimonies that running ‘nodetool cleanup’ requires a C* process restart, at least around 2.2.8. Is this true?
>
> ----------------
> Thank you
>
> *From: *Nitan Kainth <nitankai...@gmail.com>
> *Sent: *Monday, June 11, 2018 10:40 AM
> *To: *user@cassandra.apache.org
> *Subject: *Re: Read Latency Doubles After Shrinking Cluster and Never Recovers
>
> I think it would, because Cassandra will process more sstables to build the response to read queries.
>
> Now, after cleanup, if the data volume is the same and compaction has been running, I can’t think of any more diagnostic steps. Let’s wait for other experts to comment.
>
> Can you also check the sstable count for each table, just to be sure they are not extraordinarily high?
>
> Sent from my iPhone
>
> On Jun 11, 2018, at 10:21 AM, Fd Habash <fmhab...@gmail.com> wrote:
>
> Yes, we did, after adding the three nodes back, and a full cluster repair as well.
>
> But even if we didn’t run cleanup, would it have impacted read latency, given that some nodes still have sstables they no longer need?
>
> Thanks
>
> ----------------
> Thank you
>
> *From: *Nitan Kainth <nitankai...@gmail.com>
> *Sent: *Monday, June 11, 2018 10:18 AM
> *To: *user@cassandra.apache.org
> *Subject: *Re: Read Latency Doubles After Shrinking Cluster and Never Recovers
>
> Did you run cleanup too?
>
> On Mon, Jun 11, 2018 at 10:16 AM, Fred Habash <fmhab...@gmail.com> wrote:
>
> I have hit dead ends everywhere I turned on this issue.
>
> We had a 15-node cluster that had been doing 35 ms reads for years. At some point, we made a decision to shrink it to 13 nodes. Read latency rose to near 70 ms. Shortly after, we decided this was not acceptable, so we added the three nodes back in. Read latency dropped to near 50 ms and has been hovering around that value for over 6 months now.
>
> Repairs run regularly, load is even across cluster nodes, and the application activity profile has not changed.
>
> Why are we unable to get back the same read latency, now that the cluster is 15 nodes again, the same size it was before?
>
> --
>
> ----------------------------------------
> Thank you
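(For anyone following this thread later: a rough sketch of Nitan's "check the sstable count for each table" suggestion, shelling out to nodetool and sorting tables by SSTable count. It assumes nodetool is on PATH and the 2.2.x 'nodetool cfstats' output format with "Keyspace:", "Table:", and "SSTable count:" lines, which may differ slightly on other versions.)

#!/usr/bin/env python3
# Dump per-table SSTable counts, highest first, by parsing 'nodetool cfstats'.
# Assumes nodetool is on PATH; the header names matched below follow the
# 2.2.x output and may need adjusting on other versions.
import subprocess

def sstable_counts():
    out = subprocess.check_output(["nodetool", "cfstats"], universal_newlines=True)
    counts = {}
    keyspace = table = None
    for raw in out.splitlines():
        line = raw.strip()
        if line.startswith("Keyspace:"):
            keyspace = line.split(":", 1)[1].strip()
        elif line.startswith("Table:"):
            table = line.split(":", 1)[1].strip()
        elif line.startswith("SSTable count:") and keyspace and table:
            counts["%s.%s" % (keyspace, table)] = int(line.split(":", 1)[1])
    return counts

if __name__ == "__main__":
    for name, count in sorted(sstable_counts().items(), key=lambda kv: kv[1], reverse=True):
        print("%6d  %s" % (count, name))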