If you only need query/update performance, you could aggregate the logs too. If you need more information, I like what was proposed in SOLR-9641, which would allow you to collect and aggregate metrics for internal components too.
Tomás

On Tue, Nov 15, 2016 at 8:31 AM, Walter Underwood <[email protected]> wrote:

> To calculate percentiles we need all the data points. If there is a lot of
> data, it could be sampled.
>
> Average can be calculated with the total time and the number of requests.
> Snapshots of those two values allow snapshots of averages.
>
> But averages are the wrong metric for a one-sided distribution like
> response time. Let’s assume that any response longer than 10 seconds is a
> bad experience. Percentiles will tell you what response time 95% of
> customer searches are getting. With averages, a single 30 second response
> time will increase the metric, even though it is “just as broken” as a
> 15 s response.
>
> wunder
> Walter Underwood
> [email protected]
> http://observer.wunderwood.org/ (my blog)
>
>
> On Nov 15, 2016, at 7:27 AM, Ryan Josal <[email protected]> wrote:
>
> I haven't tried for 95th percentile, but generally with those collection
> start stats you would monitor based on calculated deltas. You can figure
> out the average response time for any given window of time not smaller than
> your snapshot polling interval. I don't see why 95th percentile would be
> any different.
>
> Ryan
>
> On Monday, November 14, 2016, Walter Underwood <[email protected]>
> wrote:
>
>> Because the current stats are not usable. They really should be removed
>> from the code.
>>
>> They calculate percentiles since the last collection load. We need to
>> know 95th percentile during the peak hour last night, not the 95th for
>> the last month.
>>
>> Right now, we run eleven collections in our Solr 4 cluster. In each
>> collection, we have several different handlers. Usually, one for
>> autosuggest (instant results), one for the SRP, and one for mobile,
>> though we also have SEO requests and so on. We can track performance
>> for each of these.
>>
>> wunder
>> Walter Underwood
>> [email protected]
>> http://observer.wunderwood.org/ (my blog)
>>
>>
>> On Nov 14, 2016, at 3:54 PM, Erick Erickson <[email protected]>
>> wrote:
>>
>> Point taken, and thanks for the link. The stats I'm referring to in
>> this thread are available now, and would (I think) be a quick win. I
>> don't have a huge amount of investment in it though, more "why didn't
>> we think of this before?" followed by "maybe there's a very good
>> reason not to bother". This may be it since we now standardize on
>> Jetty. My question of course is whether this would be supported moving
>> forward to netty or whatever...
>>
>> Best,
>> Erick
>>
>> On Mon, Nov 14, 2016 at 3:44 PM, Walter Underwood <[email protected]>
>> wrote:
>>
>> I’m not fond of polling for performance stats. I’d rather have the app
>> report them.
>>
>> We could integrate existing Jetty monitoring:
>>
>> http://metrics.dropwizard.io/3.1.0/manual/jetty/
>>
>> From our experience with a similar approach, we might need some
>> Solr-specific metric conflation. SolrJ sends a request to
>> /solr/collection/handler as /solr/collection/select?qt=/handler.
>> In our code, we fix that request to the intended path. We’ve been
>> running a Tomcat metrics search filter for three years.
>>
>> Also, see:
>>
>> https://issues.apache.org/jira/browse/SOLR-8785
>>
>> wunder
>> Walter Underwood
>> [email protected]
>> http://observer.wunderwood.org/ (my blog)
>>
>>
>> On Nov 14, 2016, at 3:25 PM, Erick Erickson <[email protected]>
>> wrote:
>>
>> What do people think about exposing a Collections API call (name TBD,
>> but the sense is PERFORMANCESTATS) that would simply issue the
>> admin/mbeans call to each replica of a collection and report them
>> back. This would give operations monitors the ability to see, say,
>> anomalous replicas that had poor average response times for the last 5
>> minutes and the like.
>>
>> Seems like an easy enhancement that would make ops people's lives easier.
>>
>> I'll raise a JIRA if there's interest, but sure won't make progress on
>> it until I clear my plate of some other JIRAs that I've let linger for
>> far too long.
>>
>> Erick
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: [email protected]
>> For additional commands, e-mail: [email protected]
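
For anyone following the averages-versus-percentiles argument in this thread, here is a rough Python sketch of the two calculations being compared. It is an illustration only, not Solr code: the function names, the (total_time_ms, request_count) snapshot shape, and the sample numbers are all made up for the example, and nearest-rank is just one of several common percentile definitions. The first function derives a window average from the deltas between two cumulative snapshots; the second needs the raw per-request times for the window.

```python
import math


def window_average(prev, curr):
    """Average response time between two cumulative snapshots.

    Each snapshot is a hypothetical (total_time_ms, request_count) pair,
    i.e. counters that only grow since the collection was loaded.
    """
    d_time = curr[0] - prev[0]
    d_count = curr[1] - prev[1]
    if d_count <= 0:
        # No requests in the window, or the counters were reset.
        return None
    return d_time / d_count


def percentile(samples, pct):
    """Nearest-rank percentile; requires every data point in the window."""
    if not samples:
        raise ValueError("percentile needs at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


if __name__ == "__main__":
    # Made-up window: 100 requests, one of them a 20 second outlier.
    times_ms = [100] * 95 + [125] * 4 + [20000]
    before = (120000.0, 1000)
    after = (before[0] + sum(times_ms), before[1] + len(times_ms))

    print("window average ms:", window_average(before, after))
    print("95th percentile ms:", percentile(times_ms, 95))
```

With these invented numbers the single slow request triples the window average (300 ms) while the 95th percentile stays at 100 ms. The average needs only two counters per poll; the percentile cannot be recovered from those counters and needs the individual response times, or a sample of them.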
