I was about to point you at that pull request.  How droll.

Didn't know it was from you guys.


On Thu, Aug 8, 2013 at 3:35 PM, Otis Gospodnetic <[email protected]> wrote:

> Hi Ted,
>
> Yes, that's what we did recently, too:
> https://github.com/clearspring/stream-lib/pull/47
>
> ... but it's still a little too fat, which is what made me think of your
> OnlineSummarizer as a possible, slimmer alternative.
>
> Otis
> ----
> Performance Monitoring for Solr / ElasticSearch / Hadoop / HBase -
> http://sematext.com/spm
>
>
>
>
> >________________________________
> > From: Ted Dunning <[email protected]>
> >To: "[email protected]" <[email protected]>; Otis Gospodnetic <
> [email protected]>
> >Sent: Thursday, August 8, 2013 8:27 AM
> >Subject: Re: Is OnlineSummarizer mergeable?
> >
> >
> >
> >I just looked at the source for QDigest from streamlib.
> >
> >
> >I think that the memory usage could be trimmed substantially, possibly by
> as much as 5:1, by using more primitive-friendly structures.
> >
> >
> >
> >
> >
> >On Wed, Aug 7, 2013 at 3:04 PM, Otis Gospodnetic <
> [email protected]> wrote:
> >
> >Hi Ted,
> >>
> >>I need percentiles.  Ideally not pre-defined ones, because one person
> may want e.g. 70th pctile, while somebody else might want 75th pctile for
> the same metric.
> >>
> >>Deal breakers:
> >>- High memory footprint ("high" means "higher than QDigest from
> stream-lib" for us, and we could test and compare with QDigest
> relatively easily with live data)
> >>- Algos that create data structures that cannot be merged
> >>- Loss of accuracy that is not predictably small or configurable
> >>
> >>Thank you,
> >>Otis
> >>----
> >>
> >>Performance Monitoring for Solr / ElasticSearch / Hadoop / HBase -
> http://sematext.com/spm
> >>
> >>
> >>
> >>
> >>>________________________________
> >>> From: Ted Dunning <[email protected]>
> >>>To: "[email protected]" <[email protected]>; Otis
> Gospodnetic <[email protected]>
> >>>Sent: Wednesday, August 7, 2013 11:48 PM
> >>>Subject: Re: Is OnlineSummarizer mergeable?
> >>>
> >>>
> >>>
> >>>Otis,
> >>>
> >>>
> >>>What statistics do you need?
> >>>
> >>>
> >>>What guarantees?
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>On Wed, Aug 7, 2013 at 1:26 PM, Otis Gospodnetic <
> [email protected]> wrote:
> >>>
> >>>Hi Ted,
> >>>>
> >>>>I'm actually trying to find an alternative to QDigest (the stream-lib
> impl specifically) because even though it seems good, we have to deal with
> crazy volumes of data in SPM (performance monitoring service, see
> signature)... I'm hoping we can find something that has both a lower memory
> footprint than QDigest AND that is mergeable a la QDigest.  Utopia?
> >>>>
> >>>>Thanks,
> >>>>Otis
> >>>>----
> >>>>Performance Monitoring for Solr / ElasticSearch / Hadoop / HBase -
> http://sematext.com/spm
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>>________________________________
> >>>>> From: Ted Dunning <[email protected]>
> >>>>>To: "[email protected]" <[email protected]>
> >>>>>Sent: Wednesday, August 7, 2013 4:51 PM
> >>>>>Subject: Re: Is OnlineSummarizer mergeable?
> >>>>>
> >>>>>
> >>>>>It isn't as mergeable as I would like.  If you have randomized record
> >>>>>selection, it should be possible, but perverse ordering can cause
> serious
> >>>>>errors.
> >>>>>
> >>>>>It would be better to use something like a Q-digest.
> >>>>>
> >>>>>http://www.cs.virginia.edu/~son/cs851/papers/ucsb.sensys04.pdf
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>On Wed, Aug 7, 2013 at 4:21 AM, Otis Gospodnetic <
> [email protected]
> >>>>>> wrote:
> >>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> Is OnlineSummarizer algo "mergeable"?
> >>>>>>
> >>>>>> Say that we compute a percentile for some metric for time
> 12:00-12:01
> >>>>>> and store that somewhere, then we compute it for 12:01-12:02 and
> store
> >>>>>> that separately, and so on.
> >>>>>>
> >>>>>> Can we then later merge these computed and previously stored
> >>>>>> percentile "instances" and get an accurate value?
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Otis
> >>>>>> --
> >>>>>> Performance Monitoring -- http://sematext.com/spm
> >>>>>> Solr & ElasticSearch Support -- http://sematext.com/
> >>>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>
> >>>
> >>>
> >
> >
> >
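
For anyone following the thread, here is a minimal sketch of what "mergeable" means for the per-minute use case in the original question. It is neither Q-digest nor OnlineSummarizer: it assumes a plain fixed-width histogram, and the names summarize, merge, percentile and BUCKET_WIDTH are made up for illustration. The point is only that each window stores a summary instead of a single percentile value, so windows can be combined later and any percentile (70th, 75th, ...) can be asked for after the fact, with error bounded by the bucket width.

```python
import math
from collections import Counter

BUCKET_WIDTH = 10  # hypothetical resolution, e.g. 10 ms per bucket


def summarize(values, width=BUCKET_WIDTH):
    """Summarize one time window as a mapping of bucket index -> count."""
    return Counter(int(v // width) for v in values)


def merge(a, b):
    """Merge two window summaries by adding their bucket counts."""
    return a + b


def percentile(summary, p, width=BUCKET_WIDTH):
    """Upper edge of the bucket that holds the p-th percentile (nearest rank)."""
    total = sum(summary.values())
    if total == 0:
        raise ValueError("empty summary")
    rank = max(1, math.ceil(total * p / 100))
    seen = 0
    for bucket in sorted(summary):
        seen += summary[bucket]
        if seen >= rank:
            return (bucket + 1) * width


# Two one-minute windows, summarized and stored separately.
minute_1 = summarize([5, 12, 18, 25, 31])
minute_2 = summarize([7, 95, 140, 250, 990])

# Merged later; the percentile is chosen at query time, not in advance.
both = merge(minute_1, minute_2)
print(percentile(both, 70), percentile(both, 75))
```

Merging the two summaries gives exactly the summary of the combined data, which is the property a stored percentile value alone does not have. The trade-off is that a fixed-width histogram needs the bucket width chosen up front, whereas Q-digest adapts its resolution to the data.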
