OK. I wanted to understand the advantage of having a container class for all storeless stats (just as DescriptiveStatistics is for univariate stats). I could open another email thread for that. I also wanted to understand what the abstract interface problem is that you were referring to.
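For reference, here is a minimal Java sketch of one way an interface can cause trouble when methods are added to it later. It is only a guess at the problem being referred to, and all names in it (StatsSummary, RunningMean, getMedian, InterfaceEvolutionSketch) are hypothetical, not Commons Math API. It also assumes Java 8 or later, since it uses a default method.

```java
// Hypothetical names throughout; none of this is Commons Math API.

/** Version 1 of a published interface. */
interface StatsSummary {
    void addValue(double x);

    double getMean();

    // Adding a new abstract method here later, e.g. `double getMedian();`,
    // would stop every existing implementation (including third-party ones)
    // from compiling. A default method (Java 8+) avoids the compile error,
    // but for implementations that cannot support the new statistic it can
    // only throw or return a placeholder.
    default double getMedian() {
        throw new UnsupportedOperationException("median not supported");
    }
}

/** An implementation written against version 1; it stores no data. */
final class RunningMean implements StatsSummary {
    private long n;
    private double mean;

    @Override
    public void addValue(double x) {
        n++;
        mean += (x - mean) / n; // incremental mean update
    }

    @Override
    public double getMean() {
        return n == 0 ? Double.NaN : mean;
    }
}

final class InterfaceEvolutionSketch {
    public static void main(String[] args) {
        StatsSummary s = new RunningMean();
        for (double x : new double[] {1.0, 2.0, 3.0}) {
            s.addValue(x);
        }
        System.out.println("mean = " + s.getMean());
        try {
            s.getMedian(); // the later addition is not really supported here
        } catch (UnsupportedOperationException e) {
            System.out.println("median unavailable: " + e.getMessage());
        }
    }
}
```

If this reading is right, the trade-off is that a concrete class can gain a method without breaking callers, while a widely implemented interface cannot gain an abstract one without breaking implementers.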
thanks
murthy

On Tue, Oct 14, 2014 at 9:47 AM, Phil Steitz <phil.ste...@gmail.com> wrote:

> On 10/13/14 8:55 PM, venkatesha murthy wrote:
> > On Tue, Oct 14, 2014 at 6:05 AM, Phil Steitz <phil.ste...@gmail.com> wrote:
> >
> >> On 10/13/14 1:04 PM, venkatesha murthy wrote:
> >>> Adding a bit more on this:
> >>> a) The DescriptiveStatisticalSummary actually handles the rest of the
> >>> functions such as addValue, getPercentile etc.
> >>> b) I have added addValue() as it is important to see either storeless or
> >>> store variants as interfaces.
> >>> c) A case in point being (for b); i was actually trying out a lockfull and
> >>> a lockfree based variants for descriptive statistical summary and it was
> >>> very concise/consistent with an interface to use that has all common
> >>> functions across all variants.
> >>> d) well lock based or lock free variants are not a part of this patch as
> >>> iam still working through
> >>>
> >>> However i feel the getPercentile can definitely add value. Please let me
> >>> know if i could turn in all the relevant methods of
> >>> DescriptiveStorelessStatistics into statistical summary (such as kurtosis,
> >>> skewness etc..) and then we could just use SummaryStatistics.
> >> I am not sure I understand what you are proposing. Currently, we
> >> have two statistical "aggregates" for descriptive univariate stats:
> >> SummaryStatistics - aggregates "storeless" statistics over a stream
> >> of data that is not stored in memory
> >> DescriptiveStatistics - provides an extended set of statistics, some
> >> of which require that the full set of data be stored in memory
> >>
> >> OK. I am sorry for the confusion here. I understand the intent now.
> > However what i wanted to convey was all the statistics that
> > is supported in current DescriptiveStatistics can be supported in Storeless
> > variant as well.
> > (For eg: skewness, kurtosis, percentile)
>
> No, for example exact percentiles, or even arbitrary percentiles
> (without the quantile - e.g. quartile) specified in advance, can't
> be computed without storing the data. Also, DescriptiveStatistics
> supports a rolling window and stats it implements can make use of
> multi-pass algorithms.
>
> >
> > Therefore; what i was proposing is to have a common interface that can have
> > all these methods too. for eg: (we can change the name if it is needed)
> >
> > DescriptiveStatisticalSummary<S extends UnivariateStatistics> extends
> > StatisticalSummary{
> >     getKurtosis();
> >     getPercentile();
> >     getSkewness();
> >     // Add Mutation methods as well
> >     addValue(double d);
> >     //Provide additional builder methods for injecting custom percentile,
> >     // kurtosis, skewness, variance etc.
> >     withPercentile(S Percentile);
> >     withKurtosis(S kurtosis);
> > }
>
> Per comments above, the contracts of these aggregates are
> different. We have also moved away from defining abstract
> interfaces as these end up creating problems when we want to add
> things (as in the subject of this thread).
>
> Phil
> >
> >> The subject of this thread was a proposal to add quartiles to
> >> SummaryStatistics, as the new(ish) PSquarePercentile allows those
> >> statistics to be computed without storing the data.
> >>
> >> Agreed. I was just adding points on how we can bring both
> > DescriptiveStatistics and SummaryStatistics under a common interface for
> > all the stats.
> >
> >> Phil
> >>> On Tue, Oct 14, 2014 at 1:15 AM, venkatesha murthy <
> >>> venkateshamurth...@gmail.com> wrote:
> >>>
> >>>> Hi Phil,
> >>>>
> >>>> Though i did not add to StatisticalSummary i was actually working on a
> >>>> DescriptiveStatisticalSummary for all the Storeless variants inclusive of
> >>>> PSquarePercentile. Would it help if you can actually implement
> >>>> SummaryStatisitcs with an extended interface such as
> >>>> DescriptiveStatisticalSummary ? below.
> >>>>
> >>>> That said i actually wanted to discuss the new storelessvariant of
> >>>> descriptive statistics.
> >>>> a) DescriptiveStatisticalSummary - an extended interface for
> >>>> StatisticalSummary (adds a Generic type that can cater for store full and
> >>>> storeless)
> >>>> b) DescriptiveStorelessStatistics - Storeless variant of
> >>>> DescriptiveStatisitcs
> >>>> c) SynchronizedDescriptiveStorelessStatistics - a synchronized wrapper.
> >>>>
> >>>> Test case classes added to the same.
> >>>>
> >>>> Please let me know on this i could also accomodate the changes to summary
> >>>> stats based on this change here.
> >>>> Also please let me know if this could be raised as a jira ticket to pursue.
> >>>> Thanks
> >>>> Murthy
> >>>>
> >>>> On Sat, Oct 11, 2014 at 1:10 AM, Phil Steitz <phil.ste...@gmail.com>
> >>>> wrote:
> >>>>
> >>>>> Now that we have a "storeless" percentile estimator, we can add
> >>>>> quartile computation to SummaryStatistics. Any objections to my
> >>>>> adding this? I could optionally add a boolean constructor argument
> >>>>> to avoid the overhead of maintaining these stats. Or more
> >>>>> generally, add a bitfield encoding the exact set of stats the user
> >>>>> wants to maintain. If there are no objections to the addition, I
> >>>>> will open a JIRA.
> >>>>>
> >>>>> Phil
> >>>>>
> >>>>>
> >>>>> ---------------------------------------------------------------------
> >>>>> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> >>>>> For additional commands, e-mail: dev-h...@commons.apache.org