Phil Steitz a écrit :
On Feb 9, 2008 5:49 PM, Phil Steitz <[EMAIL PROTECTED]> wrote:
On Feb 9, 2008 11:14 AM, Luc Maisonobe <[EMAIL PROTECTED]> wrote:
Phil Steitz a écrit :
On Feb 8, 2008 7:00 AM, <[EMAIL PROTECTED]> wrote:
[EMAIL PROTECTED] wrote:
In addition to the statistics required by the StatisticalSummary interface it
implements, the SummaryStatistics class computes the sum of squares and the
sum
of logs. It also has setters and getters for the underlying statistics
implementations. However, it does not provide a getSumlg method.
The sum of logs is also not used in the equals, hash and toString methods.
Luc
Should the sum of logs computation be deprecated or a getSumlg method added ?
Interesting. This is likely a result of refactoring several years
back when the geometric mean computation used the sum of logs
instance. Now it does not, so it is either wasted computation or
something of value not exposed to the user. Makes sense to me to add
getSumLog to SummaryStatistics. It doesn't need to be included in
equals or hashcode since geo mean + N equivalence implies log sum
equivalence.
Looking again at the code, I now see it as stupid that geometricMean
in SummaryStatistics does not use the sumOfLogs instance. If
geometricMean exposed a setter for its internally wrapped sumOfLogs
instance, we could just set that in SummaryStatistics and only
increment the sumOfLogs instance. It would probably also be an
improvment for geometricMean to expose a setter for this.
If there are no objections, I will go ahead and make these changes.
Thanks for pointing this out, luc.
Go ahead with that.
If you could also have a glimpse at the multivariate summary statistics
I added yesterday, I would be happy. During my paid work day (in the
space industry), I am in the process of switching several projects from
Mantissa to [math] and needed this feature.
I am aware this was really done in a hurry. I have tried to be as
compliant with univariate statistics as possible, but may have
completely missed something. I have reused the VectorialCovariance class
I commited one year ago, but it does not follow the general architecture
of other statistics. I would really like to have your thoughts about this.
The code looks fine to me. I have a couple of comments on the API.
1) I edited the javadoc for MultivariateSummaryStatistics to make it
clearer what was actually being computed / reported. If you are OK
with these changes, we can replicate to the interface definition.
Yes, these changes are fine. I will copy the new javadoc to interface.
2) It might be more convenient for users to have the implementation
setters just take a single instance, rather than an array. I can't
think of a use case where one would want to use different
implementations of the same statistic here. (so, e.g.
setMeanImpl(StorelessUnivariateStatistic[] meanImpl) becomes just
setMeanImpl(StorelessUnivariateStatistic meanImpl) and we take care of
the cloning).
Ugh. Just realized that there is no reason to believe we can clone arbitrary
StorelessUnivariateStatistics, so this may not be possible.
I agree. So we should rather keep the arrays.
3) I am not sure StatisticalMultivariateSummaryValues is necessary /
valuable enough to be included. The analog for univariate was added
to be input into tests, etc, but I am not sure this one is that
useful. Do you need / use this or can you see use cases for it? If
not, I would say omit it and we can always add later if someone wants
it.
I included it only for consistency. I don't need it and it does a lot of
copying, which bother me. I'll remove it.
Might be good also to add a reference in the user guide just to let
people know its there.
Yes. I forgot to do this. I'll write something.
Thanks
Luc
Phil
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]