One more thing... (ala Detective Colombo). In add and remove observation there is a snippet which looks like:
sumXX += dx * dx * (double) n / (n + 1d); sumYY += dy * dy * (double) n / (n + 1d); sumXY += dx * dy * (double) n / (n + 1d); xbar += dx / (n + 1.0); ybar += dy / (n + 1.0); Would you be against pulling out the divisions by ( n+1.0)? Maybe something like: final double fact1 = 1.0/((double) n + 1.0); final double fact2 = ((double) n) * fact1; sumXX += dx * dx * fact2; sumYY += dy * dy * fact2; sumXY += dx * dy * fact2; xbar += dx * fact1; ybar += dy * fact1; I realize that most likely the compiler or runtime does this optimization, but just in case... -Greg On Fri, Aug 12, 2011 at 11:40 PM, Greg Sterijevski <gsterijev...@gmail.com>wrote: > +1 from me, on all counts! -Greg > > > On Fri, Aug 12, 2011 at 11:30 PM, Phil Steitz <phil.ste...@gmail.com>wrote: > >> On 8/12/11 7:16 PM, Greg Sterijevski wrote: >> > Hello All, >> > >> > Before I chum the water with more JIRA tickets, I thought I would see >> > whether people thought this was important. >> > >> > In the SimpleRegression you have two methods: >> > >> > public void addData(double x, double y) { >> > ...some code that is not germane to discussion...... >> > >> > if (n > 2) { >> > distribution = new TDistributionImpl(n - 2); >> > } >> > } >> > >> > public void removeData(double x, double y) { >> > ...some code that is not germane...... >> > >> > if (n > 2) { >> > distribution = new TDistributionImpl(n - 2); >> > } >> > } >> > } >> > >> > >From the perspective of a user, you are likely to call add/remove >> repeatedly >> > before you ever need to check for statistical significance. Wouldn't it >> be >> > better to instantiate the TDistribution when it is needed? >> > >> > So you would have to make the following two methods a bit more >> complicated: >> > >> > public double getSlopeConfidenceInterval(double alpha) >> > throws MathException { >> > if (alpha >= 1 || alpha <= 0) { >> > throw new >> > OutOfRangeException(LocalizedFormats.SIGNIFICANCE_LEVEL, >> > alpha, 0, 1); >> > } >> > if( distribution == null || distribution.getDegreesOfFreedom() >> != >> > n-2){ >> > distribution = new TDistributionImpl(n - 2); >> > } >> > return getSlopeStdErr() * >> > distribution.inverseCumulativeProbability(1d - alpha / 2d); >> > } >> > >> > Similarly getSignificance() would have to be modified with the check of >> the >> > degrees of freedom of the distribution. >> > >> > There is nothing wrong with the current code, but making the change >> means >> > faster updates. >> >> Slightly, yes. There is not much code at all in the distribution >> constructor, but you are correct. Moreover, I can see now that the >> "immutability-everywhere" changes in trunk have made the code in the >> class a little funny. In versions <=2.2, the TDistribution used by >> the class was pluggable - i.e., there was a constructor that took at >> TDistribution as an argument, so if for some reason you wanted to >> use an impl different from TDistributionImpl, you could do that. >> There was also a setter for the distribution. In addition, >> TDistributionImpl itself was mutable, exposing a setDegreesOfFreedom >> method. So the distribution member was set at construction time and >> the data update methods called the setter for DF on the >> distribution. We decided to make the distributions immutable in 3.0 >> (search the archives for discussion), so the current mods were done >> to basically accomplish that. But the code should be cleaned up. >> The constructor taking an int is meaningless and should be >> deprecated or removed (unfortunately, we added that in 2.2 and >> advertised it as a deprecation replacement for the version that took >> a distribution as parameter. We should have realized then that it >> was meaningless. My bad for missing that. I would favor dropping it >> in 3.0, since even if anyone is using it, it isn't doing anything >> meaningful for them.) Given that constructing a TDistributionImpl >> is trivial, we might as well eliminate the member field altogether >> and just create one when needed. If you agree and no one else >> objects, I will make these changes. Thanks for reviewing the code. >> >> Phil >> > >> > Thoughts? >> > >> > -Greg >> > >> >> >> --------------------------------------------------------------------- >> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org >> For additional commands, e-mail: dev-h...@commons.apache.org >> >> >