On 15/07/2011 02:37, Greg Sterijevski wrote:
The usual issues with numerical techniques apply: how you calculate (c * x + d * y) / e matters. It turns out that religiously following the article and defining c_bar = c / e is not a good idea.
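
To make that concrete, here is a minimal self-contained sketch of one rotation step in a Gentleman-style update (variable names and toy values are mine, not from the patch; in terms of the expression above, the old scale d plays the role of c, r plays x, w*xi plays d, and xk plays y):

    public class RotationStep {
        public static void main(String[] args) {
            // Illustrative values only. d = old row scale, w = weight,
            // xi/xk = incoming row elements, r = stored element being rotated.
            double d = 1.0e-8, w = 1.0, xi = 3.0, xk = 7.0, r = 2.0;

            double e = d + w * xi * xi;              // the denominator e

            // Variant 1: follow the article and precompute the ratios.
            double cBar = d / e;                     // c_bar = c / e
            double sBar = (w * xi) / e;
            double rRatios = cBar * r + sBar * xk;   // c_bar*x + s_bar*y

            // Variant 2: form the full numerator, divide once:
            // (c * x + d * y) / e computed directly.
            double rDirect = (d * r + w * xi * xk) / e;

            System.out.println(rRatios + " vs " + rDirect);
        }
    }

The two variants are mathematically identical, but they round differently; which one wins depends on the relative magnitudes involved.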

The Filippelli data is still a bit dicey. I would like to resolve where the error is accumulating there as well. That's really the last thing preventing me from sending the patch with the Miller-Gentleman regression to Phil.

I don't know whether this is feasible in your case, but when trying to track down this kind of numerical error, I have found it useful to redo the computation in parallel at high precision. Until a few months ago, I was simply doing this using emacs (yes, emacs rocks) configured with 50 significant digits. Now it is easier since we have our own dfp package in [math].
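
Something like this minimal sketch, assuming the org.apache.commons.math.dfp package as it currently stands (the inputs are placeholders; feed in the same values the double computation sees at the suspect step):

    import org.apache.commons.math.dfp.Dfp;
    import org.apache.commons.math.dfp.DfpField;

    public class HighPrecisionCheck {
        public static void main(String[] args) {
            // Field carrying roughly 50 decimal digits.
            DfpField field = new DfpField(50);

            // Placeholder inputs; substitute the doubles from the failing step.
            Dfp c = field.newDfp("0.3141592653589793");
            Dfp x = field.newDfp("2.718281828459045");
            Dfp d = field.newDfp("1.0e-9");
            Dfp y = field.newDfp("1.414213562373095");
            Dfp e = field.newDfp("3.0");

            // (c * x + d * y) / e at high precision, to compare against the
            // double result and see where the digits start to diverge.
            Dfp result = c.multiply(x).add(d.multiply(y)).divide(e);
            System.out.println(result);
        }
    }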

Luc


-Greg

On Thu, Jul 14, 2011 at 1:18 PM, Ted Dunning <ted.dunn...@gmail.com> wrote:

What was the problem?

On Wed, Jul 13, 2011 at 8:33 PM, Greg Sterijevski <gsterijev...@gmail.com> wrote:

Phil,

Got it! I fit Longley to all printed values. I have not broken anything... I need to type up a few loose ends, then I will send a patch.

-Greg

On Tue, Jul 12, 2011 at 2:35 PM, Phil Steitz <phil.ste...@gmail.com> wrote:

On 7/12/11 12:12 PM, Greg Sterijevski wrote:
All,

So I included the Wampler data in the test suite. The interesting thing is that, to get clean runs, I need wider tolerances with OLSMultipleRegression than with the version of the Miller algorithm I am coding up.
This is good for your Miller impl, not so good for OLSMultipleRegression.
Perhaps we should come to a consensus on what "good enough" is? How close do we want to be? Should we require passing on all of NIST's 'hard' problems (for all regression techniques that get cooked up)?

The goal should be to match all of the displayed digits in the reference data. When we can't do that, we should try to understand why and, where possible, improve the impls. As we improve the code, the tolerances in the tests can be tightened. Characterization of the types of models where the different implementations do well / poorly is another thing we should aim for (and include in the javadoc). As with all reference validation tests, we need to keep in mind that a) the "hard" examples are designed to be numerically unstable, and b) conversely, a handful of examples does not really demonstrate correctness.
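
Just to illustrate (a sketch, not code from the repo), I mean something along these lines, where the caller supplies the certified NIST coefficients and relTol is the knob we tighten as the impls improve:

    import org.apache.commons.math.stat.regression.OLSMultipleLinearRegression;
    import org.junit.Assert;

    // Illustrative only: compare each estimated coefficient against the
    // certified value to within a relative tolerance.
    public void checkAgainstCertified(OLSMultipleLinearRegression regression,
                                      double[] certified, double relTol) {
        double[] estimated = regression.estimateRegressionParameters();
        Assert.assertEquals(certified.length, estimated.length);
        for (int i = 0; i < certified.length; i++) {
            // Tighten relTol until all displayed digits of the
            // certified value are matched.
            Assert.assertEquals(certified[i], estimated[i],
                                relTol * Math.abs(certified[i]));
        }
    }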

Phil
-Greg


