Hi,

I very much appreciate the work that has been done in CM, and this is
precisely why I'd like more people to contribute. Even though you didn't
accept my MATH-995 patch, I got useful input from Konstantin, and it has
already made my application more efficient.

What you required of me in the improper integral example was a comparison
of different methods. That sort of research takes time. I hear that Gilles
is working on it, and I appreciate that you have all put so much effort
into this.

However, my contention is that researching alternate solutions to a
submitted patch is not justified until you either come up with a test that
the patch fails, OR you know of an alternative that performs better for an
application you have. Without the first, you lose the efficiency of open
source by reinventing a possibly different wheel without sufficient
marginal reward. As for the second, beware that numerical algorithms are
hairy beasts, and it takes a while to encode something new. The efficiency
of Commons comes from putting the burden of development on the developers
who need the code.

So, I propose an alternate approach to deciding whether a submitted patch
should be accepted:
1. Check whether the patch fills a gap in existing CM code.
2. If so, check whether it passes the known tests.
3. If so, write additional tests to see whether the code breaks (a minimal
sketch of such a test follows this list).
4. If it survives those as well, wrap the code up in a suitable API and
accept the patch.
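
To make step 3 concrete, here is a minimal sketch of the kind of stress
test I have in mind for a quadrature patch. The integrand, tolerances, and
test class name are made up purely for illustration; the released
IterativeLegendreGaussIntegrator is instantiated only so the snippet
compiles against existing CM, whereas in practice the integrator under test
would be whatever class the patch provides. The test may well fail, which
is exactly what step 3 is meant to discover.

import org.apache.commons.math3.analysis.UnivariateFunction;
import org.apache.commons.math3.analysis.integration.IterativeLegendreGaussIntegrator;
import org.apache.commons.math3.analysis.integration.UnivariateIntegrator;
import org.junit.Assert;
import org.junit.Test;

public class QuadratureStressTest {

    // Sharply peaked integrand: a narrow Gaussian centered at 0.5.
    private static final double SIGMA = 1e-3;
    private static final UnivariateFunction PEAK = new UnivariateFunction() {
        public double value(double x) {
            final double t = (x - 0.5) / SIGMA;
            return Math.exp(-0.5 * t * t);
        }
    };

    // Exact value over [0, 1] is SIGMA * sqrt(2 * pi), up to a negligible tail.
    private static final double EXACT = SIGMA * Math.sqrt(2 * Math.PI);

    @Test
    public void integratorSurvivesNarrowPeak() {
        // The class under test would be the one supplied by the patch; the
        // existing integrator is used here only to keep the sketch
        // self-contained.
        final UnivariateIntegrator integrator =
            new IterativeLegendreGaussIntegrator(5, 1e-10, 1e-12);
        final double result = integrator.integrate(100000, PEAK, 0, 1);
        Assert.assertEquals(EXACT, result, 1e-8 * EXACT);
    }
}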

This has two advantages. First, CM will gain more capabilities per unit of
your precious time. Second, you give people the feeling that they are
making a difference.

As far as the debate on AQ (AdaptiveQuadrature) vs.
LGQ (IterativeLegendreGaussIntegrator) goes, the facts that support AQ over
LGQ are:
1. An example where LGQ failed and AQ succeeded. I also explained why LGQ
fails and why AQ will probably converge more reliably. Adaptive quadrature
schemes are generally known to be so successful at integration that
Konstantin even wondered why we don't have one yet.
2. An efficiency improvement: I also showed that AQ is more efficient on at
least one example, in terms of digits of accuracy per function evaluation
(a sketch of how this can be measured follows the lists below).
So, conversely, it is now your turn to provide concrete examples where LGQ
does better than AQ. You could pose credible objections by providing
examples where:
1. AQ fails but LGQ passes.
2. LGQ is more efficient in accuracy per evaluation.
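
To make that measurement concrete, here is a minimal sketch of how "digits
of accuracy per function evaluation" could be computed for any
UnivariateIntegrator. The integrand and tolerances are chosen purely for
illustration, and since AQ only exists as the MATH-995 patch, the released
IterativeLegendreGaussIntegrator and SimpsonIntegrator stand in for the two
contenders; the patch could presumably be dropped in the same way if it
implements the UnivariateIntegrator interface.

import org.apache.commons.math3.analysis.UnivariateFunction;
import org.apache.commons.math3.analysis.integration.IterativeLegendreGaussIntegrator;
import org.apache.commons.math3.analysis.integration.SimpsonIntegrator;
import org.apache.commons.math3.analysis.integration.UnivariateIntegrator;

public class QuadratureBenchmark {

    // Prints evaluation count and correct digits per evaluation for one integrator.
    static void report(String name, UnivariateIntegrator integrator,
                       UnivariateFunction f, double lo, double hi, double exact) {
        final double result = integrator.integrate(1000000, f, lo, hi);
        final double relError = Math.abs(result - exact) / Math.abs(exact);
        final double digits = -Math.log10(Math.max(relError, 1e-17));
        final int evals = integrator.getEvaluations();
        System.out.printf("%s: %d evaluations, %.1f correct digits, %.5f digits/eval%n",
                          name, evals, digits, digits / evals);
    }

    public static void main(String[] args) {
        // Illustrative integrand with a known closed form:
        // the integral of 1/(1 + x^2) over [0, 1] is pi/4.
        final UnivariateFunction f = new UnivariateFunction() {
            public double value(double x) {
                return 1 / (1 + x * x);
            }
        };
        final double exact = Math.PI / 4;

        report("LGQ", new IterativeLegendreGaussIntegrator(5, 1e-12, 1e-15),
               f, 0, 1, exact);
        report("Simpson", new SimpsonIntegrator(1e-12, 1e-15, 3, 60),
               f, 0, 1, exact);
    }
}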

All of that is to illustrate, with an example, where the perception comes
from that it is hard to convince the gatekeepers of Commons of the merits
of a patch. I have a package in my codebase with assorted patches that I
just don't think are worth the time to post to Commons. It is very
inefficient if others are keeping similar private patches.

Cheers,
Ajo







On Thu, Jul 18, 2013 at 2:15 PM, Phil Steitz <phil.ste...@gmail.com> wrote:

> On 7/18/13 1:48 PM, Ajo Fod wrote:
> > Hello folks,
> >
> > There is a lot of work in API design. However, Konstantin's point is that
> > it takes a lot of effort to convince Gilles of any alternatives. API design
> > issues should really be second to functionality. This idea seems to be lost
> > in conversations.
>
> With patience and collaboration you can have both and we *need* to
> have both.  You can't get to a stable API and approachable and
> maintainable code base without thinking carefully about API design.
> >
> > I agree with Gilles that providing tests and benchmarks that exhibit the
> > advantages of a particular method are probably the best way to show other
> > contributors the value of an alternative approach.
>
> There is some value to this, but honestly much more value in
> carefully researching and presenting the numerical analysis to
> support improvement / performance claims.
> >
> > It is quite depressing to the contributor to see one's contribution be
> > rejected when efficiency/accuracy improvements are demonstrated.
>
> What you "demonstrated" in one case was better performance in one
> problem instance.  The change of variable approach you implemented
> was, in my admittedly possibly naive numerics view, questionable.  I
> asked to see numerical analysis support and no one provided that.
> Had you provided that, I would have argued to include some version
> of the patch.
>
> > In a
> > better world, rejecting a patch that passes the hurdle of demonstrating an
> > efficiency improvement over existing code should come with a responsibility
> > of showing alternate tests that the patch fails and the original code
> > passes. Otherwise, the patch should be accepted by default. The person who
> > commits or designed the API is free to make changes to fit API design.
>
> This is essentially what Gilles ended up doing.  You may not agree
> with the approach, but he did in fact address the core issue.
> >
> > Just like API designers are not experts at the underlying math,
> > contributors are not necessarily experts at the underlying API design. To
> > unlock the efficiency of open source, contributor morale needs to be
> > considered and classes that pass tests should really be accepted.
>
> I agree that we should try to be friendly and encouraging and I
> apologize if we have not been so.  That said, the process of
> contributing here is not just tossing patches over the wall.  First
> you need to get community support for the ideas.  Then work
> collaboratively to get patches that work for the code and community.
> >
> > For example, performance AND accuracy improvements to the existing
> > algorithm were demonstrated for AdaptiveQuadrature in my patches to MATH-995.
>
> Sorry, I was not convinced by the accuracy and performance claims
> and, as I said above, I suspect that the change of variable approach
> may not be the best way to handle improper integrals.  I am not
> claiming authority here - just - again - asking for real numerical
> analysis arguments to support the claims you are making.
>
> It would be a lot better if we focused discussion on the actual
> technical issues and mathematical principles rather than
> generalities about how hard / easy it is to get stuff in.
>
> Phil
> > The only joy I got out of that was Gilles putting a comment in the docs for
> > the existing class:
> > "The Javadoc now draws attention that the [existing] algorithm is not 100%
> > fool-proof."!
> > Also, I was asked to open a new issue about Adaptive Quadratures to figure
> > out what is the best quadrature method ... all while a patch that is a clear
> > improvement over existing code wastes away. Why not accept the patch and
> > make improvements as necessary?
> >
> > My impression since that patch was rejected is that it just seems like a
> > huge hurdle to get any patch past the API design requirements, which are
> > frankly not as clear to me as they are to the designer. I can see how
> > others feel the same way.
> >
> > Cheers,
> > Ajo.
> >
> > Gilles: if you don't want to end up spending time developing Gauss-Hermite
> > quadrature or something else you don't really need, perhaps you should
> > consider accepting/modifying code that was shown to work by someone who
> > needed that functionality. It is reasonable to develop alternatives to fix
> > flaws/gaps, but otherwise it's your effort wasted.  If someone's
> > contribution doesn't fit your view of the API, then by all means edit the
> > patch, but if you go about rejecting things that work, there won't be as
> > many contributors to CM.
> >
> >
> >
> >
> >
> >
> > On Thu, Jul 18, 2013 at 10:08 AM, Roger L. Whitcomb <
> > roger.whitc...@actian.com> wrote:
> >
> >> As an outsider listening to these discussions, it seems like:
> >> a) *IF* there are problems with the current arrangement of packages, APIs,
> >> or whatever, then a constructive approach would be for the one who sees
> >> such problems to take the time to not just criticize and point out "flaws",
> >> but to dig in and rearrange the packages, redo the APIs, provide unit
> >> tests, and submit a patch with these changes, along with quantitative
> >> justification, benchmarks, test cases, etc.  It is quite easy to criticize,
> >> from the sidelines, the one who is actually doing the work, but quite
> >> another matter to roll up your sleeves and join in the work....
> >> b) Since Math is a "library", it seems like there needs to be
> >> implementations of many different algorithms, since (quite clearly) not
> >> every algorithm is suited to every problem.  To say that X method doesn't
> >> work well for problem Y, is not necessarily a reason to rewrite X method,
> >> if that method is correctly implementing the algorithm.  Maybe the
> >> algorithm is simply not the right one to use for the problem.
> >> c) Comments that imply (or state outright) that someone who has (clearly)
> >> done a lot of work has done it "...without much thinking..." are clearly
> >> out of line.  In my experience, the only reason to resort to name calling
> >> and character assassination is because one has no worthy arguments to put
> >> forward.
> >> d) Kudos to the Commons committers who have been doing the work ...
> >>
> >> My 2 cents...
> >>
> >> ~Roger Whitcomb
> >> Apache Pivot PMC Chair
> >>
> >> -----Original Message-----
> >> From: Gilles [mailto:gil...@harfang.homelinux.org]
> >> Sent: Thursday, July 18, 2013 9:35 AM
> >> To: dev@commons.apache.org
> >> Subject: Re: [Math] Cleaning up the curve fitters
> >>
> >> On Thu, 18 Jul 2013 11:47:03 -0400, Konstantin Berlin wrote:
> >>> I appreciate the comment. I would like to help, but currently my
> >>> schedule is full. Maybe towards the end of the year.
> >>>
> >>> I think the first approach should be to do no harm. The optimization
> >>> package keeps getting refactored every few months without much
> >>> thinking involved. We had this discussion previously, with Gilles
> >>> unilaterally deciding on the current tree, which he now wants to
> >>> change again.
> >> As I said,
> >> as Luc said,
> >> as Phil said,
> >> again and again and again,
> >> we are not optimization (as a scientific field) experts here, but we do
> >> use Commons Math in scientific code that is pretty compute intensive (and
> >> yes, maybe not in the same sense as you'd like it to be for your comfort).
> >> Current code has had, and may still have, problems, but we see them only
> >> through running unit tests, running our applications, and running code
> >> examples submitted by issue reporters.
> >> We improve what we can, given time and motivation constraints.
> >> Other than that, there is nothing.
> >>
> >> Yes, we already had that asymmetrical conversation where _you_ declare
> >> what _we_ should do.
> >>
> >>> As someone who uses optimization regularly, I would say the current API
> >>> state (not just package naming) leaves a lot to be desired, and is not
> >>> amenable to the various modifications that people might need for larger
> >>> problems. So if you are going to modify it, you should at least open
> >>> up the API to the possibility that different optimization steps can be
> >>> done using various techniques, depending on the problem.
> >>>
> >>> We should also accept that not everything can fit neatly into a
> >>> package tree and a single set of APIs. A good example is least
> >>> squares. Linear least squares does not require an initial guess at a
> >>> solution, and by performing decomposition ahead of time you can
> >>> quickly recompute the solution given different input values. However,
> >>> an iterative least squares method might not have these properties.
> >>> There are probably countless other examples.
> >>>
> >>> Because optimization problems are really computationally hard, all the
> >>> little specific differences matter; that is why Gilles' approach of
> >>> sweeping everything under the rug and into some rigid, not-thought-out
> >>> hierarchical API forces these methods to adapt (or drop) numerical
> >>> aspects that should not be there (e.g. polynomial fits). This has
> >>> *huge* performance implications, but the issue is treated as some OO
> >>> design 101 class, with the focus on how to force everything into a
> >>> simple inheritance structure, numerics be damned.
> >>>
> >>> I would gladly help with the feedback when I can. Ajo and I provided
> >>> code for adaptive integration, yet the whole issue was completely
> >>> ignored. So I am not sure how much effort is required for the
> >>> developers to take an idea or mostly completed code and make a change,
> >>> rather than reject even the most basic numerical approaches that are
> >>> taught in introduction classes as something that needs to be
> >>> benchmarked.
> >> As usual, you are mixing everything, from algorithms to implementations,
> >> from proposing new features to denigrating existing ones (with non-existent
> >> or inappropriate use-cases), from numerical to efficiency considerations...
> >> [On top of it, you blatantly affirm that this issue has been ignored, even
> >> as I provided[1] an analysis[2] of what was actually happening.
> >> People like you seem to ignore that we work benevolently on this project!]
> >> Not even speaking of derogatory remarks like "sweeping [...] under the rug"
> >> and "not thought out" and insinuating that everything was better and more
> >> efficient before. Which is simply not true.
> >>
> >> It's an asymmetrical discussion because you declare that half-baked code
> >> is good enough and _we_ have to work even more than if we had to
> >> implement the feature from scratch.
> >>
> >>
> >> Gilles
> >>
> >> [1] In the spare time I do _not_ have either.
> >> [2] Which dragged me to the implementation of the Gauss-Hermite quadrature
> >>      scheme (although I had no personal use of it), which seems to be the
> >>      appropriate way to deal with the improper integral reported in the
> >>      issue which you refer to.
> >>
> >>
> >>
>
>
>
>
