Re: [Math] Cleaning up the curve fitters

Ted Dunning Fri, 19 Jul 2013 12:22:53 -0700

The discussion about how to get something into commons when it is (a) well
documented and (b) demonstrably better on at least some domains is
partially procedural, but it hinges on technical factors.


I think that Ajo is being very reserved here.  When I faced similar
discouragement in the past with commons math contributions, I simply went
elsewhere.

It still seems to me that it would serve CM well to pay more attention to
Ajo's comments and suggestions.  Simply saying that we should focus on
technical discussion when CM's list is filled with esthetic arguments
really just sounds like a way of pushing people away.


On Fri, Jul 19, 2013 at 10:21 AM, Phil Steitz <[email protected]> wrote:

> As I said above, let's focus on actual technical discussion here.
> We implement standard, well-documented algorithms.  We need to
> provide references and convince ourselves that what we release is
> numerically sound, well-documented and well-tested.  We do our best
> with the volunteer resources we have.  Your help and contributions
> are appreciated.
>
> Phil
>
> On 7/19/13 9:44 AM, Ajo Fod wrote:
> > Hi,
> >
> > I very much appreciate the work that has been done in CM and this is
> > precisely why I'd like more people to contribute. Even when you didn't
> > accept my MATH-995 patch, I got useful input from Konstantin and it has
> > already made my application more efficient.
> >
> > What you required of me in the Improper integral example was a comparison
> > of different methods. This sort of research takes time. I hear that Gilles
> > is working on it. I appreciate that you guys spent so much effort on this.
> >
> > However, my contention is that your efforts at researching alternate
> > solutions to a patch are not justified until you come up with a test that
> > the patch fails OR you know an alternative that performs better for an
> > application you have. In the first case, you're losing the efficiency of
> > open source by reinventing a possibly different wheel without sufficient
> > marginal reward. In the second case, beware of the fact that numerical
> > algorithms are hairy beasts, and it takes a while to encode something new.
> > The efficiency of commons comes from putting the burden of development on
> > the developers who need the code.
> >
> > So, I propose an alternate approach to testing whether a submitted patch
> > should be accepted:
> > 1. Check if the patch fills a gap in existing CM code
> > 2. if so, check if it passes known tests
> > 3. if so, write up alternate tests to see if the code breaks.
> > 4. if it does not break, wrap the code up in a suitable API and accept the
> > patch
> >
> > This has two advantages. First, CM will have more capabilities per unit of
> > your precious time. Second, you give people the feeling that they are
> > making a difference.
> >
> > As far as the debate on AQ(AdaptiveQuadrature) vs
> > LGQ(IterativeLegendreGaussIntegrator) goes:
> > The FACTS that support AQ over LGQ are:
> > 1. An example where LGQ failed and AQ succeeded. I also explained why LGQ
> > fails and AQ will probably converge more correctly. Generally, adaptive
> > quadratures are known to be so successful at integration that Konstantin
> > even wondered why we don't have something yet.
> > 2. Efficiency improvement: I also showed that AQ is more efficient in at
> > least one example in terms of accuracy in digits per function evaluation.
> > So, conversely, it's now your turn to provide concrete examples where LGQ
> > does better than AQ. You could pose credible objections by providing
> > examples where:
> > 1. AQ fails but LGQ passes.
> > 2. LGQ is more efficient in accuracy per evaluation.
> >
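To make the "accuracy per function evaluation" criterion concrete, here is a minimal sketch of an adaptive Simpson rule that also counts integrand evaluations. It is written in Python for brevity and is only an illustration: it is not the AdaptiveQuadrature class from the MATH-995 patch and not Commons Math code, and the name `adaptive_simpson` and its parameters are assumptions. The rule subdivides an interval only where the local error estimate exceeds its share of the tolerance.

```python
import math


def adaptive_simpson(f, a, b, tol=1e-10, max_depth=50):
    """Integrate f over [a, b]; return (estimate, number of evaluations of f)."""
    evals = 0

    def g(x):
        # Wrap f so every integrand evaluation is counted.
        nonlocal evals
        evals += 1
        return f(x)

    def simpson(fa, fm, fb, lo, hi):
        # Basic three-point Simpson rule on [lo, hi].
        return (hi - lo) / 6.0 * (fa + 4.0 * fm + fb)

    def recurse(lo, hi, fa, fm, fb, whole, local_tol, depth):
        mid = 0.5 * (lo + hi)
        flm = g(0.5 * (lo + mid))
        frm = g(0.5 * (mid + hi))
        left = simpson(fa, flm, fm, lo, mid)
        right = simpson(fm, frm, fb, mid, hi)
        diff = left + right - whole
        # Accept when the two-panel and one-panel estimates agree closely,
        # or when the depth budget is exhausted.
        if depth <= 0 or abs(diff) <= 15.0 * local_tol:
            return left + right + diff / 15.0
        # Otherwise split the interval and give each half half the tolerance.
        return (recurse(lo, mid, fa, flm, fm, left, local_tol / 2.0, depth - 1)
                + recurse(mid, hi, fm, frm, fb, right, local_tol / 2.0, depth - 1))

    fa, fm, fb = g(a), g(0.5 * (a + b)), g(b)
    whole = simpson(fa, fm, fb, a, b)
    return recurse(a, b, fa, fm, fb, whole, tol, max_depth), evals
```

For instance, `adaptive_simpson(math.sqrt, 0.0, 1.0, tol=1e-8)` refines mostly near 0, where the derivative of the integrand is unbounded. The returned evaluation count can then be compared with what a fixed-order rule needs to reach the same accuracy, which is the kind of evidence both sides of this thread are asking for.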
> > All that is to illustrate, with an example, where the perception arises
> > that it is hard to convince the gatekeepers of commons of the merits of a
> > patch. I have a package in my codebase with assorted patches that I just
> > don't think is worth the time to try to post to commons. I think it is
> > very inefficient if others have such private patches.
> >
> > Cheers,
> > Ajo
> >
> >
> >
> >
> >
> >
> >
> > On Thu, Jul 18, 2013 at 2:15 PM, Phil Steitz <[email protected]> wrote:
> >
> >> On 7/18/13 1:48 PM, Ajo Fod wrote:
> >>> Hello folks,
> >>>
> >>> There is a lot of work in API design. However, Konstantin's point is that
> >>> it takes a lot of effort to convince Gilles of any alternatives. API design
> >>> issues should really be second to functionality. This idea seems to be lost
> >>> in conversations.
> >> With patience and collaboration you can have both and we *need* to
> >> have both.  You can't get to a stable API and approachable and
> >> maintainable code base without thinking carefully about API design.
> >>> I agree with Gilles that providing tests and benchmarks that exhibit the
> >>> advantages of a particular method is probably the best way to show other
> >>> contributors the value of an alternative approach.
> >> There is some value to this, but honestly much more value in
> >> carefully researching and presenting the numerical analysis to
> >> support improvement / performance claims.
> >>> It is quite depressing to the contributor to see one's contribution be
> >>> rejected when efficiency/accuracy improvements are demonstrated.
> >> What you "demonstrated" in one case was better performance in one
> >> problem instance.  The change of variable approach you implemented
> >> was, in my admittedly possibly naive numerics view, questionable.  I
> >> asked to see numerical analysis support and no one provided that.
> >> Had you provided that, I would have argued to include some version
> >> of the patch.
> >>
> >>> In a better world, rejecting a patch that passes the hurdle of
> >>> demonstrating an efficiency improvement over existing code should come with
> >>> a responsibility of showing alternate tests that the patch fails and the
> >>> original code passes. Otherwise, the patch should be accepted by default.
> >>> The person who commits or designed the API is free to make changes to fit
> >>> API design.
> >> This is essentially what Gilles ended up doing.  You may not agree
> >> with the approach, but he did in fact address the core issue.
> >>> Just like API designers are not experts at the underlying math,
> >>> contributors are not necessarily experts at the underlying API design. To
> >>> unlock the efficiency of open source, contributor morale needs to be
> >>> considered and classes that pass tests should really be accepted.
> >> I agree that we should try to be friendly and encouraging and I
> >> apologize if we have not been so.  That said, the process of
> >> contributing here is not just tossing patches over the wall.  First
> >> you need to get community support for the ideas.  Then work
> >> collaboratively to get patches that work for the code and community.
> >>> For example, performance AND accuracy improvements to the existing
> >>> algorithm were demonstrated for AdaptiveQuadrature in my patches to
> >>> MATH-995.
> >> Sorry, I was not convinced by the accuracy and performance claims
> >> and, as I said above, I suspect that the change of variable approach
> >> may not be the best way to handle improper integrals.  I am not
> >> claiming authority here - just - again - asking for real numerical
> >> analysis arguments to support the claims you are making.
> >>
> >> It would be a lot better if we focused discussion on the actual
> >> technical issues and mathematical principles rather than
> >> generalities about how hard / easy it is to get stuff in.
> >>
> >> Phil
> >>> The only joy I got out of that was Gilles putting a comment in the docs for
> >>> the existing class:
> >>> "The Javadoc now draws attention that the [existing] algorithm is not 100%
> >>> fool-proof."!
> >>> Also, I was asked to open a new issue about Adaptive Quadratures to figure
> >>> out what is the best quadrature method ... all while a patch that is a clear
> >>> improvement over existing code wastes away. Why not accept the patch and
> >>> make improvements as necessary?
> >>>
> >>> My impression since that patch was rejected is that it just seems like a
> >>> huge hurdle to get any patch past the API design requirements, which are
> >>> frankly not as clear to me as they are to the designer. I can see how others
> >>> feel the same way.
> >>>
> >>> Cheers,
> >>> Ajo.
> >>>
> >>> Gilles: if you don't want to end up spending time developing Gauss-Hermite
> >>> quadrature or something else you don't really need, perhaps you should
> >>> consider accepting/modifying code that was shown to work by someone who
> >>> needed that functionality. It is reasonable to develop alternatives to fix
> >>> flaws/gaps, but otherwise it's your effort wasted.  If someone's
> >>> contribution doesn't fit your view of the API, then by all means edit the
> >>> patch, but if you go about rejecting things that work, there won't be as
> >>> many contributors to CM.
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> On Thu, Jul 18, 2013 at 10:08 AM, Roger L. Whitcomb <[email protected]> wrote:
> >>>
> >>>> As an outsider listening to these discussions, it seems like:
> >>>> a) *IF* there are problems with the current arrangement of packages, APIs,
> >>>> or whatever, then a constructive approach would be for the one who sees
> >>>> such problems to take the time to not just criticize and point out "flaws",
> >>>> but to dig in and rearrange the packages, redo the APIs, provide unit
> >>>> tests, and submit a patch with these changes, along with quantitative
> >>>> justification, benchmarks, test cases, etc.  It is quite easy to criticize,
> >>>> from the sidelines, the one who is actually doing the work, but quite
> >>>> another matter to roll up your sleeves and join in the work....
> >>>> b) Since Math is a "library", it seems like there need to be
> >>>> implementations of many different algorithms, since (quite clearly) not
> >>>> every algorithm is suited to every problem.  To say that X method doesn't
> >>>> work well for problem Y is not necessarily a reason to rewrite X method,
> >>>> if that method is correctly implementing the algorithm.  Maybe the
> >>>> algorithm is simply not the right one to use for the problem.
> >>>> c) Comments that imply (or state outright) that someone who has (clearly)
> >>>> done a lot of work has done it "...without much thinking..." are clearly
> >>>> out of line.  In my experience, the only reason to resort to name calling
> >>>> and character assassination is because one has no worthy arguments to put
> >>>> forward.
> >>>> d) Kudos to the Commons committers who have been doing the work ...
> >>>>
> >>>> My 2 cents...
> >>>>
> >>>> ~Roger Whitcomb
> >>>> Apache Pivot PMC Chair
> >>>>
> >>>> -----Original Message-----
> >>>> From: Gilles [mailto:[email protected]]
> >>>> Sent: Thursday, July 18, 2013 9:35 AM
> >>>> To: [email protected]
> >>>> Subject: Re: [Math] Cleaning up the curve fitters
> >>>>
> >>>> On Thu, 18 Jul 2013 11:47:03 -0400, Konstantin Berlin wrote:
> >>>>> I appreciate the comment. I would like to help, but currently my
> >>>>> schedule is full. Maybe towards the end of the year.
> >>>>>
> >>>>> I think the first approach should be do no harm. The optimization
> >>>>> package keeps getting refactored every few months without much
> >>>>> thinking involved. We had this discussion previously, with Gilles
> >>>>> unilaterally deciding on the current tree, which he now wants to
> >>>>> change again.
> >>>> As I said,
> >>>> as Luc said,
> >>>> as Phil said,
> >>>> again and again and again,
> >>>> we are not optimization (as a scientific field) experts here, but we do
> >>>> use Commons Math in scientific code that is pretty compute intensive (and
> >>>> yes, maybe not in the same sense as you'd like it to be for your comfort).
> >>>> Current code has had, and may still have, problems, but we see them only
> >>>> through running unit tests, running our applications, running code examples
> >>>> submitted by issue reporters.
> >>>> We improve what we can, given time and motivation constraints.
> >>>> Other than that, there is nothing.
> >>>>
> >>>> Yes, we already had that asymmetrical conversation where _you_ declare
> >>>> what _we_ should do.
> >>>>
> >>>>> As someone who uses optimization regularly, I would say the current API
> >>>>> state (not just package naming) leaves a lot to be desired, and is not
> >>>>> amenable to the various modifications that people might need for larger
> >>>>> problems. So if you are going to modify it, you should at least open
> >>>>> up the API to the possibility that different optimization steps can be
> >>>>> done using various techniques, depending on the problem.
> >>>>>
> >>>>> We should also accept that not everything can fit neatly into a
> >>>>> package tree and a single set of APIs. A good example is least
> >>>>> squares. Linear least squares does not require an initial guess at a
> >>>>> solution, and by performing decomposition ahead of time you can
> >>>>> quickly recompute the solution given different input values. However,
> >>>>> an iterative least squares method might not have these properties.
> >>>>> There are probably countless other examples.
> >>>>>
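The least-squares point above (no initial guess, and a decomposition computed once that is reused for new observations) can be illustrated with a small sketch. It is written in Python for brevity, is not Commons Math code, and the class name `LinearLeastSquares` is hypothetical. It solves the normal equations with a Cholesky factor purely to keep the example short; a QR or SVD decomposition is usually the numerically safer choice.

```python
import math


def cholesky(m):
    """Lower-triangular L with L * L^T == m, for a symmetric positive definite m."""
    n = len(m)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = m[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


class LinearLeastSquares:
    """Factor the design matrix once, then solve for many observation vectors."""

    def __init__(self, design):
        self.design = design
        cols = len(design[0])
        # Gram matrix A^T A, factored a single time up front.
        gram = [[sum(row[i] * row[j] for row in design) for j in range(cols)]
                for i in range(cols)]
        self.low = cholesky(gram)

    def solve(self, y):
        n = len(self.low)
        # Right-hand side A^T y for this particular observation vector.
        rhs = [sum(row[i] * yi for row, yi in zip(self.design, y)) for i in range(n)]
        # Forward substitution: L z = A^T y.
        z = [0.0] * n
        for i in range(n):
            z[i] = (rhs[i] - sum(self.low[i][k] * z[k] for k in range(i))) / self.low[i][i]
        # Back substitution: L^T x = z.
        x = [0.0] * n
        for i in reversed(range(n)):
            x[i] = (z[i] - sum(self.low[k][i] * x[k] for k in range(i + 1, n))) / self.low[i][i]
        return x
```

Fitting a straight line through the same abscissae for two data sets then costs one factorization and two cheap solves: build `LinearLeastSquares([[1, 0], [1, 1], [1, 2], [1, 3]])` once and call `solve` with each observation vector. An iterative nonlinear solver has no equivalent of this reuse and does need a starting point, which is why a single shared API fits the two cases poorly.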
> >>>>> Because optimization problems are really computationally hard, all the
> >>>>> little specific differences matter. That is why Gilles's approach of
> >>>>> sweeping everything under the rug and into some rigid, not thought out
> >>>>> hierarchical API forces these methods to adapt (or drop) numerical
> >>>>> aspects that should not be there (e.g. polynomial fits). This has
> >>>>> *huge* performance implications, but the issue is treated as some OO
> >>>>> design 101 class, with the focus on how to force everything into a
> >>>>> simple inheritance structure, numerics be damned.
> >>>>>
> >>>>> I would gladly help with the feedback when I can. Ajo and I provided
> >>>>> code for adaptive integration, yet the whole issue was completely
> >>>>> ignored. So I am not sure how much effort is required for the
> >>>>> developers to take an idea or mostly completed code and make a change,
> >>>>> rather than reject even the most basic numerical approaches that are
> >>>>> taught in introduction classes as something that needs to be
> >>>>> benchmarked.
> >>>> As usual, you are mixing everything, from algorithms to implementations,
> >>>> from proposing new features to denigrating existing ones (with non-existent
> >>>> or inappropriate use-cases), from numerical to efficiency considerations...
> >>>> [On top of it, you blatantly affirm that this issue has been ignored, even
> >>>> as I provided[1] an analysis[2] of what was actually happening.
> >>>> People like you seem to ignore that we work benevolently on this project!]
> >>>> Not even speaking of derogatory remarks like "sweeping [...] under the rug"
> >>>> and "not thought out" and insinuating that everything was better and more
> >>>> efficient before. Which is simply not true.
> >>>>
> >>>> It's an asymmetrical discussion because you declare that half-baked code
> >>>> is good enough and _we_ have to work even more than if we'd have to
> >>>> implement the feature from scratch.
> >>>>
> >>>>
> >>>> Gilles
> >>>>
> >>>> [1] In the spare time I do _not_ have either.
> >>>> [2] Which dragged me to the implementation of the Gauss-Hermite quadrature
> >>>>      scheme (although I had no personal use of it), which seems to be the
> >>>>      appropriate way to deal with the improper integral reported in the
> >>>>      issue which you refer to.
> >>>>
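As background on why Gauss-Hermite quadrature is a natural candidate for this kind of improper integral: it approximates the integral of exp(-x^2) * g(x) over the whole real line by a weighted sum of g at fixed nodes, so no change of variable or truncation of the range is needed. The sketch below is in Python for brevity, uses the standard three-point rule (exact for polynomial g up to degree 5), and is not the Commons Math implementation; the function name `gauss_hermite_3` is an assumption, and the integrand from the issue is not reproduced here.

```python
import math

# Three-point Gauss-Hermite rule for the weight function exp(-x^2):
# nodes are the roots of the Hermite polynomial H_3(x) = 8x^3 - 12x.
NODES = (-math.sqrt(1.5), 0.0, math.sqrt(1.5))
WEIGHTS = (
    math.sqrt(math.pi) / 6.0,
    2.0 * math.sqrt(math.pi) / 3.0,
    math.sqrt(math.pi) / 6.0,
)


def gauss_hermite_3(g):
    """Approximate the integral of exp(-x^2) * g(x) over (-inf, +inf)."""
    return sum(w * g(x) for x, w in zip(NODES, WEIGHTS))
```

With `g(x) = x**2` the rule reproduces the exact value sqrt(pi)/2 up to rounding. Integrands that are not close to a Gaussian times a low-degree polynomial need more nodes or a different scheme, and that is the regime where the adaptive approach argued for elsewhere in this thread would have to be compared.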
> >>>>
> >>>> ---------------------------------------------------------------------
> >>>> To unsubscribe, e-mail: [email protected]
> >>>> For additional commands, e-mail: [email protected]
> >>>>
> >>>>
> >>
> >>
> >>
>
>
>
>
