On 9/7/12 9:22 AM, marios michaelidis wrote:
> Hi Gilles,
> I will start exploring the links you gave me.
> I would suggest that logistic/probit regression go under the regression
> package.
Yes, these should go in regression. Thanks in advance for your contribution!

Phil

> Not that clustering is really any different, but it makes sense to find
> logistic "regression" in a package named as such.
> Regards
> Marios
>
>> Date: Fri, 7 Sep 2012 17:48:12 +0200
>> From: gil...@harfang.homelinux.org
>> To: dev@commons.apache.org
>> Subject: Re: [math] Logistic, Probit regerssion and Tolerance checks
>>
>> Hi.
>>
>>> My name is Marios. I have a strong academic background and have worked
>>> as a modeling analyst on large projects, so I have experience with
>>> prediction and optimization algorithms.
>>>
>> Welcome to Commons Math's forum.
>>
>>> Recently (5 months ago), I started learning Java, and I have made my
>>> life much simpler by using Java and Commons Math rather than depending
>>> on the common packages (SAS, SPSS, etc.). Obviously, I owe Commons Math
>>> a lot.
>> That's good to read.
>>
>>> I have noticed that the site does not have logistic regression or
>>> probit regression, both very commonly used in classification problems.
>>> Additionally, the math package does not provide a way to assess
>>> Tolerance (or VIF), very commonly used to detect multicollinearity
>>> issues and singular matrices before running optimization algorithms.
>>>
>>> I am willing to provide complete logistic and probit regression
>>> algorithms, fitted by maximum likelihood with Newton-Raphson
>>> optimization, with a programmatically simple interface
>>> (e.g. regression(double matrix [][], double Target[], String
>>> Constant, double precision, double tolerance)), with academic
>>> references, and very fast (3 seconds for a 60k set). There would be
>>> getter methods for all the common statistics, such as null deviance,
>>> deviance, AIC, BIC, chi-square of the model, betas, Wald statistics and
>>> p-values, Cox-Snell R-square, Nagelkerke's R-square, pseudo-R2,
>>> residuals, probabilities, and the classification matrix.
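As a point of reference for the discussion, here is a minimal sketch of the
Newton-Raphson maximum-likelihood fit described in the proposal quoted above.
It is neither the proposed contribution nor an existing Commons Math API: the
class name LogisticNewtonSketch and the methods fit, predict and solve are
hypothetical, it uses the exact Hessian with no approximation, and it returns
only the coefficient estimates, none of the summary statistics.

```java
import java.util.Arrays;

// Hypothetical illustration only; not part of Commons Math.
class LogisticNewtonSketch {

    /** Predicted probability for one observation; beta[0] is the intercept. */
    static double predict(double[] beta, double[] xi) {
        double z = beta[0];
        for (int j = 0; j < xi.length; j++) z += beta[j + 1] * xi[j];
        return 1.0 / (1.0 + Math.exp(-z));
    }

    /** Newton-Raphson iterations on the log-likelihood, starting from beta = 0. */
    static double[] fit(double[][] x, double[] y, int maxIter, double tol) {
        int p = x[0].length + 1;
        double[] beta = new double[p];
        for (int iter = 0; iter < maxIter; iter++) {
            double[] score = new double[p];     // gradient of the log-likelihood
            double[][] info = new double[p][p]; // X'WX, the negative Hessian
            for (int i = 0; i < x.length; i++) {
                double[] row = new double[p];
                row[0] = 1.0;                   // intercept column
                System.arraycopy(x[i], 0, row, 1, p - 1);
                double prob = predict(beta, x[i]);
                double w = prob * (1.0 - prob);
                for (int j = 0; j < p; j++) {
                    score[j] += (y[i] - prob) * row[j];
                    for (int k = 0; k < p; k++) info[j][k] += w * row[j] * row[k];
                }
            }
            double[] step = solve(info, score); // Newton step: info * step = score
            double largest = 0.0;
            for (int j = 0; j < p; j++) {
                beta[j] += step[j];
                largest = Math.max(largest, Math.abs(step[j]));
            }
            if (largest < tol) break;           // converged
        }
        return beta;
    }

    /** Gaussian elimination with partial pivoting; overwrites a and b. */
    static double[] solve(double[][] a, double[] b) {
        int p = b.length;
        for (int col = 0; col < p; col++) {
            int piv = col;
            for (int r = col + 1; r < p; r++)
                if (Math.abs(a[r][col]) > Math.abs(a[piv][col])) piv = r;
            double[] tr = a[col]; a[col] = a[piv]; a[piv] = tr;
            double tb = b[col]; b[col] = b[piv]; b[piv] = tb;
            if (Math.abs(a[col][col]) < 1e-12)
                throw new ArithmeticException("singular information matrix");
            for (int r = col + 1; r < p; r++) {
                double f = a[r][col] / a[col][col];
                b[r] -= f * b[col];
                for (int k = col; k < p; k++) a[r][k] -= f * a[col][k];
            }
        }
        double[] sol = new double[p];
        for (int i = p - 1; i >= 0; i--) {
            double s = b[i];
            for (int k = i + 1; k < p; k++) s -= a[i][k] * sol[k];
            sol[i] = s / a[i][i];
        }
        return sol;
    }

    public static void main(String[] args) {
        // Made-up, non-separable toy data, so the maximum-likelihood estimate exists.
        double[][] x = {{0}, {1}, {2}, {3}, {4}, {5}, {6}, {7}};
        double[] y = {0, 0, 1, 0, 1, 0, 1, 1};
        System.out.println(Arrays.toString(fit(x, y, 25, 1e-10)));
    }
}
```

A singular information matrix (for instance with perfectly collinear
predictors) makes the Newton step undefined, which is the situation the
tolerance check in the proposal is meant to catch beforehand.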
>> Such contributions would certainly be most welcome.
>>
>> But care must be taken in how to fit those features into Commons Math. I
>> mean that the new implementations should be integrated in the API of
>> similar functionalities, if they currently exist.
>>
>> IIUC, the proposal could be related to code currently in package
>> org.apache.commons.math3.stat.clustering
>> and/or to the pending improvements suggested in this report:
>> https://issues.apache.org/jira/browse/MATH-748
>>
>> [By the way, I wonder whether "clustering" should really be under "stat",
>> rather than, say, "optimization" or a package of its own, one level up.]
>>
>> In any case, it might be worth discussing here some design issues, before
>> you start adapting your code. At the same time, you should open tickets on
>> the bug tracking system:
>> https://issues.apache.org/jira/browse/MATH
>> Preferably, there should be a general request for "New feature"; then
>> several "sub-issues" could be linked to that one, each referring to a
>> specific task (typically a class, with its unit tests).
>>
>>> I have also included steps for checking tolerance so that we avoid
>>> cases that fail to converge. Generally, the algorithm does not use much
>>> RAM (because I have approximated the Hessian matrix), and the only
>>> external jar that I use is Commons Math, for matrix multiplication.
>> Although the performance issue is certainly important, it is an
>> "implementation detail" that should not preempt a clear API (i.e. one that
>> reflects the mathematical concepts) and the reuse of existing classes
>> (those can be improved at the same time, if your proposal reveals that
>> something is lacking).
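To make the tolerance check mentioned in the quoted proposal concrete: for
predictor j, tolerance is 1 - R_j^2, where R_j^2 comes from regressing
predictor j on the other predictors, and VIF_j = 1 / tolerance_j. VIF_j also
equals the j-th diagonal entry of the inverse of the predictors' correlation
matrix, which is what the sketch below computes. The class name VifSketch and
its methods are hypothetical and not an existing Commons Math API, and this
is not necessarily how the proposed code performs the check.

```java
import java.util.Arrays;

// Hypothetical illustration only; not part of Commons Math.
class VifSketch {

    /** VIF of each column of x, read off the inverse correlation matrix. */
    static double[] vif(double[][] x) {
        int n = x.length, p = x[0].length;
        double[] mean = new double[p];
        double[] norm = new double[p]; // root of the centered sum of squares
        for (double[] row : x)
            for (int j = 0; j < p; j++) mean[j] += row[j] / n;
        for (double[] row : x)
            for (int j = 0; j < p; j++) norm[j] += (row[j] - mean[j]) * (row[j] - mean[j]);
        for (int j = 0; j < p; j++) norm[j] = Math.sqrt(norm[j]);

        double[][] corr = new double[p][p];
        for (double[] row : x)
            for (int j = 0; j < p; j++)
                for (int k = 0; k < p; k++)
                    corr[j][k] += (row[j] - mean[j]) * (row[k] - mean[k]) / (norm[j] * norm[k]);

        double[][] inv = invert(corr);
        double[] result = new double[p];
        for (int j = 0; j < p; j++) result[j] = inv[j][j]; // tolerance_j = 1 / result[j]
        return result;
    }

    /** Gauss-Jordan inversion with partial pivoting. */
    static double[][] invert(double[][] a) {
        int p = a.length;
        double[][] aug = new double[p][2 * p];
        for (int i = 0; i < p; i++) {
            System.arraycopy(a[i], 0, aug[i], 0, p);
            aug[i][p + i] = 1.0;
        }
        for (int col = 0; col < p; col++) {
            int piv = col;
            for (int r = col + 1; r < p; r++)
                if (Math.abs(aug[r][col]) > Math.abs(aug[piv][col])) piv = r;
            if (Math.abs(aug[piv][col]) < 1e-12)
                throw new ArithmeticException("perfectly collinear predictors");
            double[] tmp = aug[col]; aug[col] = aug[piv]; aug[piv] = tmp;
            double d = aug[col][col];
            for (int k = 0; k < 2 * p; k++) aug[col][k] /= d;
            for (int r = 0; r < p; r++) {
                if (r == col) continue;
                double f = aug[r][col];
                for (int k = 0; k < 2 * p; k++) aug[r][k] -= f * aug[col][k];
            }
        }
        double[][] inv = new double[p][p];
        for (int i = 0; i < p; i++) System.arraycopy(aug[i], p, inv[i], 0, p);
        return inv;
    }

    public static void main(String[] args) {
        // Made-up data: the two columns are strongly correlated, so both VIFs are large.
        double[][] x = {{1, 1}, {2, 2}, {3, 3}, {4, 5}};
        System.out.println(Arrays.toString(vif(x)));
    }
}
```

A common rule of thumb flags predictors whose VIF exceeds some threshold
(values such as 5 or 10 are often quoted) before attempting the fit; the
threshold itself is a modeling choice, not something this thread specifies.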
>>
>>
>> Thanks for your interest,
>> Gilles
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
>> For additional commands, e-mail: dev-h...@commons.apache.org
>>