On 9/7/12 9:22 AM, marios michaelidis wrote:
> Hi Gilles,
> I will start exploring the links you gave me.
> I would suggest that logistic/probit regression go under the regression
> package.

Yes, these should go in regression.  Thanks in advance for your
contribution!

Phil
> Not that clustering is really any different, but it makes sense to find
> logistic "regression" in a package named as such.
> Regards
> Marios
>
>> Date: Fri, 7 Sep 2012 17:48:12 +0200
>> From: gil...@harfang.homelinux.org
>> To: dev@commons.apache.org
>> Subject: Re: [math] Logistic, Probit regerssion and Tolerance checks
>>
>> Hi.
>>
>>> My name is Marios. I have a strong academic background and have worked
>>> as a modeling analyst on large projects, so I have experience with
>>> prediction and optimization algorithms.
>>>
>> Welcome to Commons Math's forum.
>>  
>>> Recently (five months ago), I started learning Java, and I have made my
>>> life much simpler by using Java and Commons Math rather than depending
>>> on the usual packages (SAS, SPSS, etc.). Obviously, I owe Commons Math
>>> a lot.
>> That's good to read.
>>  
>>> I have noticed that the library does not have logistic regression or
>>> probit regression, which are very commonly used in classification
>>> problems. Additionally, the math package does not provide a way to
>>> assess tolerance (or VIF), which is very commonly used to detect
>>> multicollinearity and singular matrices before running optimization
>>> algorithms.
>>>
>>>  
>>>
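As a rough illustration of the tolerance (VIF) check mentioned above, here
is a minimal sketch in plain Java. It relies on the identity that the VIF of
predictor j is the j-th diagonal entry of the inverse of the predictors'
correlation matrix, with tolerance = 1 / VIF. The names ToleranceSketch, vif
and invert are hypothetical; this is neither the code offered in this thread
nor an existing Commons Math API.

```java
final class ToleranceSketch {

    /** Returns the VIF of each column of x (rows are observations). */
    static double[] vif(double[][] x) {
        int n = x.length, p = x[0].length;
        double[] mean = new double[p];
        double[] norm = new double[p];
        for (double[] row : x)
            for (int j = 0; j < p; j++) mean[j] += row[j] / n;
        for (double[] row : x)
            for (int j = 0; j < p; j++) {
                double d = row[j] - mean[j];
                norm[j] += d * d;
            }
        for (int j = 0; j < p; j++) {
            if (norm[j] == 0.0) throw new IllegalArgumentException("constant column " + j);
            norm[j] = Math.sqrt(norm[j]);
        }
        // Pearson correlation matrix of the predictors.
        double[][] corr = new double[p][p];
        for (double[] row : x)
            for (int j = 0; j < p; j++)
                for (int k = 0; k < p; k++)
                    corr[j][k] += (row[j] - mean[j]) * (row[k] - mean[k]) / (norm[j] * norm[k]);
        double[][] inv = invert(corr);
        double[] out = new double[p];
        for (int j = 0; j < p; j++) out[j] = inv[j][j]; // VIF_j = (R^-1)_jj
        return out;
    }

    /** Gauss-Jordan inversion with partial pivoting; throws if (near) singular. */
    static double[][] invert(double[][] a) {
        int n = a.length;
        double[][] m = new double[n][2 * n];
        for (int i = 0; i < n; i++) {
            System.arraycopy(a[i], 0, m[i], 0, n);
            m[i][n + i] = 1.0; // augment with the identity matrix
        }
        for (int col = 0; col < n; col++) {
            int piv = col;
            for (int r = col + 1; r < n; r++)
                if (Math.abs(m[r][col]) > Math.abs(m[piv][col])) piv = r;
            if (Math.abs(m[piv][col]) < 1e-12)
                throw new ArithmeticException("collinear predictors: singular correlation matrix");
            double[] tmp = m[col]; m[col] = m[piv]; m[piv] = tmp;
            double d = m[col][col];
            for (int k = 0; k < 2 * n; k++) m[col][k] /= d;
            for (int r = 0; r < n; r++) {
                if (r == col) continue;
                double f = m[r][col];
                for (int k = 0; k < 2 * n; k++) m[r][k] -= f * m[col][k];
            }
        }
        double[][] inv = new double[n][n];
        for (int i = 0; i < n; i++) System.arraycopy(m[i], n, inv[i], 0, n);
        return inv;
    }
}
```

A commonly quoted rule of thumb treats a VIF above roughly 10 (tolerance
below 0.1) as a warning sign, although the threshold is a judgment call.
Perfectly collinear columns make the correlation matrix singular, which this
sketch reports as an exception rather than a number.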
>>> I am willing to provide complete logistic and probit regression
>>> algorithms, fitted by maximum likelihood using Newton-Raphson
>>> optimization, with an easy-to-use programmatic interface
>>> (e.g. regression(double matrix[][], double Target[], String Constant,
>>> double precision, double tolerance)), with academic references, and
>>> very fast (3 seconds for a 60k set), with getter methods for all the
>>> common statistics, such as null deviance, deviance, AIC, BIC,
>>> chi-square of the model, betas, Wald statistics and p-values,
>>> Cox-Snell R-square, Nagelkerke’s R-square, pseudo-R2, residuals,
>>> probabilities, and the classification matrix.
>> Such contributions would certainly be most welcome.
>>
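For readers unfamiliar with the method, the Newton-Raphson maximum-likelihood
fit for logistic regression discussed above can be sketched in plain Java as
follows. This is only an illustration under stated assumptions: the names
LogisticNewtonSketch, fit and solve and the parameter list are hypothetical,
an intercept is always included, and the exact Hessian is used (the proposal
in this thread mentions an approximated Hessian instead). It is not the
contributed code and not a Commons Math API.

```java
final class LogisticNewtonSketch {

    /**
     * Fits P(y=1|x) = 1 / (1 + exp(-(b0 + b.x))) by Newton-Raphson.
     * Returns the coefficients with the intercept first.
     */
    static double[] fit(double[][] x, double[] y, int maxIter, double precision) {
        int n = x.length, p = x[0].length + 1;
        double[] beta = new double[p]; // start from all zeros
        for (int iter = 0; iter < maxIter; iter++) {
            double[] grad = new double[p];     // gradient of the log-likelihood
            double[][] info = new double[p][p]; // X'WX, the negative Hessian
            for (int i = 0; i < n; i++) {
                double[] xi = new double[p];
                xi[0] = 1.0; // intercept term
                System.arraycopy(x[i], 0, xi, 1, p - 1);
                double eta = 0.0;
                for (int j = 0; j < p; j++) eta += beta[j] * xi[j];
                double mu = 1.0 / (1.0 + Math.exp(-eta));
                double w = mu * (1.0 - mu);
                for (int j = 0; j < p; j++) {
                    grad[j] += (y[i] - mu) * xi[j];
                    for (int k = 0; k < p; k++) info[j][k] += w * xi[j] * xi[k];
                }
            }
            double[] step = solve(info, grad); // Newton step: (X'WX)^-1 X'(y - mu)
            double maxStep = 0.0;
            for (int j = 0; j < p; j++) {
                beta[j] += step[j];
                maxStep = Math.max(maxStep, Math.abs(step[j]));
            }
            if (maxStep < precision) break; // converged
        }
        return beta;
    }

    /** Solves a*z = b by Gaussian elimination with partial pivoting (overwrites a and b). */
    static double[] solve(double[][] a, double[] b) {
        int n = b.length;
        for (int col = 0; col < n; col++) {
            int piv = col;
            for (int r = col + 1; r < n; r++)
                if (Math.abs(a[r][col]) > Math.abs(a[piv][col])) piv = r;
            if (Math.abs(a[piv][col]) < 1e-12)
                throw new ArithmeticException("singular Hessian (collinear or separable data)");
            double[] tmpRow = a[col]; a[col] = a[piv]; a[piv] = tmpRow;
            double tmp = b[col]; b[col] = b[piv]; b[piv] = tmp;
            for (int r = col + 1; r < n; r++) {
                double f = a[r][col] / a[col][col];
                b[r] -= f * b[col];
                for (int k = col; k < n; k++) a[r][k] -= f * a[col][k];
            }
        }
        double[] z = new double[n];
        for (int i = n - 1; i >= 0; i--) {
            double s = b[i];
            for (int k = i + 1; k < n; k++) s -= a[i][k] * z[k];
            z[i] = s / a[i][i];
        }
        return z;
    }
}
```

A call such as LogisticNewtonSketch.fit(x, y, 50, 1e-10) returns the fitted
coefficients. The getter-style statistics listed in the proposal (deviance,
AIC, Wald tests, and so on) would be derived from these coefficients and the
final information matrix, and are left out here to keep the sketch short.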
>> But care must be taken in how to fit those features into Commons Math. I mean
>> that the new implementations should be integrated in the API of similar
>> functionalities, if they currently exist.
>>
>> IIUC, the proposal could be related to code currently in package
>>   org.apache.commons.math3.stat.clustering
>> and/or to the pending improvements suggested in this report:
>>   https://issues.apache.org/jira/browse/MATH-748
>>
>> [By the way, I wonder whether "clustering" should really be under "stat",
>> rather than, say, "optimization" or a package of its own, one level up.]
>>
>> In any case, it might be worth discussing here some design issues, before you
>> start adapting your code. At the same time, you should open tickets on the
>> bug tracking system:
>>   https://issues.apache.org/jira/browse/MATH
>> Preferably, there should be a general request for "New feature"; then
>> several "sub-issues" could be linked to that one, each referring to a
>> specific task (typically a class, with its unit tests).
>>
>>> I have also included steps for checking tolerance so that we avoid
>>> cases that fail to converge. Generally, the algorithm is not very
>>> demanding on RAM (because I have approximated the Hessian matrix), and
>>> the only external jar that I use is Commons Math, for matrix
>>> multiplication.
>> Although the performance issue is certainly important, it is an
>> "implementation detail" that should not preempt a clear API (i.e. one that
>> reflects the mathematical concepts) and the reuse of existing classes (those
>> can be improved at the same time, if your proposal reveals that something is
>> lacking).
>>
>>
>> Thanks for your interest,
>> Gilles
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
>> For additional commands, e-mail: dev-h...@commons.apache.org
>>
>                                         

