On 7/5/12 2:24 PM, Devl Devel wrote:
> Hi All,

Welcome!

>
> Below is a proposal for a new feature:
>
> *A concise description of the new feature / enhancement*
>
> I propose a new feature to implement Kendall's Tau, which is a measure
> of association/correlation between ranked ordinal data.
>
> *References to definitions and algorithms.*
>
> A basic description is available at
> http://en.wikipedia.org/wiki/Kendall_tau_rank_correlation_coefficient however
> the test implementation will follow that defined by "Handbook of Parametric
> and Nonparametric Statistical Procedures, Fifth Edition, Page 1393 Test 30,
> ISBN-10: 1439858012 | ISBN-13: 978-1439858011."
>
> The algorithm is proposed as follows.
>
> Given two rankings or permutations represented by a 2D matrix; columns
> indicate rankings (e.g. by an individual) and rows are observations of each
> rank. The algorithm is to calculate the total number of concordant pairs of
> ranks (between columns), discordant pairs of ranks (between columns) and
> calculate the Tau defined as
>
> tau = (number of concordant - number of discordant) / (n(n-1)/2)
> where n(n-1)/2 is the total number of possible pairs of ranks.
>
> The method will then output the tau value between -1 and 1, where 1 signifies
> a "perfect" correlation between the two ranked lists.
>
> Where ties exist within a ranking it is marked as neither concordant nor
> discordant in the calculation. An optional merge sort can be used to speed
> up the implementation. Details are in the wiki page.
>
> *Some indication of why the addition / enhancement is practically useful*
>
> Although this implementation is not particularly complex it would be useful
> to have it in a consistent format in the commons math package in addition
> to existing correlation tests. Kendall's Tau is used effectively in
> comparing ranks for products, rankings from search engines or measurements
> from engineering equipment.
>
> This is my first post on this list, I tried to follow the guidelines but
> let me know if I need to elaborate.
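
To make the quoted algorithm concrete, here is a minimal Java sketch of the pairwise computation: count concordant and discordant pairs, skip pairs tied in either ranking, and divide the difference by n(n-1)/2. The class name KendallsTauSketch and the method tau(double[], double[]) are hypothetical names chosen for illustration; they are not existing Commons Math API, and taking two arrays rather than a 2D matrix is a simplifying assumption. With this formula the result ranges from -1 (complete disagreement) to 1 (complete agreement).

```java
/** Illustrative sketch only; hypothetical class, not part of Commons Math. */
final class KendallsTauSketch {

    private KendallsTauSketch() { }

    /**
     * Computes tau = (concordant - discordant) / (n(n-1)/2) by checking
     * every pair of observations. Pairs tied in either array are counted
     * as neither concordant nor discordant.
     */
    static double tau(double[] x, double[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        final int n = x.length;
        if (n < 2) {
            throw new IllegalArgumentException("need at least two observations");
        }
        long concordant = 0;
        long discordant = 0;
        for (int i = 0; i < n - 1; i++) {
            for (int j = i + 1; j < n; j++) {
                // Positive product: both rankings order the pair the same way.
                // Negative product: they disagree. Zero: tie in x or in y.
                final int s = Double.compare(x[i], x[j]) * Double.compare(y[i], y[j]);
                if (s > 0) {
                    concordant++;
                } else if (s < 0) {
                    discordant++;
                }
            }
        }
        // Cast before multiplying so large n cannot overflow int arithmetic.
        final double totalPairs = (double) n * (n - 1) / 2.0;
        return (concordant - discordant) / totalPairs;
    }
}
```

For example, tau(new double[] {1, 2, 3, 4}, new double[] {1, 3, 2, 4}) sees five concordant pairs and one discordant pair, giving (5 - 1) / 6, or roughly 0.667. The double loop is O(n^2); the merge sort variant mentioned in the proposal would reduce that to O(n log n) and is left out here for brevity.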

I think a Kendall's Tau implementation would make a great addition to the
correlation package (o.a.c.math3.stat.correlation). Here is how you can get
started:

0) Get yourself set up to build commons math and run the unit tests. If you
   are familiar with maven, this should not be too hard. If you have any
   questions or run into problems checking out the sources, building locally,
   etc., don't hesitate to ask.

1) Look at the Spearman's implementation and the ranking classes in the
   stat.ranking package. That might give you some ideas on how to implement
   Kendall's consistently.

2) Open a JIRA ticket with the info above and start attaching patches
   containing the new implementation class and associated test class. Run
   "mvn site" or checkstyle standalone to make sure your contributed code
   follows the style guidelines we use.

3) Be patient but persistent and we will get Kendall's Tau into commons math :)

Thanks in advance!

Phil

>
> Regards
> Dev