Hi All,

Below is a proposal for a new feature:

*A concise description of the new feature / enhancement*

I propose adding an implementation of Kendall's tau, a measure of
association (correlation) between two sets of ranked ordinal data.

*References to definitions and algorithms.*

A basic description is available at
http://en.wikipedia.org/wiki/Kendall_tau_rank_correlation_coefficient
However, the test implementation will follow the definition given in
"Handbook of Parametric and Nonparametric Statistical Procedures, Fifth
Edition", page 1393, Test 30, ISBN-10: 1439858012 | ISBN-13: 978-1439858011.

The algorithm is proposed as follows.

The input is two rankings (permutations) represented as a 2D matrix: each
column holds one ranking (e.g. by one individual) and each row holds the
ranks given to one observation. The algorithm counts the concordant pairs
and the discordant pairs of ranks between the two columns, and then
calculates tau, defined as

tau = (number of concordant pairs - number of discordant pairs) / (n(n-1)/2)

where n is the number of observations (rows) and n(n-1)/2 is the total
number of possible pairs of ranks.
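
As an illustration only, the definition above could be sketched in Java as
a direct O(n^2) loop over all pairs. The class name KendallTauSketch and the
method name tau are hypothetical, not an existing Commons Math API, and the
two rankings are passed as two arrays rather than as a 2D matrix.

```java
final class KendallTauSketch {

    private KendallTauSketch() { }

    /**
     * Hypothetical helper: computes
     * tau = (concordant - discordant) / (n(n-1)/2)
     * by examining every pair of observations.
     */
    static double tau(double[] x, double[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("rankings must have equal length");
        }
        int n = x.length;
        if (n < 2) {
            throw new IllegalArgumentException("at least two observations are required");
        }
        long concordant = 0;
        long discordant = 0;
        for (int i = 0; i < n - 1; i++) {
            for (int j = i + 1; j < n; j++) {
                // Positive when both rankings order the pair the same way,
                // negative when they disagree, zero when either one is tied.
                double product = Math.signum(x[i] - x[j]) * Math.signum(y[i] - y[j]);
                if (product > 0) {
                    concordant++;
                } else if (product < 0) {
                    discordant++;
                }
                // product == 0: tied pair, counted as neither.
            }
        }
        double totalPairs = (double) n * (n - 1) / 2.0;
        return (concordant - discordant) / totalPairs;
    }
}
```

For example, the rankings {1, 2, 3, 4, 5} and {3, 4, 1, 2, 5} have 6
concordant and 4 discordant pairs out of 10, so tau = (6 - 4) / 10 = 0.2.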

The method will then output a tau value between -1 and 1, where 1 signifies
"perfect" agreement between the two ranked lists, -1 signifies that one
ranking is the exact reverse of the other, and 0 signifies no association.

Where a pair is tied within either ranking, it is counted as neither
concordant nor discordant in the calculation. An optional merge sort can be
used to speed up the implementation from O(n^2) to O(n log n). Details are
on the wiki page.
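
A rough sketch of the merge sort idea, under the simplifying assumption that
neither ranking contains ties: order the observations by the first ranking,
then count the inversions in the second ranking while merge sorting it. Each
inversion is one discordant pair. The names KendallTauMergeSortSketch,
tauNoTies and countInversions are hypothetical, and handling ties would need
extra bookkeeping that is not shown here.

```java
import java.util.Arrays;

final class KendallTauMergeSortSketch {

    private KendallTauMergeSortSketch() { }

    /** Hypothetical helper: O(n log n) tau, assuming no ties in x or y. */
    static double tauNoTies(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need two rankings of equal length >= 2");
        }
        int n = x.length;
        // Order the observation indices by the first ranking.
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) {
            order[i] = i;
        }
        Arrays.sort(order, (a, b) -> Double.compare(x[a], x[b]));
        // Second ranking, rearranged into the order of the first.
        double[] yByX = new double[n];
        for (int k = 0; k < n; k++) {
            yByX[k] = y[order[k]];
        }
        // With no ties, every inversion in yByX is a discordant pair.
        long discordant = countInversions(yByX, new double[n], 0, n);
        double totalPairs = (double) n * (n - 1) / 2.0;
        // concordant = totalPairs - discordant, so C - D = totalPairs - 2D.
        return (totalPairs - 2.0 * discordant) / totalPairs;
    }

    /** Sorts a[from, to) in place and returns the number of inversions in it. */
    private static long countInversions(double[] a, double[] buffer, int from, int to) {
        if (to - from < 2) {
            return 0;
        }
        int mid = (from + to) >>> 1;
        long count = countInversions(a, buffer, from, mid)
                   + countInversions(a, buffer, mid, to);
        int i = from;
        int j = mid;
        int k = from;
        while (i < mid && j < to) {
            if (a[i] <= a[j]) {
                buffer[k++] = a[i++];
            } else {
                // a[j] is smaller than every remaining element of the left half.
                count += mid - i;
                buffer[k++] = a[j++];
            }
        }
        while (i < mid) {
            buffer[k++] = a[i++];
        }
        while (j < to) {
            buffer[k++] = a[j++];
        }
        System.arraycopy(buffer, from, a, from, to - from);
        return count;
    }
}
```

The input arrays are not modified, since the sort works on a copy of the
second ranking.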

*Some indication of why the addition / enhancement is practically useful*

Although this implementation is not particularly complex, it would be useful
to have it in a consistent format in the Commons Math package alongside the
existing correlation tests. Kendall's tau is used effectively for comparing
product rankings, search engine rankings, or measurements from engineering
equipment.

This is my first post on this list. I tried to follow the guidelines, but
let me know if I need to elaborate.

Regards
Dev
