Hi Patrick,

On 11/07/2012 04:37 PM, Patrick Meyer wrote:
> I agree that it would be nice to have a constructor that allows you to
> specify the ranking algorithm only.
>
> As far as NaN and the Spearman correlation, maybe we should add a default
> strategy of NaNStrategy.FAIL so that an exception would occur if any NaN is
> encountered. R uses this treatment of missing data and forces users to
> choose how to handle it. If we implemented something like listwise or
> pairwise deletion it could be used in other classes too. As such, treatment
> of missing data should be part of a larger discussion and handled in a more
> comprehensive and systematic way.

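To make the quoted proposal concrete (fail by default, with listwise deletion as an explicit choice), here is a minimal plain-Java sketch. The names PairedData, MissingData and clean are hypothetical and are not part of Commons Math; NaNStrategy.FAIL is only a proposal at this point in the thread, so the sketch does not use the library at all.

```java
/** Hypothetical helper for illustration only; not part of Commons Math. */
final class PairedData {

    /** How missing values (NaN) are treated; FAIL mirrors the proposed default. */
    enum MissingData { FAIL, LISTWISE }

    /**
     * Returns {x', y'} with NaNs handled according to the policy.
     * LISTWISE drops index i from both arrays if either x[i] or y[i] is NaN,
     * so the two results always have the same length.
     */
    static double[][] clean(double[] x, double[] y, MissingData policy) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("array lengths differ");
        }
        boolean[] keep = new boolean[x.length];
        int n = 0;
        for (int i = 0; i < x.length; i++) {
            keep[i] = !Double.isNaN(x[i]) && !Double.isNaN(y[i]);
            if (keep[i]) {
                n++;
            } else if (policy == MissingData.FAIL) {
                throw new IllegalArgumentException("NaN at index " + i);
            }
        }
        double[] cx = new double[n];
        double[] cy = new double[n];
        for (int i = 0, k = 0; i < x.length; i++) {
            if (keep[i]) {
                cx[k] = x[i];
                cy[k] = y[i];
                k++;
            }
        }
        return new double[][] {cx, cy};
    }

    public static void main(String[] args) {
        double[] column1 = {Double.NaN, 1, 2};
        double[] column2 = {10, 2, 10};
        double[][] cleaned = clean(column1, column2, MissingData.LISTWISE);
        System.out.println(java.util.Arrays.toString(cleaned[0])); // [1.0, 2.0]
        System.out.println(java.util.Arrays.toString(cleaned[1])); // [2.0, 10.0]
    }
}
```

With MissingData.FAIL the same call throws an IllegalArgumentException at index 0, which is the "force the user to choose" behaviour described above.
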
I think this additional option makes sense, but I am forwarding this
discussion to the dev mailing list, where it is better suited.

Thomas

> -----Original Message-----
> From: Thomas Neidhart [mailto:thomas.neidh...@gmail.com]
> Sent: Wednesday, November 07, 2012 8:09 AM
> To: u...@commons.apache.org
> Subject: Re: [math] correlation analysis with NaNs
>
> On 11/07/2012 01:38 PM, Patrick Meyer wrote:
>> You are getting values like 2.5 because of the default ties strategy.
>> If you do not want to use that method, create an instance of
>> RankingAlgorithm with a different ties strategy and pass it to the
>> constructor for the SpearmansCorrelation. This approach also gives you
>> control over the method for dealing with NaNs. Something like,
>>
>> //create data matrix (rows = observations, columns = variables)
>> double[] column1 = new double[]{Double.NaN, 1, 2};
>> double[] column2 = new double[]{10, 2, 10};
>> Array2DRowRealMatrix mydata =
>>     new Array2DRowRealMatrix(column1.length, 2);
>> for (int i = 0; i < column1.length; i++) {
>>     mydata.setEntry(i, 0, column1[i]);
>>     mydata.setEntry(i, 1, column2[i]);
>> }
>>
>> //compute correlation
>> NaturalRanking ranking = new NaturalRanking(NaNStrategy.FIXED,
>>     TiesStrategy.RANDOM);
>> SpearmansCorrelation spearman =
>>     new SpearmansCorrelation(mydata, ranking);
>>
>> Try that.
>
> Hi,
>
> this will not really help imho.
>
> As far as I can see, there are at least two problems with the current use of
> the RankingAlgorithm in the SpearmansCorrelation class:
>
>  * there is no way to select the ranking algorithm in the constructor
>    without passing the values at the same time
>  * the NaNStrategy.REMOVED does not work symmetrically, i.e. it removes
>    the NaN only from the input array where it occurs but not in the
>    corresponding array, thus rendering it useless as it will result in
>    exceptions (array lengths differ)
>
> Would you be able to create an issue for this on the issue tracker and
> provide the test case?
>
> Thanks,
>
> Thomas
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@commons.apache.org
> For additional commands, e-mail: user-h...@commons.apache.org

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org
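
For reference, the two points raised in the thread (average ranks such as 2.5 for ties, and NaN removal that has to be applied to both arrays at once) can be sketched in plain Java as follows. This is only an illustration under stated assumptions: SpearmanSketch and its methods are hypothetical names, it does not use or reproduce the Commons Math implementation, and it always uses average ranks for ties.

```java
import java.util.Arrays;

/** Hypothetical sketch for illustration only; not the Commons Math implementation. */
final class SpearmanSketch {

    /** Spearman's rho over the pairs in which neither value is NaN. */
    static double spearman(double[] x, double[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("array lengths differ");
        }
        // Symmetric removal: keep index i only if both x[i] and y[i] are present,
        // so the two filtered arrays cannot end up with different lengths.
        int n = 0;
        for (int i = 0; i < x.length; i++) {
            if (!Double.isNaN(x[i]) && !Double.isNaN(y[i])) n++;
        }
        double[] xs = new double[n];
        double[] ys = new double[n];
        for (int i = 0, k = 0; i < x.length; i++) {
            if (!Double.isNaN(x[i]) && !Double.isNaN(y[i])) {
                xs[k] = x[i];
                ys[k] = y[i];
                k++;
            }
        }
        return pearson(averageRanks(xs), averageRanks(ys));
    }

    /** 1-based ranks; tied values share the mean of the positions they occupy. */
    static double[] averageRanks(double[] v) {
        Integer[] order = new Integer[v.length];
        for (int i = 0; i < order.length; i++) order[i] = i;
        Arrays.sort(order, (a, b) -> Double.compare(v[a], v[b]));
        double[] ranks = new double[v.length];
        int i = 0;
        while (i < order.length) {
            int j = i;
            while (j + 1 < order.length && v[order[j + 1]] == v[order[i]]) j++;
            double avg = (i + j) / 2.0 + 1.0; // mean of positions i+1 .. j+1
            for (int k = i; k <= j; k++) ranks[order[k]] = avg;
            i = j + 1;
        }
        return ranks;
    }

    /** Pearson correlation of two equal-length arrays. */
    static double pearson(double[] a, double[] b) {
        int n = a.length;
        double ma = 0, mb = 0;
        for (int i = 0; i < n; i++) {
            ma += a[i];
            mb += b[i];
        }
        ma /= n;
        mb /= n;
        double sab = 0, saa = 0, sbb = 0;
        for (int i = 0; i < n; i++) {
            double da = a[i] - ma;
            double db = b[i] - mb;
            sab += da * db;
            saa += da * da;
            sbb += db * db;
        }
        return sab / Math.sqrt(saa * sbb);
    }
}
```

For example, averageRanks(new double[]{1, 2, 2, 3}) yields ranks 1, 2.5, 2.5, 4: the two tied values occupy positions 2 and 3 and share their mean. Removing NaNs from each array independently would instead leave arrays of different lengths whenever only one side of a pair is missing.
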