On 6/29/14, 2:30 PM, Gilles wrote: > On Sun, 29 Jun 2014 10:25:58 -0700, Phil Steitz wrote: >> On 6/29/14, 9:48 AM, venkatesha murthy wrote: >>> On Sun, Jun 29, 2014 at 1:25 AM, Phil Steitz >>> <phil.ste...@gmail.com> wrote: >>> >>>> On 6/25/14, 12:12 AM, Luc Maisonobe wrote: >>>>> Le 25/06/2014 06:05, venkatesha murthy a écrit : >>>>>> On Wed, Jun 25, 2014 at 12:21 AM, Luc Maisonobe >>>>>> <l...@spaceroots.org> >>>> wrote: >>>>>>> Hi Venkat, >>>>>>> >>>>>>> Le 23/06/2014 21:08, venkatesha murthy a écrit : >>>>>>>> On Tue, Jun 24, 2014 at 12:08 AM, Luc Maisonobe >>>>>>>> <l...@spaceroots.org> >>>>>>> wrote: >>>>>>>>> Hi all, >>>>>>>>> >>>>>>>>> While looking further in Percentile class for MATH-1120, I >>>>>>>>> have found >>>>>>>>> another problem in the current implementation. >>>>>>>>> NaNStrategy.FIXED >>>> should >>>>>>>>> leave the NaNs in place, but at the end of the >>>>>>>>> KthSelector.select >>>>>>>>> method, a call to Arrays.sort moves the NaNs to the end of >>>>>>>>> the small >>>>>>>>> sub-array. What is really implemented here is therefore >>>>>>>>> closer to >>>>>>>>> NaNStrategy.MAXIMAL than NaNStrategy.FIXED. This always >>>>>>>>> occur in the >>>>>>>>> test cases because they use very short arrays, and we >>>>>>>>> directly >>>> switch to >>>>>>>>> this part of the select method. >>>>>>>> Are NaNs considered higher than +Inf ? >>>>>>>> If MAXIMAL represent replacing for +inf ; you need >>>>>>>> something to >>>>>>>> indicate beyond this for NaN. >>>>>>> Well, we can just keep the NaN themselves and handled them >>>>>>> appropriately, hoping not to trash performances too much. >>>>>>> >>>>>> Agreed. >>>>>> >>>>>>>> What is the test input you see an issue and what is the >>>>>>>> actual error >>>>>>>> you are seeing. Please share the test case. >>>>>>> Just look at PercentileTest.testReplaceNanInRange(). The >>>>>>> first check in >>>>>>> the test corresponds to a Percentile configuration at 50% >>>>>>> percentile, >>>>>>> and NaNStrategy.FIXED. The array has an odd number of >>>>>>> entries, so the >>>>>>> 50% percentile is exactly one of the entries: the one at >>>>>>> index 5 in the >>>>>>> final array. >>>>>>> >>>>>>> The initial ordering is { 0, 1, 2, 3, 4, NaN, NaN, 5, 7, >>>>>>> NaN, 8 }. So >>>>>>> for the NaNStrategy.FIXED setting, it should not be modified >>>>>>> at all in >>>>>>> the selection algorithm and the result for 50% should be the >>>>>>> first NaN >>>>>>> of the array, at index 5. In fact, due to the Arrays.sort, >>>>>>> we *do* >>>>>>> reorder the array into { 0, 1, 2, 3, 4, 5, 7, 8, NaN, NaN, >>>>>>> NaN }, so >>>>>>> the result is 5. >>>>>>> >>>>>>> Agreed. just verified by putting back the earlier insertionSort >>>> function. >>>>>>> If we use NaNStrategy.MAXIMAL and any quantile above 67%, we >>>>>>> get as a >>>>>>> result Double.POSITIVE_INFINITY instead of Double.NaN. >>>>>>> >>>>>>> If we agree to leave FIXED as unchanged behaviour with your >>>> insertionSort >>>>>> code; then atleast MAXIMAL/MINIMAL should be allowed for >>>>>> transformation >>>> of >>>>>> NaN to +/-Inf >>>>> I'm fine with it, as long as we clearly state it in the >>>>> NaNStrategy >>>>> documentation, saying somethig like: >>>>> >>>>> When MAXIMAL/MINIMAL are used, the NaNs are effecively >>>>> *replaced* by >>>>> +/- infinity, hence the results will be computed as if >>>>> infinities were >>>>> there in the first place. >>>> -1 - this is mixing concerns. NaNStrategy exists for one specific >>>> purpose - to turn extend the ordering on doubles a total ordering. >>>> Strategies prescribe only how NaNs are to be treated in the >>>> ordering. Side effects on the input array don't make sense in >>>> this >>>> context. The use case for which this class was created was rank >>>> transformations. Returning infinities in rank transformations >>>> would >>>> blow up computations in these classes. If a floating point >>>> computations need to transform NaNs, the specific stats / other >>>> classes that do this transformation should perform and document >>>> the >>>> behavior. >>>> >>>> Phil >>>> >>> OK. Agreed realized it later. Hence i have not touched >>> NaNStrategy in my >>> patch(MATH-1132) . Now i have added a separate transformer to >>> transform NaNs >>> but it uses NanStrategy. Please let me know on this as this >>> trasnformation >>> itself >>> can be used in different places. >> >> Honestly, I don't see the value of this. I would be more favorable >> to an addition to MathArrays supporting NaN (or infinity) filtering >> / transformation. Several years back we made the decision to just >> let NaNs propagate in computations, which basically means that if >> you feed NaNs to [math] you are going to get NaNs out in results. >> If we do get into the business of filtering NaNs (or infinities for >> that matter), I think it is probably best to do that in MathArrays >> or somesuch - i.e., don't decorate APIs with NaN or >> infinity-handling handling descriptions. That would be a huge task >> and would likely end up bleeding implementation detail into the >> APIs. As I said above, NaNStrategy has to exist for rank >> transformations to be well-defined. NaN-handling in floating point >> computations is defined and as long as we fully document what our >> algorithms do, I think it is fair enough to "let the NaNs fall where >> they may" - which basically means users need to do the filtering / >> transformation themselves. > > So, do I understand correctly that it's OK to add the "remove" and > "replace" utility methods (cf. MATH-1130) in "MathArrays"? > Or does this thread conclude that we don't have a use for this within > Commons Math?
You raise a good point here. Personally, I don't see an internal use and if we are sticking with MathArrays including only things we have internal use for, I would say no, don't include them. Phil > > Gilles > >> >> Phil >>> >>>>> [...] > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > For additional commands, e-mail: dev-h...@commons.apache.org > > --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org