> I'm inclined to say the expected data shape returned should be preserved.
Agreed, I think the array of NaNs is the most correct return value in the case described there (and org.apache.druid.query.aggregation.histogram.FixedBucketsHistogram#percentilesFloat should be adjusted to match that convention). To address the original concern in that thread, I don't see any difference between returning an empty array vs. null.

On Tue, Apr 23, 2019 at 12:42 PM Charles Allen <charles.al...@snap.com.invalid> wrote:
>
> Hi all!
>
> If you do not use approximate quantiles (or histograms or quantiles
> double sketch) then you can stop reading.
>
> https://github.com/apache/incubator-druid/issues/7486 brings up an
> issue related to how objects are returned from Druid aggregations,
> specifically when the input aggregation configuration has a complex
> configuration (like an array of input values). I'm bringing the
> discussion to the dev list so that any decisions are part of a more
> official Apache review process and not accidentally tucked away in
> a GitHub thread. Please be sure to check out the thread and
> AlexanderSaydakov's insights.
>
> > If a single quantile is requested, then the best answer must be NaN, not zero since zero is a perfectly good number and would be deeply misleading. What to do if an array of quantiles is requested?
>
> I'm inclined to say the expected data shape returned should be preserved.
>
> Let's say there's an alternate world where some other quantiles
> estimation algorithm can either converge or not converge but it
> depends on the % you requested. Like choosing the 50th percentile
> might converge and give you a value but choosing the 99.99% might not.
> In such a world it would be possible for SOME of the requested values
> to resolve but not others. In this same world, if you were to do two
> aggregators at `50%` and `99.99%`, vs one aggregator at `[50%,
> 99.99%]`, I would hope the result would be directly relatable, and
> that the array form would be one of optimization or convenience.
>
> As such, and since
> `org.apache.druid.query.aggregation.histogram.ApproximateHistogram`
> already sets a precedent for returning an array of `NaN`, I propose
> the returned value for an array of quantiles be directly translatable
> to the array-equivalent form of the result when requesting the
> quantiles singularly in different aggregations. Which in this case I
> believe would be an array of `NaN`.
>
> Thoughts?
> Charles Allen
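
To make the proposed convention concrete, here is a minimal sketch of "the array form is just the single-quantile form applied per element". It is not Druid code: the class `QuantileShapeSketch`, both method names, and the nearest-rank lookup are assumptions made purely for illustration. The only point is that an empty input yields one `NaN` per requested quantile, so the returned shape matches the request.

```java
import java.util.Arrays;

// Hypothetical illustration only; this is not Druid's actual implementation.
public class QuantileShapeSketch {

    // Single-quantile form: returns NaN (not 0) when there is no data,
    // because 0 is a valid value and would be misleading.
    public static double quantile(double[] sortedValues, double fraction) {
        if (sortedValues.length == 0) {
            return Double.NaN;
        }
        // Nearest-rank lookup, chosen for brevity rather than accuracy.
        int index = (int) Math.ceil(fraction * sortedValues.length) - 1;
        int clamped = Math.max(0, Math.min(index, sortedValues.length - 1));
        return sortedValues[clamped];
    }

    // Array form: one result per requested fraction, so the output always
    // has the same length as the request, even when every entry is NaN.
    public static double[] quantiles(double[] sortedValues, double[] fractions) {
        double[] results = new double[fractions.length];
        for (int i = 0; i < fractions.length; i++) {
            results[i] = quantile(sortedValues, fractions[i]);
        }
        return results;
    }

    public static void main(String[] args) {
        double[] noData = {};
        double[] fractions = {0.5, 0.9999};
        // Prints [NaN, NaN]: same shape as the request.
        System.out.println(Arrays.toString(quantiles(noData, fractions)));
    }
}
```

With this shape, a caller can index the result by position without first checking for null or an empty array, and two aggregators at `50%` and `99.99%` give the same values as one aggregator at `[50%, 99.99%]`.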