> I'm inclined to say the expected data shape returned should be preserved.

Agreed, I think the array of NaNs is the most correct return value in the
case described there (and
org.apache.druid.query.aggregation.histogram.FixedBucketsHistogram#percentilesFloat
should be adjusted to match that convention).
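
To make the convention concrete, here is a rough sketch of what I have in
mind. It is not Druid code: the class QuantileShapeSketch and its methods are
hypothetical, and the nearest-rank estimate is only a stand-in for whatever
the real aggregator computes. The point is just that the array form is the
single-quantile form applied element by element, so an empty input gives one
NaN per requested probability.

```java
import java.util.Arrays;

// Hypothetical illustration only; none of these names exist in Druid.
final class QuantileShapeSketch {

    // Single-quantile form: NaN when there is no data to estimate from.
    // Uses a simple nearest-rank estimate over already-sorted values.
    static double quantile(double[] sortedValues, double probability) {
        if (sortedValues.length == 0) {
            return Double.NaN;
        }
        int rank = (int) Math.ceil(probability * sortedValues.length);
        int index = Math.min(Math.max(rank - 1, 0), sortedValues.length - 1);
        return sortedValues[index];
    }

    // Array form: same length as the request, each slot equal to what the
    // single-quantile form would have returned for that probability.
    static double[] quantiles(double[] sortedValues, double[] probabilities) {
        double[] result = new double[probabilities.length];
        for (int i = 0; i < probabilities.length; i++) {
            result[i] = quantile(sortedValues, probabilities[i]);
        }
        return result;
    }

    public static void main(String[] args) {
        double[] probabilities = {0.5, 0.9999};
        // Empty input: prints [NaN, NaN], not [] and not null.
        System.out.println(Arrays.toString(quantiles(new double[0], probabilities)));
        // Non-empty input: one estimate per requested probability.
        System.out.println(Arrays.toString(quantiles(new double[] {1, 2, 3, 4}, probabilities)));
    }
}
```

With this shape a caller can always index result[i] for the i-th requested
probability and check Double.isNaN on it, without a separate null or
empty-array branch.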

I don't see any difference between returning an empty array and returning
null when it comes to addressing the original concern in that thread.

On Tue, Apr 23, 2019 at 12:42 PM Charles Allen
<charles.al...@snap.com.invalid> wrote:
>
> Hi all!
>
> If you do not use approximate quantiles (or histograms or quantiles
> double sketch) then you can stop reading.
>
> https://github.com/apache/incubator-druid/issues/7486 brings up an
> issue related to how objects are returned from Druid aggregations,
> specifically when the input aggregation configuration has a complex
> configuration (like an array of input values). I'm bringing the
> discussion to the dev list so that any decisions are part of a more
> official Apache review process and not accidentally tucked away in
> a GitHub thread. Please be sure to check out the thread and
> AlexanderSaydakov's insights.
>
> > If a single quantile is requested, then the best answer must be NaN,
> > not zero since zero is a perfectly good number and would be deeply
> > misleading. What to do if an array of quantiles is requested?
>
> I'm inclined to say the expected data shape returned should be preserved.
>
> Let's say there's an alternate world where some other quantiles
> estimation algorithm can either converge or not converge but it
> depends on the % you requested. Like choosing the 50th percentile
> might converge and give you a value but choosing the 99.99% might not.
> In such a world it would be possible for SOME of the requested values
> to resolve but not others. In this same world, if you were to do two
> aggregators at `50%` and `99.99%`, vs one aggregator at `[50%,
> 99.99%]`, I would hope the result would be directly relatable, and
> that the array form would be one of optimization or convenience.
>
> As such, and since
> `org.apache.druid.query.aggregation.histogram.ApproximateHistogram`
> already sets a precedent for returning an array of `NaN`, I propose
> the returned value for an array of quantiles be directly translatable
> to the array-equivalent form of the result when requesting the
> quantiles singularly in different aggregations. Which in this case I
> believe would be an array of `NaN`.
>
> Thoughts?
> Charles Allen