tdunning commented on PR #471: URL: https://github.com/apache/datasketches-cpp/pull/471#issuecomment-3736778851
Your key statement is that this never (practically speaking) happens. That means that filtering is not a breaking change (practically speaking), nor does it produce different results for any reasonable input.

Filtering out invalid inputs makes it easier to reason about the code and so it can actually improve things. Pretending that we have covered all the bases in the presence of infinite values is probably not a realistic thing to do.

You have convinced me.

On Sun, Jan 11, 2026 at 7:34 PM tison ***@***.***> wrote:

> *tisonkun* left a comment (apache/datasketches-cpp#471)
> <https://github.com/apache/datasketches-cpp/pull/471#issuecomment-3736767457>
>
> Arithmetic with infinity in IEEE floating point *can* produce NaN, but
> only when opposing values are involved. Sorting works correctly because
> infinities compare with normal numbers. Since the centroids are always
> sorted, merging will only ever combine infinities of like sign. This means
> that infinity will creep slowly like a contagion into the centroids in the
> digest. If there are lots of infinite values in a sample, we may get a
> slight over-estimate of quantiles because an entire centroid will go to
> infinity if a single infinite value is introduced, but no single centroid
> is that large in a reasonably configured digest.
>
> In general inputs, we shall never see NaN, Inf, -Inf, or at least not too
> much.
>
> We're here to discuss edge cases. So the trade-off is that, assuming the
> sketches may have +-inf but they never combine to produce NaN, or just
> filter all +-inf.
>
> If we allow +-inf and assume they will never combine due to intermediate
> finite numbers, the question is how we program against the opposing values?
> Shall we mark the sketch in a broken state, and all operations then fail?
> Or crash the program? (Since NaN cannot compare with other values; or we
> can define a total order in some way.)
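
For anyone following the thread, here is a minimal sketch of the IEEE 754 behaviour under discussion. It is not taken from the datasketches-cpp sources: `merge_means`, `keep_finite` and `check_infinity_behaviour` are hypothetical names, and the weighted-mean formula is only assumed to resemble how centroid means are combined on merge.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <iterator>
#include <limits>
#include <vector>

// Hypothetical helper (not the datasketches-cpp API): weighted mean of two
// centroids. Weights are assumed to be strictly positive.
double merge_means(double mean_a, double weight_a, double mean_b, double weight_b) {
  return (mean_a * weight_a + mean_b * weight_b) / (weight_a + weight_b);
}

// Hypothetical input filter: keep only finite values, dropping NaN, +inf and -inf.
std::vector<double> keep_finite(const std::vector<double>& values) {
  std::vector<double> out;
  std::copy_if(values.begin(), values.end(), std::back_inserter(out),
               [](double v) { return std::isfinite(v); });
  return out;
}

// Checks the facts the thread relies on.
void check_infinity_behaviour() {
  const double inf = std::numeric_limits<double>::infinity();

  // Infinities of like sign stay infinite; opposing signs produce NaN.
  assert(std::isinf(inf + inf));
  assert(std::isnan(inf + (-inf)));

  // Infinities order correctly against finite values, so sorting is well defined.
  std::vector<double> v = {3.0, inf, -inf, 0.0};
  std::sort(v.begin(), v.end());
  assert(v.front() == -inf && v.back() == inf);

  // A single infinite value drags the whole merged centroid to infinity.
  assert(std::isinf(merge_means(inf, 1.0, 5.0, 3.0)));

  // Merging centroids of opposing infinite sign yields NaN.
  assert(std::isnan(merge_means(inf, 1.0, -inf, 1.0)));
}
```

With a filter like `keep_finite` applied on the way in, neither of the last two cases can arise, which is the simplification argued for above.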
--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
