tdunning commented on PR #471:
URL: https://github.com/apache/datasketches-cpp/pull/471#issuecomment-3736732713

   Adding infinities in IEEE floating point *can* produce NaN, but only when
   the operands have opposite signs. Sorting works correctly because
   infinities compare consistently with finite numbers.
   
   Since the centroids are always sorted, merging will only ever combine
   infinities of like sign. This means that infinity will creep slowly like a
   contagion into the centroids in the digest. If there are lots of infinite
   values in a sample, we may get a slight over-estimate of quantiles because
   an entire centroid will go to infinity if a single infinite value is
   introduced, but no single centroid is that large in a reasonably configured
   digest.
   
   ```
   julia> 3 + +Inf
   Inf
   
   julia> +Inf + +Inf
   Inf
   
   julia> +Inf + -Inf
   NaN
   
   julia> -Inf + -Inf
   -Inf
   
   julia> 3 + -Inf
   -Inf
   ```
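
As a rough C++ sketch of the same arithmetic applied to centroid merging: the `Centroid` struct and both merge helpers below are hypothetical names made up for illustration, not the datasketches-cpp API. The first helper assumes the merged mean is computed as a weighted sum divided by the total weight; the second assumes the incremental-update form, which behaves differently on like-signed infinities.

```cpp
#include <cmath>
#include <limits>

// Hypothetical centroid type for illustration only; not the library's type.
struct Centroid {
  double mean;
  double weight;
};

// Assumed form 1: weighted sum divided by total weight.
// Like-signed infinities stay infinite; opposite signs give NaN.
Centroid merge_weighted_sum(const Centroid& a, const Centroid& b) {
  const double total = a.weight + b.weight;
  return {(a.mean * a.weight + b.mean * b.weight) / total, total};
}

// Assumed form 2: incremental update of the running mean.
// Here Inf merged with Inf evaluates Inf - Inf, which is NaN.
Centroid merge_incremental(const Centroid& a, const Centroid& b) {
  const double total = a.weight + b.weight;
  return {a.mean + (b.mean - a.mean) * b.weight / total, total};
}
```

With `inf = std::numeric_limits<double>::infinity()`, `merge_weighted_sum({3.0, 2.0}, {inf, 1.0})` and `merge_weighted_sum({inf, 1.0}, {inf, 4.0})` both yield an infinite mean, while `merge_weighted_sum({-inf, 1.0}, {inf, 1.0})` yields NaN, matching the Julia session above. `merge_incremental({inf, 1.0}, {inf, 4.0})` yields NaN, so whether like-signed infinities survive a merge depends on which formula the implementation uses.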
   
   On Sun, Jan 11, 2026 at 6:56 PM tison ***@***.***> wrote:
   
   > *tisonkun* left a comment (apache/datasketches-cpp#471)
   > <https://github.com/apache/datasketches-cpp/pull/471#issuecomment-3736718766>
   >
   > Allowing +Inf and -Inf in the TDigest structure would cause internal NaN
   > values during compression or merging and potentially break internal
   > assumptions.
   >
   > See apache/datasketches-java#702
   > <https://github.com/apache/datasketches-java/issues/702> and
   > apache/datasketches-rust#23 (comment)
   > <https://github.com/apache/datasketches-rust/pull/23#discussion_r2622754228>.
   >
   > > This is the behavior that any function like sum should follow. This
   > > includes mean and, in a more nuanced way, t-digest. What is the median of
   > > [0, 0, 1, +∞, +∞]? I think it should be 1, not 0. Julia agrees:
   > >
   > > julia> median([0, 0, 1, +Inf, +Inf])
   > > 1.0
   > >
   > > If the infinities are ignored, we get the wrong answer.
   >
   > I agree with this to some extent. But since we're free to merge/compress
   > t-digest sketches, this may cause the issues described above.
   >
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

