Copilot commented on code in PR #136:
URL: https://github.com/apache/datasketches-rust/pull/136#discussion_r3332489356
##########
datasketches/src/hll/sketch.rs:
##########
@@ -423,6 +423,19 @@ impl HllSketch {
             Mode::Array8(arr) => arr.serialize(self.lg_config_k),
         }
     }
+
+    /// Returns the estimated size of the sketch in bytes
+    pub fn estimated_size(&self) -> usize {
+        let heap_size = match &self.mode {
+            Mode::List { list, .. } => list.container().estimated_size(),
+            Mode::Set { set, .. } => set.container().estimated_size(),
+            Mode::Array4(arr) => arr.estimated_size(),
+            Mode::Array6(arr) => arr.estimated_size(),
+            Mode::Array8(arr) => arr.estimated_size(),
+        };
+
+        std::mem::size_of::<Self>() + heap_size
+    }
Review Comment:
The PR description and the linked issue mention exposing a `size()` API for
memory footprint, but this change adds `estimated_size()` instead. Consider
adding `size()` (and optionally keeping `estimated_size()` as a compatibility
wrapper) so callers get the intended, stable API surface.
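
A minimal sketch of the suggested API shape, assuming a hypothetical
`DemoSketch` type with a single `Vec<u8>` field (not the real `HllSketch`
internals): `size()` is the primary method and `estimated_size()` simply
forwards to it.

```rust
// Hypothetical stand-in for the real sketch type; only the API shape matters.
pub struct DemoSketch {
    registers: Vec<u8>,
}

impl DemoSketch {
    pub fn new(len: usize) -> Self {
        Self {
            registers: vec![0; len],
        }
    }

    /// Primary API: approximate memory footprint in bytes
    /// (inline struct size plus reserved heap bytes).
    pub fn size(&self) -> usize {
        std::mem::size_of::<Self>()
            + self.registers.capacity() * std::mem::size_of::<u8>()
    }

    /// Compatibility wrapper that forwards to `size()`.
    pub fn estimated_size(&self) -> usize {
        self.size()
    }
}
```

With this shape both names return the same value, so existing callers of
`estimated_size()` keep working while new code can use `size()`.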
##########
datasketches/src/cpc/sketch.rs:
##########
@@ -450,6 +450,18 @@ impl CpcSketch {
         matrix
     }
+
+    /// Returns the estimated size of the sketch in bytes
+    pub fn estimated_size(&self) -> usize {
+        let heap_size = self.sliding_window.capacity()
+            + self
+                .surprising_value_table
+                .as_ref()
+                .map(|t| t.estimated_size())
+                .unwrap_or(0);
+
+        std::mem::size_of::<Self>() + heap_size
+    }
Review Comment:
The PR description and the linked issue mention a `size()` method, but this
change adds `estimated_size()` instead. Consider exposing `size()` as the
primary API (keeping `estimated_size()` as a wrapper if desired). Also,
`Vec::capacity()` counts elements, not bytes; multiplying by `size_of::<u8>()`
makes the byte accounting explicit and keeps it correct if the element type
ever changes.
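
To illustrate the byte-accounting point, here is a small sketch using a
hypothetical helper named `heap_bytes` (not part of the PR). It multiplies the
capacity by the element size, so the result stays in bytes for any element
type.

```rust
/// Bytes reserved on the heap by a `Vec`: capacity (in elements) times the
/// size of one element. `heap_bytes` is a hypothetical helper for illustration.
fn heap_bytes<T>(v: &Vec<T>) -> usize {
    v.capacity() * std::mem::size_of::<T>()
}
```

For a `Vec<u8>` the multiplication is by 1 and changes nothing, but for a
`Vec<u64>` with the same capacity the helper reports eight times as many bytes,
which a bare `capacity()` call would miss.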
##########
datasketches/src/hll/array4.rs:
##########
@@ -425,6 +425,16 @@ impl Array4 {
         bytes.into_bytes()
     }
+
+    /// Returns the estimated size of the heap allocations in bytes
+    pub fn estimated_size(&self) -> usize {
+        self.bytes.len()
+            + self
+                .aux_map
+                .as_ref()
+                .map(|a| a.estimated_size())
+                .unwrap_or(0)
+    }
Review Comment:
`Array4::estimated_size()` introduces new behavior (including optional
aux-map accounting), but none of the existing unit tests in this file cover
this size calculation. Adding a small test that constructs an `Array4` both
with and without an aux map and asserts that the returned size matches
`bytes.len()` (plus the aux-map allocation when present) would help prevent
regressions.
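
A rough sketch of what such a test could look like. `DemoArray4` and
`DemoAuxMap` are hypothetical stand-ins: the real `Array4` and aux-map types
have different fields and constructors, so only the size-accounting shape from
the diff is mirrored here.

```rust
// Hypothetical stand-in for the aux map; the real type's layout is not shown
// in the diff, so a plain Vec<u32> is assumed here.
struct DemoAuxMap {
    entries: Vec<u32>,
}

impl DemoAuxMap {
    fn estimated_size(&self) -> usize {
        self.entries.capacity() * std::mem::size_of::<u32>()
    }
}

// Hypothetical stand-in mirroring the two fields used in the diff.
struct DemoArray4 {
    bytes: Vec<u8>,
    aux_map: Option<DemoAuxMap>,
}

impl DemoArray4 {
    fn estimated_size(&self) -> usize {
        self.bytes.len() + self.aux_map.as_ref().map_or(0, |a| a.estimated_size())
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn size_without_aux_map_is_bytes_len() {
        let arr = DemoArray4 { bytes: vec![0; 32], aux_map: None };
        assert_eq!(arr.estimated_size(), arr.bytes.len());
    }

    #[test]
    fn size_with_aux_map_adds_aux_allocation() {
        let aux = DemoAuxMap { entries: Vec::with_capacity(4) };
        let aux_size = aux.estimated_size();
        let arr = DemoArray4 { bytes: vec![0; 32], aux_map: Some(aux) };
        assert_eq!(arr.estimated_size(), arr.bytes.len() + aux_size);
    }
}
```

In the real crate the two cases would be built through whatever constructors
and update paths `Array4` already exposes in its existing tests.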
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]