It is possible, and we used to support serialization and deserialization of updatable Theta sketches. At some point we decided that it was more confusing than useful and might encourage anti-patterns in big systems (such as a deserialize-update-serialize sequence on every update). So we removed this functionality from the C++ code, but not from Java (yet).

Again, I would suggest treating serialization as finalizing a sketch. If you want to keep updating, create a fresh sketch for the new time frame or whatever classifier makes sense (batch, session, transaction). Hopefully this new sketch can be kept for updating for a while (until some close-of-books for a period of time, or until the whole batch is processed, or something like that).

Finalized sketches can be easily merged as needed. Say you create a new sketch every minute and serialize the previous one. Later your report can show the last 60-minute rolling window, or a calendar day, or something like that, by aggregating the appropriate set of sketches for that report.
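To make the per-minute pattern concrete, here is a rough sketch in Python. It deliberately does not use the DataSketches API: `ToySketch`, `hash64` and `union` are hypothetical names for a toy K-minimum-values structure that only stands in for a Theta sketch, and the user IDs and the `store` dictionary are made up for illustration. With the real library the same shape would apply, using a compact sketch for the serialized form and the union operation for the merge.

```python
import hashlib
import struct

MAX_HASH = 2**64  # hash values are treated as points in [0, 2**64)


def hash64(item: str) -> int:
    """Hypothetical stand-in hash: first 8 bytes of SHA-256 as an integer."""
    return int.from_bytes(hashlib.sha256(item.encode("utf-8")).digest()[:8], "big")


class ToySketch:
    """Toy K-minimum-values sketch. NOT the DataSketches API."""

    def __init__(self, k=1024, hashes=()):
        self.k = k
        self.hashes = set(hashes)
        self._trim()

    def _trim(self):
        # Retain only the k smallest hash values seen so far.
        if len(self.hashes) > self.k:
            self.hashes = set(sorted(self.hashes)[: self.k])

    def update(self, item: str):
        self.hashes.add(hash64(item))
        self._trim()

    def estimate(self) -> float:
        if len(self.hashes) < self.k:
            return float(len(self.hashes))  # below capacity: exact count
        theta = max(self.hashes) / MAX_HASH  # k-th smallest hash as a fraction
        return (self.k - 1) / theta

    def serialize(self) -> bytes:
        # Layout (assumed for this toy only): k, count, then the sorted hashes.
        hs = sorted(self.hashes)
        return struct.pack(f">II{len(hs)}Q", self.k, len(hs), *hs)

    @classmethod
    def deserialize(cls, blob: bytes) -> "ToySketch":
        k, n = struct.unpack_from(">II", blob, 0)
        return cls(k, struct.unpack_from(f">{n}Q", blob, 8))


def union(sketches, k=1024):
    """Merge sketches that were all built with the same k into a new sketch."""
    merged = ToySketch(k)
    for s in sketches:
        merged.hashes |= s.hashes
    merged._trim()
    return merged


# One fresh sketch per minute; each is serialized (finalized) when the minute closes.
store = {}
for minute in range(120):
    sk = ToySketch()
    for user in range(minute * 50, minute * 50 + 100):  # users overlap across minutes
        sk.update(f"user-{user}")
    store[minute] = sk.serialize()

# Report time: merge the finalized sketches for the last 60 minutes.
window = union(ToySketch.deserialize(store[m]) for m in range(60, 120))
print(round(window.estimate()))
```

Note that nothing is ever deserialized in order to be updated again: each stored sketch is written once, and the rolling window is rebuilt from the stored pieces whenever a report needs it.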
On Wed, Aug 25, 2021 at 1:20 PM Karl Matthias <k...@community.com> wrote:

> Thanks for the reply. Yes I could do time series sketches, but what I want
> actually is a summary representation of the current set, which I update
> over time and eventually replace entirely. It's an evented system and I
> want to use Theta sketches as a sort of summary. I can rebuild them
> entirely at any time, but if maintained live they would be a fast
> approximation that is combinable with other Theta sketches. Ideally I would
> not have to keep them all in memory to do that and could serialize and
> deserialize at will.
>
> It sounds like it's not currently implemented. But if I can manage the
> code to do it, it is possible?
>
> On Wed, Aug 25, 2021 at 8:09 PM Alexander Saydakov <sayda...@verizonmedia.com> wrote:
>
>> Is there a good reason to necessarily update the same sketch you decided
>> to serialize?
>> I would suggest considering that sketch finalized. Perhaps, in your
>> system these sketches would represent different time periods or different
>> categories or something like that. Later on you may want to merge (union)
>> some of them to obtain an estimate for a longer time frame or a total
>> across categories and so on.
>>
>> On Wed, Aug 25, 2021 at 11:14 AM Karl Matthias <k...@community.com> wrote:
>>
>>> Hey folks,
>>>
>>> I am working with both the Java library and the C++ library and the
>>> Theta sketch.
>>>
>>> What I would like to do is update a sketch, save it somewhere (i.e.
>>> disk, etc), then reload it later and possibly update it then. The
>>> CompactSketch doesn't support updates when an UpdateSketch is serialized
>>> and loaded, it is read-only.
>>>
>>> From looking at the Java code it seems like it would be possible to
>>> create an UpdateSketch from the contents of a CompactSketch but there
>>> doesn't appear to be an existing method that does this. Am I missing
>>> something that already does this? Or is it not possible?
>>>
>>> Many thanks
>>> Karl