It is possible: we used to support serialization and deserialization of
updatable Theta sketches. At some point we decided that this was more
confusing than useful and might encourage anti-patterns in big systems
(such as a deserialize-update-serialize sequence on every update). So we
removed this functionality from the C++ code, but not from Java (yet).
Again, I would suggest treating serialization as finalizing a sketch. If
you want to keep updating, create a fresh sketch for the new time frame or
whatever unit makes sense (batch, session, transaction). Ideally this new
sketch stays in memory and keeps receiving updates for a while (until the
close of books for a period, or until the whole batch is processed).
Finalized sketches can be easily merged as needed. Say you create a new
sketch every minute and serialize the previous one. Later, a report can
show the last 60-minute rolling window or a calendar day by merging the
appropriate set of sketches.
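
To make the pattern concrete, here is a rough, self-contained illustration.
It deliberately does not use the DataSketches API: ToySketch is a
hypothetical KMV-style stand-in that keeps the k smallest hashes, and the
hash function, the byte layout, the user ids and k = 1024 are all
assumptions made up for this example. The point is only the shape of the
workflow: update in memory, serialize once to finalize, merge the
finalized images later.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Toy KMV-style sketch: a stand-in for a Theta sketch, NOT the DataSketches API. */
final class ToySketch {
    private final int k;                                   // max number of retained hashes
    private final TreeSet<Long> hashes = new TreeSet<>();  // the k smallest hashes seen so far

    ToySketch(int k) { this.k = k; }

    /** Hypothetical 63-bit hash (FNV-1a plus a mixing step); any good 64-bit hash would do. */
    static long hash(String item) {
        long h = 0xcbf29ce484222325L;
        for (byte b : item.getBytes(StandardCharsets.UTF_8)) {
            h ^= (b & 0xff);
            h *= 0x100000001b3L;
        }
        h ^= h >>> 33; h *= 0xff51afd7ed558ccdL;
        h ^= h >>> 33; h *= 0xc4ceb9fe1a85ec53L;
        h ^= h >>> 33;
        return h & Long.MAX_VALUE;                         // clear the sign bit
    }

    void update(String item) { insert(hash(item)); }

    private void insert(long h) {
        hashes.add(h);
        if (hashes.size() > k) hashes.pollLast();          // drop the largest hash
    }

    double estimate() {
        if (hashes.size() < k) return hashes.size();       // fewer than k hashes: exact count
        double theta = hashes.last() / (double) Long.MAX_VALUE;
        return (k - 1) / theta;                            // classic KMV estimator
    }

    /** Finalize: after this, the bytes are only ever merged, never updated. */
    byte[] serialize() {
        ByteBuffer buf = ByteBuffer.allocate(4 + 8 * hashes.size());
        buf.putInt(hashes.size());
        for (long h : hashes) buf.putLong(h);
        return buf.array();
    }

    /** Union of finalized images; assumes every image was built with the same k. */
    static ToySketch merge(int k, List<byte[]> images) {
        ToySketch result = new ToySketch(k);
        for (byte[] image : images) {
            ByteBuffer buf = ByteBuffer.wrap(image);
            int n = buf.getInt();
            for (int i = 0; i < n; i++) result.insert(buf.getLong());
        }
        return result;
    }
}

final class RollingWindowDemo {
    /** Builds one minute's sketch from made-up user ids, then finalizes it. */
    static byte[] sketchForMinute(int minute) {
        ToySketch s = new ToySketch(1024);
        for (int u = minute * 50; u < minute * 50 + 100; u++) s.update("user-" + u);
        return s.serialize();
    }

    public static void main(String[] args) {
        List<byte[]> stored = new ArrayList<>();           // e.g. one blob per minute on disk
        for (int minute = 0; minute < 60; minute++) stored.add(sketchForMinute(minute));
        ToySketch lastHour = ToySketch.merge(1024, stored);
        System.out.println("distinct users, last 60 min: ~" + lastHour.estimate());
    }
}
```

With the real library the same roles would be played by an updatable
sketch for the current minute, its compact serialized form for storage,
and a union over the stored sketches for the report.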


On Wed, Aug 25, 2021 at 1:20 PM Karl Matthias <k...@community.com> wrote:

> Thanks for the reply. Yes I could do time series sketches, but what I want
> actually is a summary representation of the current set, which I update
> over time and eventually replace entirely. It's an evented system and I
> want to use Theta sketches as a sort of summary. I can rebuild them
> entirely at any time, but if maintained live they would be a fast
> approximation that is combinable with other Theta sketches. Ideally I would
> not have to keep them all in memory to do that and could serialize and
> deserialize at will.
>
> It sounds like it's not currently implemented. But if I can manage the
> code to do it, is it possible?
>
> On Wed, Aug 25, 2021 at 8:09 PM Alexander Saydakov <
> sayda...@verizonmedia.com> wrote:
>
>> Is there a good reason to necessarily update the same sketch you decided
>> to serialize?
>> I would suggest considering that sketch finalized. Perhaps, in your
>> system these sketches would represent different time periods or different
>> categories or something like that. Later on you may want to merge (union)
>> some of them to obtain an estimate for a longer time frame or a total
>> across categories and so on.
>>
>> On Wed, Aug 25, 2021 at 11:14 AM Karl Matthias <k...@community.com>
>> wrote:
>>
>>> Hey folks,
>>>
>>> I am working with both the Java library and the C++ library and the
>>> Theta sketch.
>>>
>>> What I would like to do is update a sketch, save it somewhere (e.g.
>>> disk), then reload it later and possibly update it then. When an
>>> UpdateSketch is serialized and loaded, the resulting CompactSketch is
>>> read-only and doesn't support updates.
>>>
>>> From looking at the Java code it seems like it would be possible to
>>> create an UpdateSketch from the contents of a CompactSketch but there
>>> doesn't appear to be an existing method that does this. Am I missing
>>> something that already does this? Or is it not possible?
>>>
>>> Many thanks
>>> Karl
>>>
>>>
