I wonder whether comparing the size of a full checkpoint with the total size 
of an incremental checkpoint is helpful at all for getting insights about the 
key count, because:

- Full checkpoints are basically a dump of all key-value pairs as written by 
their serializers, plus their key-group ID. Each key is contained exactly once. 
They are uncompressed, unless you enable compression explicitly.
- Incremental checkpoints use RocksDB’s internal SSTable format, can contain 
additional metadata, and they are always compressed (typically with Snappy). 
- Adding up the sizes of all increments can account for each key multiple 
times, because each delta might contain an update per key. Furthermore, deltas 
also contain entries for deleted keys, because a delta has to record a 
deletion explicitly so that it overrides the entry in older files.
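
The last point can be illustrated with a toy model. This is a minimal sketch 
in plain Python, not Flink or RocksDB code: the function name, the use of 
`None` as a delete marker, and the sample writes are all made up for 
illustration, and it ignores compaction, compression and metadata entirely.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical stand-in for a delete marker written into a delta.
TOMBSTONE = None

Write = Tuple[str, Optional[int]]


def run_intervals(intervals: List[List[Write]]):
    """Apply the writes of each checkpoint interval.

    Returns the live state (what a full checkpoint would dump) and one
    delta per interval (what each increment would have to record).
    """
    state: Dict[str, int] = {}
    deltas: List[Dict[str, Optional[int]]] = []
    for writes in intervals:
        delta: Dict[str, Optional[int]] = {}
        for key, value in writes:
            if value is TOMBSTONE:
                state.pop(key, None)   # key disappears from the live state
            else:
                state[key] = value
            delta[key] = value         # the delta still records the key
        deltas.append(delta)
    return state, deltas


intervals = [
    [("a", 1), ("b", 1), ("c", 1)],    # interval 1: three new keys
    [("a", 2), ("b", 2)],              # interval 2: two updates
    [("a", 3), ("c", TOMBSTONE)],      # interval 3: one update, one delete
]

state, deltas = run_intervals(intervals)
full_entries = len(state)                      # each live key exactly once
delta_entries = sum(len(d) for d in deltas)    # keys counted once per delta

print("entries in full checkpoint:", full_entries)
print("entries summed over deltas:", delta_entries)
```

In this model two keys are alive at the end, while the deltas together hold 
seven entries, one of which is a delete marker. So even before compression 
and format differences come into play, the two numbers do not measure the 
same thing.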

Best,
Stefan 


> On 20.04.2018 at 11:36, Juho Autio <juho.au...@rovio.com> wrote:
> 
> Hi Aljoscha & co.,
> 
> Is there any way to monitor the state size yet? Maybe a ticket in Jira?
> 
> When using incremental checkpointing, the total state size can't be seen 
> anywhere. For example the checkpoint details only show the size of the 
> increment. It would be nice to add the total size there as well. The only way 
> I know currently for figuring out the total state size is by triggering a 
> manual savepoint. But this doesn't work at all if the state has grown so big 
> that the savepoint times out.
> 
> Also when Flink restores state from an incrementally created checkpoint, it 
> doesn't offer a way to see the total size.
> 
> On Thu, Jan 4, 2018 at 8:23 PM, Steven Wu <stevenz...@gmail.com 
> <mailto:stevenz...@gmail.com>> wrote:
> Aljoscha/Stefan, 
> 
> if incremental checkpoint is enabled, I assume the "checkpoint size" is only 
> the delta/incremental size (not the full state size), right?
> 
> Thanks,
> Steven
> 
> 
> On Thu, Jan 4, 2018 at 5:18 AM, Aljoscha Krettek <aljos...@apache.org 
> <mailto:aljos...@apache.org>> wrote:
> Hi,
> 
> I'm afraid there are currently no metrics around state. I see that they 
> would be very good to have, so I'm putting them on my list of stuff that we 
> should have at some point.
> 
> One thing that comes to mind is checking the size of checkpoints, which gives 
> you an indirect way of figuring out how big state is but that's not very 
> exact, i.e. doesn't give you "number of keys" or some such.
> 
> Best,
> Aljoscha
> 
> > On 20. Dec 2017, at 08:09, Netzer, Liron <liron.net...@citi.com 
> > <mailto:liron.net...@citi.com>> wrote:
> >
> > Ufuk, thanks for replying!
> >
> > Aljoscha, can you please assist with the questions below?
> >
> > Thanks,
> > Liron
> >
> > -----Original Message-----
> > From: Ufuk Celebi [mailto:u...@apache.org <mailto:u...@apache.org>]
> > Sent: Friday, December 15, 2017 3:06 PM
> > To: Netzer, Liron [ICG-IT]
> > Cc: user@flink.apache.org <mailto:user@flink.apache.org>
> > Subject: Re: Flink State monitoring
> >
> > Hey Liron,
> >
> > unfortunately, there are no built-in metrics related to state. In general, 
> > exposing the actual values as metrics is problematic, but exposing summary 
> > statistics would be a good idea. I'm not aware of a good workaround at the 
> > moment that would work in the general case (taking into account state 
> > restore, etc.).
> >
> > Let me pull in Aljoscha (cc'd) who knows the state backend internals well.
> >
> > @Aljoscha:
> > 1) Are there any plans to expose keyed state related metrics (like number 
> > of keys)?
> > 2) Is there a way to work around the lack of these metrics in 1.3?
> >
> > – Ufuk
> >
> > On Thu, Dec 14, 2017 at 10:55 AM, Netzer, Liron <liron.net...@citi.com 
> > <mailto:liron.net...@citi.com>> wrote:
> >> Hi group,
> >>
> >>
> >>
> >> We are using Flink keyed state in several operators.
> >>
> >> Is there an easy way to expose the data that is stored in the state, i.e.
> >> the key and the values?
> >>
> >> This is needed for both monitoring as well as debugging. We would like
> >> to understand how many key+values are stored in each state and also to
> >> view the data itself.
> >>
> >> I know that there is the "Queryable state" option, but this is still
> >> in Beta, and doesn't really give us what we want easily.
> >>
> >>
> >>
> >>
> >>
> >> *We are using Flink 1.3.2 with Java.
> >>
> >>
> >>
> >> Thanks,
> >>
> >> Liron
> 
> 
> 
