Neither those are metrics metrics on a ValueState<IntervalList>, which is updated at least once every call to process. The metric is the the number of these ValueState<IntervalList>s scoped to a key ( am using session windows ).
On Mon, Mar 15, 2021 at 11:29 PM Yun Tang <myas...@live.com> wrote: > Hi, > > Could you describe what you observed in details? Which states you compare > with the session window state "merging-window-set", the "newKeysInState" > or "existingKeysInState"? > > BTW, since we use list state as main state for window operator and we use > RocksDB's merge operation for window state add operations, this would cause > the estimating of number keys inaccurate [1]: > // Estimation will be inaccurate when: > // (1) there exist merge keys > // (2) keys are directly overwritten > // (3) deletion on non-existing keys > // (4) low number of samples > > [1] > https://github.com/ververica/frocksdb/blob/49bc897d5d768026f1eb816d960c1f2383396ef4/db/version_set.cc#L919-L924 > > > > Best > Yun Tang > ------------------------------ > *From:* Vishal Santoshi <vishal.santo...@gmail.com> > *Sent:* Monday, March 15, 2021 5:48 > *To:* user <user@flink.apache.org> > *Subject:* Re: Question about > session_aggregate.merging-window-set.rocksdb_estimate-num-keys > > All I can think is, that any update on a state key, which I do in my > ProcessFunction, creates an update ( essentially an append on rocksdb ) > which does render the previous value for the key, a tombstone , but that > need not reflect on the count ( as double or triple counts ) atomically, > thus the called as an "estimate" , but was not anticipating this much > difference ... > > On Sun, Mar 14, 2021 at 5:32 PM Vishal Santoshi <vishal.santo...@gmail.com> > wrote: > > The reason I ask is that I have a "Process Window Function" on that > Session Window and I keep key scoped Global State. I maintain a TTL on > that state ( that is outside the Window state ) that is roughly the > current WM + lateness. > > I would imagine that keys for that custom state are *roughly* equal to > the number of keys in the "merging-window-set" . It seems twice that number > but does follow the slope. I am trying to figure out why this deviation. > > public void process(KEY key, > ProcessWindowFunction<KeyedSession<KEY, VALUE>, KeyedSessionWithSessionID< > KEY, VALUE>, KEY, TimeWindow>.Context context, > Iterable<KeyedSession<KEY, VALUE>> elements, Collector< > KeyedSessionWithSessionID<KEY, VALUE>> out) > throws Exception { > // scoped to the key > if (state.value() == null) { > this.newKeysInState.inc(); > state.update(new IntervalList()); > }else{ > this.existingKeysInState.inc(); > } > > On Sun, Mar 14, 2021 at 3:32 PM Vishal Santoshi <vishal.santo...@gmail.com> > wrote: > > Hey folks, > > Was looking at this very specific metric > "session_aggregate.merging-window-set.rocksdb_estimate-num-keys". Does > this metric also represent session windows ( it is a session window ) that > have lateness on them ? In essence if the session window was closed but has > a lateness of a few hours would those keys still be counted against this > metric. > > I think they should as it is an estimate keys for the Column Family for > the operator and if the window has not been GCed then the key for those > Windows should be in RocksDB but wanted to be sure. > > Regards. > > >