It’s a good idea to make the progress of a large, ongoing window available.
+1 from my side.

> On July 4, 2019, at 11:41, vino yang <yanghua1...@gmail.com> wrote:
> 
> Hi folks,
> 
> Currently, queryable state is not widely used in production. IMO, there
> are two key reasons for this. 1) The queryable state client is hard to
> use, because it requires users to know the address of the TaskManager and
> the port of the proxy. In practice, most business users do not have good
> knowledge of Flink's internals and runtime in production.
> 2) The benefit of this feature has not been fully exploited. In the Flink
> DataStream API, state is a first-class citizen; it is Flink's key
> advantage over other compute engines, and queryable state is the most
> effective way to inspect the latest computing progress.
> 
> Three months ago, I started a discussion about improving the queryable
> state and introducing a proxy component.[1] It attracted a lot of attention
> and discussion. Recently, I submitted a design document for the
> proposal.[2] These efforts address the first problem.
> 
> As for the second problem, the most essential solution is to make
> queryable state genuinely useful. The window operator is one of the
> most valuable and most frequently used Flink operators, and it also
> uses keyed state, which is queryable. So we propose to make the
> state of the window operator queryable. This would not only increase
> the value of queryable state but also meet real business needs.
> 
> IMO, allowing window state to be queried will provide great value. In many
> scenarios, we use large windows for aggregate calculations. A very
> common example is a day-level window that counts the page views (PV) of a
> day. But users are usually not satisfied with waiting until the end of the
> window to get the result. They want "intermediate results" at a smaller
> time granularity to analyze trends. Because Flink does not provide periodic
> triggers for fixed windows, we have extended it and implemented an
> "incremental window". It can trigger a fixed window at a smaller interval
> and emit intermediate results. However, we believe that this
> approach is still not flexible enough. We should let the user query the
> current calculation result of the window through the API at any time.
> 
> However, I know that before we can implement this, some details still
> need to be discussed, such as how to let users know the state
> descriptors used in the window, the namespace, and so on.
> 
> This discussion thread is mainly to listen to the community's opinion on
> this proposal.
> 
> Any feedback and ideas are welcome and appreciated.
> 
> Best,
> Vino
> 
> [1]:
> http://mail-archives.apache.org/mod_mbox/flink-dev/201907.mbox/%3ctencent_35a56d6858408be2e2064...@qq.com%3E
> [2]:
> https://docs.google.com/document/d/181qYVIiHQGrc3hCj3QBn1iEHF4bUztdw4XO8VSaf_uI/edit?usp=sharing
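
To make the day-level PV example from the quoted message concrete, here is a minimal toy model in plain Python. It is not Flink code and does not use any Flink API; the class name `QueryableDayWindowCounter` and its methods are hypothetical, chosen only for illustration. It shows the idea of reading a window's running count at any time instead of waiting for the window to fire, with the count stored per (key, window start) in the way keyed state is scoped by a window namespace.

```python
from collections import defaultdict

DAY_MS = 24 * 60 * 60 * 1000  # size of the tumbling window in milliseconds
HOUR_MS = 60 * 60 * 1000


class QueryableDayWindowCounter:
    """Toy stand-in (hypothetical, not a Flink class) for a day-level window."""

    def __init__(self):
        # (key, window_start) -> running count; the window start plays the
        # role of the namespace that scopes the keyed state.
        self._counts = defaultdict(int)

    @staticmethod
    def window_start(timestamp_ms):
        # Align a timestamp to the start of its day-level tumbling window.
        return timestamp_ms - (timestamp_ms % DAY_MS)

    def add(self, key, timestamp_ms):
        # Incrementally aggregate one page view into its window.
        self._counts[(key, self.window_start(timestamp_ms))] += 1

    def query(self, key, timestamp_ms):
        # Read the intermediate result of the window containing timestamp_ms
        # without firing or clearing the window.
        return self._counts.get((key, self.window_start(timestamp_ms)), 0)

    def fire(self, key, window_start_ms):
        # Emit the final result and drop the state, as at the end of a window.
        return self._counts.pop((key, window_start_ms), 0)


counter = QueryableDayWindowCounter()
for ts in (1 * HOUR_MS, 2 * HOUR_MS, 5 * HOUR_MS):
    counter.add("/home", ts)
counter.add("/home", DAY_MS + HOUR_MS)  # falls into the next day's window

print(counter.query("/home", 6 * HOUR_MS))  # prints 3
```

In this sketch a caller must supply both the key and a timestamp that identifies the window, which mirrors the open question raised above about how users would learn the state descriptors and namespace of a real window operator.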

