Hi Jiayi,

Thanks for your reply. I'm glad to hear you have already put effort into this, and a contribution would be very welcome.
I also want to explore it in depth. For now, let's listen to the community's opinions.

Best,
Vino.

bupt_ljy <bupt_...@163.com> wrote on Fri, Apr 26, 2019 at 9:54 PM:

> Hi Yang,
>
> +1 for this proposal. Queryable state is very commonly used in our
> scenarios, where we debug and query the real-time status of streaming
> processes such as CEP. We've also done a lot to improve the "user
> experience" of this feature, such as exposing the TaskManager's proxy
> port in TaskManagerInfo.
>
> I'm looking forward to a more detailed and deeper discussion, and I'd
> like to contribute back to the community on this.
>
> Best Regards,
> Jiayi Liao
>
> Original Message
> Sender: vino yang <yanghua1...@gmail.com>
> Recipient: dev@flink.apache.org
> Date: Friday, Apr 26, 2019 16:41
> Subject: Re: [DISCUSS] Improve Queryable State and introduce a
> QueryServerProxy component
>
> Hi Paul,
>
> Thanks for your reply. You are right that queryable state currently has
> few users. And I totally agree with you that it makes streaming jobs
> work more like a DB.
>
> About the architecture and the problem you are concerned about: yes, it
> may affect the JobManager if they are deployed together. I think it's
> important to guarantee the JobManager's availability and stability, and
> the QueryProxyServer is just a secondary service component. That is why
> I mentioned an SLA policy when describing the role of the
> QueryProxyServer. I think it's a solution, but the details need to be
> discussed.
>
> About starting a queryable state client with a cmd: I think it's a good
> and valuable idea.
>
> Best,
> Vino.
>
> Paul Lam <paullin3...@gmail.com> wrote on Fri, Apr 26, 2019 at 3:31 PM:
>
> Hi Vino,
>
> Thanks a lot for bringing up the discussion! Queryable state has been in
> beta for a long time, and due to its complexity and instability I think
> there are not many users, but there's great value in it: it brings
> "state as a database" one step closer.
>
> WRT the architecture, I'd vote for opt 3, because it fits the cloud
> architecture best and avoids putting more burden on the JM (sometimes
> the queries could be slow and resource intensive). My concern is that on
> many cluster frameworks the container resources are limited (IIUC, the
> JM and QS are running in the same container). Would the JM get killed if
> the QS eats up too much memory?
>
> And a minor suggestion: can we introduce a cmd script to set up a
> QueryableStateClient? That would be easier for users who want to try out
> this feature.
>
> Best,
> Paul Lam
>
> On Apr 26, 2019, at 11:09, vino yang <yanghua1...@gmail.com> wrote:
>
> Hi Quan,
>
> Thanks for your reply. Actually, I have not tried this approach. But
> there are two factors we should consider:
>
> 1. The local state storage is not the same thing as RocksDB; otherwise
>    Flink would not need to provide a queryable state client. What's
>    more, querying RocksDB is still an address-explicit action.
> 2. IMO, the proposal's more valuable suggestion is to make the queryable
>    state's architecture more reasonable, let it encapsulate more
>    details, and improve its scalability.
>
> Best,
> Vino
>
> Shi Quan <qua...@outlook.com> wrote on Fri, Apr 26, 2019 at 10:38 AM:
>
> Hi,
>
> How about taking state from RocksDB directly? In that case, the TM host
> is unnecessary.
>
> Best
> Quan Shi
>
> ________________________________
> From: vino yang <yanghua1...@gmail.com>
> Sent: Thursday, April 25, 2019 10:18:20 PM
> To: dev; user
> Cc: Stefan Richter; Aljoscha Krettek; kklou...@gmail.com
> Subject: [DISCUSS] Improve Queryable State and introduce a
> QueryServerProxy component
>
> Hi all,
>
> I want to share my thoughts with you about improving the queryable state
> and introducing a QueryServerProxy component.
>
> I think the current queryable state client is hard to use, because it
> requires users to know the TaskManager's address and the proxy's port.
> In practice, some business users do not have good knowledge of Flink's
> internals or of the runtime in production.
> However, sometimes they need to query the values of state.
>
> IMO, this problem is caused by the queryable state's architecture.
> Currently, the queryable state clients interact with the query state
> client proxy components hosted on each TaskManager. This design makes it
> difficult to encapsulate the points of change and exposes too much
> detail to the user.
>
> My personal idea is that we could introduce a real queryable state
> server, named e.g. QueryStateProxyServer, which would accept all query
> state requests, look up the local registry, and then redirect each
> request to the specific QueryStateClientProxy (which runs on each
> TaskManager). The server is the only thing users really need to care
> about, and it would free them from having to know the TaskManagers'
> addresses and the proxies' ports. The current QueryStateClientProxy
> would become QueryStateProxyClient.
>
> Generally speaking, the roles of the QueryStateProxyServer are listed
> below:
>
> * work as the proxy for all query clients, receiving all requests and
>   sending responses;
> * act as a router that redirects the real query requests to the specific
>   proxy client;
> * maintain the route table registry (state - TaskManager, TaskManager -
>   proxy client address);
> * provide more fine-grained control, such as result caching, ACL, TTL,
>   SLA (rate limiting) and so on.
>
> About the implementation, there are three opts:
>
> opt 1: Let the JobManager act as the query proxy server.
>
> * pros: reuses the existing JM; not having to introduce a new process
>   reduces complexity;
> * cons: would put a heavy burden on the JM and, depending on the query
>   frequency, may impact its stability.
>
> [Screen Shot 2019-04-25 at 5.12.07 PM.png]
>
> opt 2: Introduce a new component which runs as a single process and acts
> as the query proxy server:
>
> * pros: reduces the burden on the JM and makes it more stable;
> * cons: introducing a new component makes the implementation more
>   complex.
>
> [Screen Shot 2019-04-25 at 5.14.05 PM.png]
>
> opt 3 (suggestion from Stefan Richter): Combining the two opts, the
> query server could run as a single entry point (process) and integrate
> with the JobManager.
>
> If we keep it well encapsulated, the only difference would be how we
> register new TMs with the query server in the different scenarios: in
> the JM we might have this information already, while in standalone mode,
> e.g., the TMs would be started with the query server address to register
> with. This would give the convenience of starting the QS with the JM and
> the flexibility for power users to reduce load on their JM.
>
> IMO, the queryable state is a very valuable feature. It lets users query
> some real-time measurement results. I hope it will get the attention of
> the community.
>
> This is just a rough thought. If it is valuable to the community, I will
> provide a design draft.
>
> What's your opinion? Any feedback and comments are welcome!
>
> Best,
> Vino.
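
To make the routing role in the proposal a bit more concrete, here is a rough sketch of the "route table registry" idea (state - TaskManager, TaskManager - proxy client address). It is only an illustration in Python, not Flink code: the class `QueryStateProxyServerSketch`, its method names, and the host/port values are all hypothetical. It also deliberately ignores that keyed state is partitioned across TaskManagers by key, so a real registry would need a finer-grained lookup than one entry per state name.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class QueryStateProxyServerSketch:
    """Hypothetical router: resolves a state name to a TaskManager proxy address."""

    # state name -> TaskManager id (assumed registry layout, not a Flink structure)
    state_to_tm: Dict[str, str] = field(default_factory=dict)
    # TaskManager id -> (host, port) of the proxy client running on it
    tm_to_proxy: Dict[str, Tuple[str, int]] = field(default_factory=dict)

    def register_task_manager(self, tm_id: str, host: str, port: int) -> None:
        """Record where the proxy client of a TaskManager can be reached."""
        self.tm_to_proxy[tm_id] = (host, port)

    def register_state(self, state_name: str, tm_id: str) -> None:
        """Record which TaskManager serves a queryable state."""
        if tm_id not in self.tm_to_proxy:
            raise KeyError(f"unknown TaskManager: {tm_id}")
        self.state_to_tm[state_name] = tm_id

    def route(self, state_name: str) -> Tuple[str, int]:
        """Return the proxy address a query for this state should be forwarded to."""
        tm_id = self.state_to_tm[state_name]
        return self.tm_to_proxy[tm_id]


if __name__ == "__main__":
    server = QueryStateProxyServerSketch()
    # Example values only; in the proposal the TMs would register themselves.
    server.register_task_manager("tm-1", "10.0.0.11", 9069)
    server.register_state("cep-pattern-state", "tm-1")
    host, port = server.route("cep-pattern-state")
    print(f"forward query to {host}:{port}")
```

With something along these lines, a client would only need the server's address and a state name. Whether the registry lives in the JM (opt 1), in a separate process (opt 2), or behind a shared entry point (opt 3) would not change this lookup, only how `register_task_manager` gets called.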