Hi Paul,

Thanks for your reply.

You are right: queryable state currently has few users. I also totally agree with you that it makes streaming work more like a DB.

About the architecture and the concern you raised: yes, the query server may affect the JobManager if they are deployed together. I think it is important to guarantee the JobManager's availability and stability, and the QueryProxyServer is only a secondary service component. That is why I mentioned an SLA policy when describing the role of the QueryProxyServer. I think it is one solution, but the details still need to be discussed.

About starting a queryable state client with a cmd script: I think it's a good and valuable idea.

Best,
Vino.

Paul Lam <paullin3...@gmail.com> wrote on Fri, Apr 26, 2019 at 3:31 PM:

> Hi Vino,
>
> Thanks a lot for bringing up the discussion! Queryable state has been in
> beta for a long time, and due to its complexity and instability I think
> there are not many users, but there's great value in it: it brings state
> one step closer to being a database.
>
> WRT the architecture, I'd vote for opt 3, because it fits the cloud
> architecture best and avoids putting more burden on the JM (queries can
> sometimes be slow and resource intensive). My concern is that on many
> cluster frameworks the container resources are limited (IIUC, the JM and
> QS are running in the same container). Would the JM get killed if the QS
> eats up too much memory?
>
> And a minor suggestion: can we introduce a cmd script to set up a
> QueryableStateClient? That would be easier for users who want to try out
> this feature.
>
> Best,
> Paul Lam
>
> > On Apr 26, 2019, at 11:09, vino yang <yanghua1...@gmail.com> wrote:
> >
> > Hi Quan,
> >
> > Thanks for your reply.
> >
> > Actually, I have not tried that approach.
> >
> > But there are two factors we should consider:
> >
> >    1. The local state storage is not the same as RocksDB; otherwise
> >       Flink would not need to provide a queryable state client. What's
> >       more, querying RocksDB is still an address-explicit action.
> >    2. IMO, the more valuable part of the proposal is making the
> >       queryable state's architecture more reasonable, so that it
> >       encapsulates more details and scales better.
> >
> > Best,
> > Vino
> >
> > Shi Quan <qua...@outlook.com> wrote on Fri, Apr 26, 2019 at 10:38 AM:
> >
> >> Hi,
> >>
> >> How about taking state from RocksDB directly? In that case, the TM
> >> host is unnecessary.
> >>
> >> Best
> >>
> >> Quan Shi
> >>
> >> ________________________________
> >> From: vino yang <yanghua1...@gmail.com>
> >> Sent: Thursday, April 25, 2019 10:18:20 PM
> >> To: dev; user
> >> Cc: Stefan Richter; Aljoscha Krettek; kklou...@gmail.com
> >> Subject: [DISCUSS] Improve Queryable State and introduce a
> >> QueryServerProxy component
> >>
> >> Hi all,
> >>
> >> I want to share my thoughts with you about improving queryable state
> >> and introducing a QueryServerProxy component.
> >>
> >> I think the current queryable state client is hard to use, because it
> >> requires users to know the TaskManager's address and the proxy's port.
> >> In practice, some business users do not have good knowledge of Flink's
> >> internals or its runtime in production, yet they sometimes need to
> >> query the values of states.
> >>
> >> IMO, this problem is caused by the queryable state's architecture.
> >> Currently, the queryable state clients interact with query state
> >> client proxy components hosted on each TaskManager. This design makes
> >> it difficult to encapsulate the points of change and exposes too much
> >> detail to the user.
> >>
> >> My personal idea is that we could introduce a real queryable state
> >> server, named e.g. QueryStateProxyServer, which would act as the
> >> delegate for all query state requests, query the local registry, and
> >> then redirect each request to the specific QueryStateClientProxy
> >> (which runs on each TaskManager). The server is the only thing users
> >> really need to care about.
> >> It would also mean users no longer need to know the TaskManagers'
> >> addresses and the proxies' ports. The current QueryStateClientProxy
> >> would become QueryStateProxyClient.
> >>
> >> Generally speaking, the roles of the QueryStateProxyServer are listed
> >> below:
> >>
> >>   * works as the proxy for all query clients, receiving all requests
> >>     and sending the responses;
> >>   * acts as a router that redirects the actual query requests to the
> >>     specific proxy client;
> >>   * maintains the route table registry (state <-> TaskManager,
> >>     TaskManager <-> proxy client address);
> >>   * provides more fine-grained control, such as result caching, ACL,
> >>     TTL, SLA (rate limit) and so on.
> >>
> >> About the implementation, there are three opts:
> >>
> >> opt 1:
> >>
> >> Let the JobManager act as the query proxy server.
> >>
> >>   * pros: reuses the existing JM; not introducing a new process
> >>     reduces the complexity;
> >>   * cons: puts a heavy burden on the JM and, depending on the query
> >>     frequency, may impact its stability
> >>
> >> [Screen Shot 2019-04-25 at 5.12.07 PM.png]
> >>
> >> opt 2:
> >>
> >> Introduce a new component which runs as a single process and acts as
> >> the query proxy server:
> >>
> >>   * pros: reduces the burden on the JM and makes it more stable
> >>   * cons: introducing a new component makes the implementation more
> >>     complex
> >>
> >> [Screen Shot 2019-04-25 at 5.14.05 PM.png]
> >>
> >> opt 3 (suggestion comes from Stefan Richter):
> >>
> >> Combining the two opts, the query server could run as a single entry
> >> point (process) and integrate with the JobManager.
> >>
> >> If we keep it well encapsulated, the only difference would be how we
> >> register new TMs with the query server in the different scenarios: in
> >> the JM we might have this information already, while in standalone
> >> mode the TMs could e.g. be started with the query server address to
> >> register with. This would give the convenience of starting the QS with
> >> the JM and the flexibility for power users to reduce load on their JM.
> >>
> >> IMO, queryable state is a very valuable feature. It lets users query
> >> real-time measurement results. I hope it will get the attention of
> >> the community.
> >>
> >> This is just a rough idea. If it is valuable to the community, I will
> >> provide a design draft.
> >>
> >> What's your opinion? Any feedback and comments are welcome!
> >>
> >> Best,
> >> Vino.
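
To make the "route table registry" role from the original proposal a bit more concrete, here is a minimal Java sketch of the two mappings it mentions (state <-> TaskManager, TaskManager <-> proxy client address). Everything in it is an assumption for illustration: the class name QueryRouteTable, its methods, and the use of a plain string as the TaskManager id are hypothetical and not part of Flink. It also simplifies by mapping each state name to a single TaskManager, whereas real keyed state is partitioned across TaskManagers.

```java
import java.net.InetSocketAddress;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration only: none of these names exist in Flink.
final class QueryRouteTable {
    // state name -> id of the TaskManager that holds the state (simplified)
    private final Map<String, String> stateToTaskManager = new ConcurrentHashMap<>();
    // TaskManager id -> address of the proxy client running on that TaskManager
    private final Map<String, InetSocketAddress> taskManagerToProxy = new ConcurrentHashMap<>();

    // Called when a TaskManager announces its proxy client to the query server.
    void registerTaskManager(String taskManagerId, String host, int port) {
        // createUnresolved avoids a DNS lookup at registration time
        taskManagerToProxy.put(taskManagerId, InetSocketAddress.createUnresolved(host, port));
    }

    // Records which TaskManager serves a given queryable state.
    void registerState(String stateName, String taskManagerId) {
        stateToTaskManager.put(stateName, taskManagerId);
    }

    // Drops a TaskManager and every state route that pointed at it.
    void unregisterTaskManager(String taskManagerId) {
        taskManagerToProxy.remove(taskManagerId);
        stateToTaskManager.values().removeIf(taskManagerId::equals);
    }

    // Router step: state name -> TaskManager -> proxy client address.
    // Returns an empty Optional if either mapping is missing.
    Optional<InetSocketAddress> resolve(String stateName) {
        return Optional.ofNullable(stateToTaskManager.get(stateName))
                .map(taskManagerToProxy::get);
    }
}
```

With a table like this, a client would only need the query server's address: the server calls resolve with the requested state name and forwards the request to the returned proxy client. How TaskManagers register (via the JM or by being started with the query server address, as in opt 3) would only change who calls registerTaskManager.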