Hi Georgi,

Thanks for your feedback. I'm glad to hear you are using queryable state.

I agree that option 1 is easier to implement than the others. However, when
designing the new architecture we need to consider more aspects, e.g.
scalability, so option 3 seems more suitable. Some committers, such as Stefan,
Gordon and Aljoscha, have already given me feedback and direction.

I am currently writing the design document. Once it is ready to be presented, I
will post it to this thread and we can discuss further details.

----
Best,
Vino

On 2019-06-07 19:03, Georgi Stoyanov wrote:

Hi Vino,

I was investigating the
current architecture and AFAIK the first proposal will be a lot easier to
implement, because the JM currently has the information about the states (where,
which, etc., thanks to KvStateLocationRegistry; correct me if I’m wrong).

We are using the feature, and it’s indeed not very cool to iterate through
ports, check which TM is the responsible one, and so on. It would be very useful
if one of the committers joined the thread and gave us some insight into what’s
going to happen
with that feature.

Kind Regards,
Georgi

From: vino yang <yanghua1...@gmail.com>
Sent: Thursday, April 25, 2019 5:18 PM
To: dev <dev@flink.apache.org>; user <u...@flink.apache.org>
Cc: Stefan Richter <s.rich...@ververica.com>; Aljoscha Krettek <aljos...@apache.org>; kklou...@gmail.com
Subject: [DISCUSS] Improve Queryable State and introduce a QueryServerProxy component

Hi all,

I want to
share my thoughts with you about improving queryable state and introducing a
QueryServerProxy component.

I think the current queryable state client is hard to use, because it requires
users to know the TaskManager's address and the proxy's port. In production,
some business users do not have good knowledge of Flink's internals or runtime,
yet they sometimes need to query the values of states.

IMO, the root cause of this problem is the queryable state architecture.
Currently, queryable state clients interact with the query state client proxy
components hosted on each TaskManager. This design makes it difficult to
encapsulate the points of change and exposes too much detail to the user.

My personal idea is that we could introduce a real queryable state server,
named e.g. QueryStateProxyServer, which would accept all query state requests,
look up its local registry, and then redirect each request to the specific
QueryStateClientProxy (which runs on each TaskManager). This server is the only
thing users would need to care about, and it would hide the TaskManagers'
addresses and the proxies' ports from them. The current QueryStateClientProxy
would become QueryStateProxyClient.

Generally
speaking, the roles of the QueryStateProxyServer are listed below:

·  work as the proxy for all query clients, receiving all requests and sending
   the responses;
·  act as a router that redirects the real query requests to the specific
   proxy client;
·  maintain the route table registry (state <-> TaskManager,
   TaskManager <-> proxy client address);
·  provide more fine-grained control, such as result caching, ACL, TTL, SLA
   (rate limiting) and so on.

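
The route table role above could look roughly like the following minimal
sketch. All names here (QueryRouteTable, registerProxy, registerState, route)
are hypothetical and are not existing Flink classes or methods. The sketch also
assumes a single TaskManager per state name, which a real design would have to
refine, since keyed state is spread across TaskManagers.

```java
import java.net.InetSocketAddress;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: none of these names are existing Flink classes.
final class QueryRouteTable {
    // state name -> id of the TaskManager that hosts the state (simplifying assumption)
    private final Map<String, String> stateToTaskManager = new ConcurrentHashMap<>();
    // TaskManager id -> address of the proxy client running on that TaskManager
    private final Map<String, InetSocketAddress> taskManagerToProxy = new ConcurrentHashMap<>();

    void registerProxy(String taskManagerId, InetSocketAddress proxyAddress) {
        taskManagerToProxy.put(taskManagerId, proxyAddress);
    }

    void registerState(String stateName, String taskManagerId) {
        stateToTaskManager.put(stateName, taskManagerId);
    }

    // Resolves the proxy client address a query for the given state should be
    // redirected to; empty if the state or its TaskManager is unknown.
    Optional<InetSocketAddress> route(String stateName) {
        return Optional.ofNullable(stateToTaskManager.get(stateName))
                .map(taskManagerToProxy::get);
    }
}
```

With such a table, a client would only send the state name to the server, and
the server would resolve the TaskManager address and proxy port on its behalf.
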
About the implementation, there are three options:

opt 1: Let the JobManager act as the query proxy server.
·  pros: reuses the existing JM; not having to introduce a new process reduces
   the complexity;
·  cons: would put a heavy burden on the JM and, depending on the query
   frequency, may impact its stability.

opt 2: Introduce a new component which runs as its own process and acts as the
query proxy server.
·  pros: reduces the burden on the JM and makes it more stable;
·  cons: introducing a new component makes the implementation more complex.

opt 3 (suggestion from Stefan Richter): Combine the two options: the query
server could run as a single entry point (process) and also integrate with the
JobManager. If we keep it well encapsulated, the only difference would be how
we register new TMs with the query server in the different scenarios: in the JM
we might have this information already, while in standalone mode the TMs could,
e.g., be started with the query server address to register with. This would
give the convenience of starting QS with the JM and the flexibility for power
users to reduce load on their JM.

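
To illustrate the point in opt 3 that only the registration path differs, here
is a rough in-process sketch. Every type and method name in it
(TaskManagerRegistry, QueryServer, EmbeddedRegistration, StandaloneTaskManager)
is hypothetical, and the remote call a standalone TM would make to the query
server address is replaced by a plain method call for brevity.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: these types are illustrative, not existing Flink classes.
interface TaskManagerRegistry {
    void registerTaskManager(String taskManagerId, String proxyAddress);
}

// The query server keeps the same registry regardless of how it is deployed.
final class QueryServer implements TaskManagerRegistry {
    private final Map<String, String> proxies = new ConcurrentHashMap<>();

    @Override
    public void registerTaskManager(String taskManagerId, String proxyAddress) {
        proxies.put(taskManagerId, proxyAddress);
    }

    String proxyAddressOf(String taskManagerId) {
        return proxies.get(taskManagerId);
    }

    int registeredCount() {
        return proxies.size();
    }
}

// Embedded mode: the JM already knows its TMs and forwards them to the
// query server it hosts.
final class EmbeddedRegistration {
    static void registerAll(Map<String, String> knownTaskManagers, TaskManagerRegistry registry) {
        knownTaskManagers.forEach(registry::registerTaskManager);
    }
}

// Standalone mode: each TM is started with a handle to the query server
// (in practice its address) and registers itself.
final class StandaloneTaskManager {
    private final String id;
    private final String proxyAddress;

    StandaloneTaskManager(String id, String proxyAddress) {
        this.id = id;
        this.proxyAddress = proxyAddress;
    }

    void start(TaskManagerRegistry queryServer) {
        queryServer.registerTaskManager(id, proxyAddress);
    }
}
```

Because both paths go through the same registry interface, the rest of the
query server would not need to know which deployment mode it is running in.
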
IMO, queryable state is a very valuable feature: it lets users query real-time
measurement results. I hope it will get the attention of the community.

This is just a rough idea. If it is valuable to the community, I will provide a
design draft. What's your opinion? Any feedback and comments are welcome!

Best,
Vino.
