Re: RE: [DISCUSS] Improve Queryable State and introduce a QueryServerProxy component

vino yang Thu, 04 Jul 2019 00:37:45 -0700

Hi Jiayi,

Thanks for your comments.


They are valuable. I have accepted them and refined my design document.

Please take another look when you have time.

Best,
Vino

On Wed, Jul 3, 2019 at 2:48 PM, bupt_ljy <[email protected]> wrote:

> Hi vino,
> Big +1 for this.
>
> Glad to see new progress on this topic! I’ve left some comments on it.
>
>
> Best Regards,
>
> Jiayi Liao
>
>  Original Message
> *Sender:* vino yang<[email protected]>
> *Recipient:* Georgi Stoyanov<[email protected]>
> *Cc:* dev<[email protected]>; user<[email protected]>; Stefan
> Richter<[email protected]>; Aljoscha Krettek<[email protected]>;
> [email protected]<[email protected]>; Stephan Ewen<[email protected]>;
> [email protected]<[email protected]>; Tzu-Li (Gordon) Tai<[email protected]>
> *Date:* Tuesday, Jul 2, 2019 16:45
> *Subject:* Re: RE: [DISCUSS] Improve Queryable State and introduce
> a QueryServerProxy component
>
> Hi all,
>
> Since my last message, I have refined the design of this proposal and
> written a design document with more detailed diagrams and descriptions, to
> make the discussion easier. [1]
>
> Note: The document is not yet complete; for example, the "Implementation"
> section is missing. It is therefore still open for discussion. I will fill
> in the rest while gathering opinions from the community.
>
> More discussion and feedback are welcome and appreciated.
>
> Best,
> Vino
>
> [1]:
> https://docs.google.com/document/d/181qYVIiHQGrc3hCj3QBn1iEHF4bUztdw4XO8VSaf_uI/edit?usp=sharing
>
>
> On Fri, Jun 7, 2019 at 11:32 PM, yanghua1127 <[email protected]> wrote:
>
>> Hi Georgi,
>>
>> Thanks for your feedback. And glad to hear you are using queryable state.
>>
>> I agree that option 1 is easier to implement than the others. However,
>> when designing the new architecture we need to consider more aspects, e.g.
>> scalability, so option 3 seems more suitable. Some committers, such as
>> Stefan, Gordon and Aljoscha, have already given me feedback and
>> direction.
>>
>> I am currently writing the design document. Once it is ready to be
>> presented, I will post it to this thread and we can discuss further details.
>>
>> ----
>> Best,
>> Vino
>>
>>
>> On 2019-06-07 19:03 , Georgi Stoyanov <[email protected]> Wrote:
>>
>> Hi Vino,
>>
>>
>>
>> I was investigating the current architecture and AFAIK the first proposal
>> will be a lot easier to implement, because the JM currently has the
>> information about the states (where, which, etc.) thanks to
>> KvStateLocationRegistry. Correct me if I’m wrong.
>>
>> We are using the feature, and it’s indeed not very cool to iterate through
>> ports, check which TM is the responsible one, etc.
>>
>>
>>
>> It would be very useful if one of the committers joined the thread and
>> gave us some insight into what’s going to happen with that feature.
>>
>>
>>
>>
>>
>> Kind Regards,
>>
>> Georgi
>>
>>
>>
>>
>>
>>
>>
>> *From:* vino yang <[email protected]>
>> *Sent:* Thursday, April 25, 2019 5:18 PM
>> *To:* dev <[email protected]>; user <[email protected]>
>> *Cc:* Stefan Richter <[email protected]>; Aljoscha Krettek <
>> [email protected]>; [email protected]
>> *Subject:* [DISCUSS] Improve Queryable State and introduce a
>> QueryServerProxy component
>>
>>
>>
>> Hi all,
>>
>>
>>
>> I want to share my thoughts with you about improving queryable state
>> and introducing a QueryServerProxy component.
>>
>>
>>
>> I think the current queryable state client is hard to use, because it
>> requires users to know the TaskManager's address and the proxy's port.
>> In production, some business users do not have good knowledge of Flink's
>> internals or runtime, yet they sometimes need to query the values of
>> states.
>>
>>
>>
>> IMO, this problem is caused by the queryable state architecture.
>> Currently, queryable state clients interact with the query state client
>> proxy components hosted on each TaskManager. This design makes it hard to
>> encapsulate the points of change and exposes too much detail to the user.
>>
>>
>>
>> My idea is to introduce a real queryable state server, named e.g.
>> *QueryStateProxyServer*, which would accept all query state requests,
>> look up the local registry, and then redirect each request to the
>> specific *QueryStateClientProxy* (which runs on each TaskManager). This
>> server is the only thing users would need to care about; they would no
>> longer need to know the TaskManagers' addresses or the proxies' ports. The
>> current *QueryStateClientProxy* would become *QueryStateProxyClient*.
>>
>>
>>
>> Generally speaking, the roles of the QueryStateProxyServer are listed below:
>>
>>
>>
>>    - works as the proxy for all query clients, receiving all requests
>>    and sending responses;
>>    - acts as a router that redirects the actual query requests to the
>>    specific proxy client;
>>    - maintains the route table registry (state <-> TaskManager,
>>    TaskManager <-> proxy client address);
>>    - offers more fine-grained control, such as result caching, ACL, TTL,
>>    SLA (rate limiting) and so on.
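A minimal sketch of the router and route-table roles listed above, written in Python only for brevity. Every name in it (`RouteTable`, `register_task_manager`, `register_state`, `resolve`, `fake_forward`, the addresses) is hypothetical and not part of Flink's API. It also assumes one TaskManager per state name, as in the "state <-> TaskManager" mapping above; real keyed state is spread over several TaskManagers, which this sketch ignores.

```python
from dataclasses import dataclass, field


@dataclass
class RouteTable:
    """Hypothetical registry: state -> TaskManager and TaskManager -> proxy client address."""
    state_to_tm: dict = field(default_factory=dict)
    tm_to_proxy: dict = field(default_factory=dict)

    def register_task_manager(self, tm_id, proxy_address):
        self.tm_to_proxy[tm_id] = proxy_address

    def register_state(self, state_name, tm_id):
        self.state_to_tm[state_name] = tm_id

    def resolve(self, state_name):
        """Return the proxy client address that serves the given state."""
        tm_id = self.state_to_tm.get(state_name)
        if tm_id is None:
            raise KeyError(f"unknown state: {state_name}")
        return self.tm_to_proxy[tm_id]


class QueryStateProxyServer:
    """Single entry point: clients only ever talk to this server."""

    def __init__(self, routes, forward):
        self.routes = routes
        # forward is a callable(address, state_name, key) -> value that
        # stands in for the network call to the per-TaskManager proxy client.
        self.forward = forward

    def query(self, state_name, key):
        address = self.routes.resolve(state_name)
        return self.forward(address, state_name, key)


def fake_forward(address, state_name, key):
    # Assumption: a placeholder for the real request to the proxy client.
    return f"{state_name}[{key}] via {address}"


routes = RouteTable()
routes.register_task_manager("tm-1", "10.0.0.1:9069")
routes.register_state("word-count", "tm-1")
server = QueryStateProxyServer(routes, fake_forward)
result = server.query("word-count", "flink")
```

The client passes only a state name and a key; the TaskManager address and proxy port stay inside the server. Caching, ACL and rate limiting would wrap `query` and are left out here.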
>>
>> As for the implementation, there are three options:
>>
>>
>>
>> opt 1:
>>
>>
>>
>> Let the JobManager act as the query proxy server.
>>
>> ·  pros: reuses the existing JM; not introducing a new process
>> reduces complexity;
>>
>> ·  cons: puts a heavy burden on the JM and, depending on the query
>> frequency, may affect its stability
>>
>>
>>
>> [image: Screen Shot 2019-04-25 at 5.12.07 PM.png]
>>
>>
>>
>> opt 2:
>>
>>
>>
>> Introduce a new component that runs as its own process and acts as the
>> query proxy server:
>>
>>
>>
>> ·  pros: reduces the burden on the JM and makes it more stable
>>
>> ·  cons: introducing a new component makes the implementation more
>> complex
>>
>> [image: Screen Shot 2019-04-25 at 5.14.05 PM.png]
>>
>>
>>
>> opt 3 (suggestion comes from Stefan Richter):
>>
>>
>>
>> Combining the two options, the query server could run as a single entry
>> point (process) and integrate with the JobManager.
>>
>>
>>
>> If we keep it well encapsulated, the only difference would be how we
>> register new TMs with the query server in the different scenarios: in the
>> JM we might have this information already; in standalone mode, e.g., the
>> TMs would be started with the query server address to register with. This
>> would give the convenience of starting QS with the JM and the flexibility
>> for power users to reduce load on their JM.
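A rough sketch of the point above that only TaskManager registration differs between the two deployment modes. All names (`QueryServer`, `start_embedded`, `TaskManager.start`) and addresses are made up for illustration and do not correspond to Flink classes; passing the server object to `start` stands in for configuring the query server address on the TM.

```python
class QueryServer:
    """Hypothetical query server; it is the same in both deployment modes."""

    def __init__(self):
        self.task_managers = {}  # tm_id -> proxy client address

    def register(self, tm_id, proxy_address):
        self.task_managers[tm_id] = proxy_address


def start_embedded(known_task_managers):
    """Embedded mode: the JM already knows its TMs and hands them over."""
    server = QueryServer()
    for tm_id, address in known_task_managers.items():
        server.register(tm_id, address)
    return server


class TaskManager:
    def __init__(self, tm_id, proxy_address):
        self.tm_id = tm_id
        self.proxy_address = proxy_address

    def start(self, query_server):
        """Standalone mode: the TM is given the query server and registers itself."""
        query_server.register(self.tm_id, self.proxy_address)


# Embedded: registration is driven by the JM's existing knowledge.
embedded = start_embedded({"tm-1": "10.0.0.1:9069"})

# Standalone: registration is driven by each TM at startup.
standalone = QueryServer()
TaskManager("tm-1", "10.0.0.1:9069").start(standalone)
```

Both paths end with the same registry contents, so the query path itself would not need to know which mode is in use.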
>>
>>
>>
>> IMO, queryable state is a very valuable feature. It lets users query
>> real-time measurement results. I hope it will get the attention of
>> the community.
>>
>>
>>
>> This is just a rough idea. If it is valuable to the community, I will
>> write a design draft.
>>
>>
>>
>> What's your opinion? Any feedback and comments are welcome!
>>
>>
>>
>> Best,
>>
>> Vino.
>>
>>
>>
>>
