Hi Yu,

OK, now I understand your comments more clearly.

To answer your two questions:

1. The value of this work

As I mentioned in my last reply to you: "we found the queryable state is
hard to use and it may cause few users to use this function. We may think
the reason and the result affect each other. And IMO, currently, the
queryable state's architecture caused this problem. So I opened a thread to
see how to improve them."

We are trying to improve this situation and break the cycle between the
cause and the result.

As for the value of queryable state itself, I do not think it needs further
clarification; the earlier replies from others in this thread have confirmed
it. We have not used this feature in critical scenarios, but many common
scenarios suit it, e.g.:

- the calculation period is very long, but a more fine-grained real-time
  result is needed, for example getting the current measure value for
  real-time OLAP, getting the consumed offset of a message system, and so on;
- debugging applications;
- ....

If queryable state offered a better user experience, IMO more and more users
would use this feature.

2. Duplicated work

I do not know. For now, the ledger project has not been merged into Flink's
repository. But I can ping @Stephan; he may want to answer this question.

About a complete, global plan and view, I totally agree with you. I did not
give more thought and detail earlier for the reason I already mentioned: I
did not know the community's opinion or whether this could be added to
Flink's roadmap.

All right, we can discuss more details. IMO, a more complete solution may
contain these steps:

- refactor the query client's API: with a meta-service, we may provide more
  useful APIs, e.g. scanning all keys or scanning a key range; obviously, the
  client API needs to be adjusted to provide the new information for queries;
- introduce a query proxy server, which contains a request router, metadata
  management/sync, ACL, SLA, and more plugins (I think a plugin architecture
  is a good choice) or sub-components:
  - interaction with the JobManager
  - interaction with the TaskManager
  - the plugins' loading strategy
- refactor the actual querier that runs on each TaskManager; it needs to
  interact with the query proxy server.

Obviously, each step can also be split into several sub-steps.

I look forward to your suggestions and guidance. Any questions, please let
me know.

Best,
Vino

Yu Li <l...@apache.org> wrote on Sun, Apr 28, 2019 at 3:40 PM:

> TL;DR: IMO a more complete solution is to cover both query and meta request
> serving in a new component. We could use the proposal here as step one but
> we should have a global plan. And before improving a seemingly not widely
> used feature, we'd better weigh the gain and efforts.
>
> Let me clarify the purpose of my previous questions, that before we start
> detailed design and code development, it's better to get consensus on:
> 1. What's the value of the work?
>     - As noticed, the queryable state feature has been implemented for some
>       while but not widely used in production (AFAIK), why? If it did been
>       used in critical scenarios, what those scenarios are?
>     - I think it's a good time discussing about this (since raised in this
>       thread by others) and confirm the value of efforts improving this
>       feature.
> 2. Would there be duplicated work?
>     - This is the main reason I asked about the relationship between ledger
>       and queryable-state.
>
> And some answers to the inline comments:
>
> bq. About the relationship between ledger and Queryable State, I also think
> it is out of this thread
> True, that's why I suggested to open another thread. But as mentioned
> above, the question is relative if we think about the whole.
>
> bq. Yes, the QueryableState's isolation level is *Read Uncommitted*...
> However, I think it would not affect we discuss how to improve the
> queryable state's architecture, right?
> Correct, but my real question here is what kind of application could bear
> the changing query result.
>
> bq. The intermediate data is also valuable, for example, we just need a
> partitioned data stream's real-time measure value.
> In this case there must be some complicated operation in the pipeline which
> causes long latency at sink? Could you talk more about the real-world case?
> Thanks.
>
> bq. Your worry is reasonable.
> Then I suggest to think it as a whole. We could split the implementation
> into steps, but better to have a global plan, to make it really applicable
> in production (under heavy load).
>
> Best Regards,
> Yu
>
>
> On Sun, 28 Apr 2019 at 14:48, vino yang <yanghua1...@gmail.com> wrote:
>
> > Hi yu,
> >
> > Thanks for your reply. I have some inline comment.
> >
> > Yu Li <l...@apache.org> wrote on Sun, Apr 28, 2019 at 12:24 PM:
> >
> > > Glad to see discussions around QueryableState in mailing list, and it
> > > seems we have included a bigger scope in the discussion, that what's
> > > the data model in Flink and how to (or is it possible to) use Flink as
> > > a database. I suggest to open another thread for this bigger topic and
> > > personally I think the first question should be answered is what's the
> > > relationship between Flink ledger and QueryableState.
> > >
> >
> > *About the scope, yes, it seems it's big. Actually, I think the questions
> > you provided make it bigger than I have done.*
> > *Here I think we don't need to answer the two questions(we can discuss in
> > another thread, or answer it later).*
> >
> > *My original thought is that we found the queryable state is hard to use
> > and it may cause few users to use this function. We may think the reason
> > and the result affect each other. And IMO, currently, the queryable
> > state's architecture caused this problem. So I opened a thread to see how
> > to improve them.*
> >
> > *We mentioned these keywords e.g. "state, database" is to emphasize the
> > queryable state is very important. The data model and use Flink as a
> > database is not this thread's main topic (as Elias's reply said, many
> > issues cause the road to this goal is so long). This thread I assume we
> > do not change the state's core design and the goal is to bring a better
> > query solution.*
> >
> > *About the relationship between ledger and Queryable State, I also think
> > it is out of this thread.*
> >
> > > Back to the user scenario itself, I'd like to post two open questions
> > > about QueryableState for ad-hoc query:
> > > 1. Currently the isolation level of QueryableState is *Read Uncommitted*
> > > since failover might happen and cause data rollback. Although the
> > > "uncommitted" data will be replayed again and get final consistency,
> > > application will see unstable query result. Probably some kind of
> > > applications could bear such drawback but what exactly?
> > >
> >
> > *Yes, the QueryableState's isolation level is *Read Uncommitted*. I think
> > if we need a higher isolation level, may need other mechanisms to
> > guarantee this. I am sorry, I can not give the solution.*
> > *However, I think it would not affect we discuss how to improve the
> > queryable state's architecture, right?*
> >
> > > 2. Currently in Flink sink is more commonly regarded as the "result
> > > partition" and state of operators in the pipeline more like
> > > "intermediate data". Used for debugging purpose is easy to understand
> > > but not for ad-hoc query. Or in another word, what makes user prefer
> > > querying the state data instead of sink? Or why we need to query the
> > > intermediate data instead of the result?
> > >
> >
> > *About the opinion that state of operators in the pipeline more like
> > "intermediate data". Yes, you are right. It's intermediate data, and we
> > need it in some scene.*
> > *The valuable is that it represents "real-time". When querying a state,
> > we need its current value, we can not wait for sink. The intermediate
> > data is also valuable, for example, we just need a partitioned data
> > stream's real-time measure value.*
> >
> > > Further back to the original topic proposed in this thread about
> > > introducing a QueryableStateProxy, I could see some careful
> > > consideration on query load on the proxy. However, under heavy load the
> > > pressure is not only on query serving but also on meta requesting,
> > > which is handled by JM for now. So to release JM pressure, we should
> > > also extract the meta serving task out, and my suggestion is to
> > > introduce a new component like *StateMetaServer* and take over both
> > > query and meta serving responsibilities.
> > >
> >
> > *I think the opinion of metadata's pressure and *StateMetaServer* are
> > good. We need to care about them when we design.*
> > *I mentioned the meta info(registry) in the two option's simple
> > architecture picture. Although, I just emphasized the query proxy server,
> > because it is the main component.*
> >
> > *Your worry is reasonable. The proxy server's architecture is good for
> > processing this, such as the mechanisms of request flow control, pressure
> > transfer to a single entry point(for opt2 and opt3, we can serve
> > meta-query in a single process).*
> >
> > *Anyway, it just opened a discussion to listen to the community's
> > opinion.*
> >
> > > Best Regards,
> > > Yu
> > >
> > >
> > > On Sat, 27 Apr 2019 at 11:58, vino yang <yanghua1...@gmail.com> wrote:
> > >
> > > > Hi Elias,
> > > >
> > > > I agree with your opinion that "*Flink jobs don't sufficiently meet
> > > > these requirements to work as a replacement for a data store.*".
> > > > Actually, I think it's obviously not Flink's goal. If we think that
> > > > the database contains the main two parts(inexactitude): data query
> > > > and data store. What I and Paul mean is the former.
> > > >
> > > > Yes, you have mentioned its major value: ad hoc and debugging(IMO,
> > > > especially for the former). To give a real-time calculation result is
> > > > very important for some scene(such as real-time measure for real-time
> > > > OLAP) in a long-term (no-window or large window).
> > > >
> > > > So, my opinion: Queryable state is not dedicated to replacing data
> > > > stores. However, if we could query state more conveniently, it makes
> > > > the streaming works more like DB in query aspect.
> > > >
> > > > Best,
> > > > Vino.
> > > >
> > > > Elias Levy <fearsome.lucid...@gmail.com> wrote on Sat, Apr 27, 2019
> > > > at 1:30 AM:
> > > >
> > > > > On Fri, Apr 26, 2019 at 1:41 AM vino yang <yanghua1...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > You are right, currently, the queryable state has few users. And
> > > > > > I totally agree with you, it makes the streaming works more like
> > > > > > a DB.
> > > > > >
> > > > >
> > > > > Alas, I don't think queryable state will really be used much in
> > > > > production other than for ad hoc queries or debugging. Real data
> > > > > stores at scale are resilient, replicated, and with very low
> > > > > downtime. In my opinion, Flink jobs don't sufficiently meet these
> > > > > requirements to work as a replacement for a data store. Jobs too
> > > > > frequently fail and restart because of checkpoint failures,
> > > > > particularly ones with large state. And when a job does restart,
> > > > > all too often local restore can't be used (e.g. if you lose a
> > > > > node). And since there is no fine grained job recovery and there is
> > > > > no hot replicas of the data, all the state will need to be restored
> > > > > from the DFS, which for something with large state can take a
> > > > > while. It's a nice idea, just not realistic in practice.