Hi Elias,

I agree with your opinion that "*Flink jobs don't sufficiently meet these requirements to work as a replacement for a data store.*" Actually, I think that is obviously not Flink's goal. Roughly speaking, a database consists of two main parts: data query and data storage. What Paul and I mean is the former.
Yes, you have mentioned its major value: ad hoc queries and debugging (IMO, especially the former). Providing real-time calculation results over the long term (no window or a large window) is very important in some scenarios, such as real-time measures for real-time OLAP.

So, my opinion: queryable state is not meant to replace data stores. However, if we could query state more conveniently, it would make streaming jobs work more like a DB in the query aspect.

Best,
Vino

Elias Levy <fearsome.lucid...@gmail.com> wrote on Sat, Apr 27, 2019 at 1:30 AM:

> On Fri, Apr 26, 2019 at 1:41 AM vino yang <yanghua1...@gmail.com> wrote:
>
> > You are right, currently, the queryable state has few users. And I totally
> > agree with you, it makes the streaming works more like a DB.
>
> Alas, I don't think queryable state will really be used much in production
> other than for ad hoc queries or debugging. Real data stores at scale are
> resilient, replicated, and with very low downtime. In my opinion, Flink
> jobs don't sufficiently meet these requirements to work as a replacement
> for a data store. Jobs too frequently fail and restart because of
> checkpoint failures, particularly ones with large state. And when a job
> does restart, all too often local restore can't be used (e.g. if you lose
> a node). And since there is no fine grained job recovery and there are no
> hot replicas of the data, all the state will need to be restored from the
> DFS, which for something with large state can take a while. It's a nice
> idea, just not realistic in practice.