Re: [DISCUSS] Improve Queryable State and introduce a QueryServerProxy component

Yu Li Sat, 27 Apr 2019 21:26:25 -0700

Glad to see discussions around QueryableState on the mailing list. It seems
we have brought a bigger scope into the discussion, namely what the data
model in Flink is and how to (or whether it is possible to) use Flink as a
database. I suggest opening another thread for this bigger topic, and
personally I think the first question to answer is what the relationship
between Flink ledger and QueryableState is.

Back to the user scenario itself, I'd like to post two open questions about
using QueryableState for ad-hoc queries:
1. Currently the isolation level of QueryableState is *Read Uncommitted*,
since a failover might happen and cause the data to roll back. Although the
"uncommitted" data will be replayed and eventually become consistent, the
application will see unstable query results. Some kinds of applications
could probably bear such a drawback, but which ones exactly?

2. Currently in Flink the sink is more commonly regarded as the "result
partition", and the state of operators in the pipeline is more like
"intermediate data". Querying state for debugging purposes is easy to
understand, but not for ad-hoc queries. In other words, what makes a user
prefer querying the state data instead of the sink? Or why do we need to
query the intermediate data instead of the result?
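
To make the first question concrete, here is a toy sketch of the *Read
Uncommitted* behaviour. It is not Flink code: the class and method names
(CountingOperator, fail_and_restore, and so on) are hypothetical, and the
"checkpoint" is just a copied integer. It only shows how a query issued
between a checkpoint and a failover can observe a value that is later
rolled back and then reappears after replay.

```python
# Toy model of a stateful operator with checkpoint/restore semantics.
# All names here are hypothetical; this does not use any Flink API.
class CountingOperator:
    def __init__(self):
        self.count = 0         # live state, visible to queries
        self.checkpointed = 0  # value captured by the last checkpoint

    def process(self, n=1):
        # Apply n incoming records to the live state.
        self.count += n

    def checkpoint(self):
        # Snapshot the live state.
        self.checkpointed = self.count

    def fail_and_restore(self):
        # A failover discards everything after the last checkpoint.
        self.count = self.checkpointed

    def query(self):
        # Queries read the live state directly (read uncommitted).
        return self.count


op = CountingOperator()
op.process(3)
op.checkpoint()        # state 3 is now covered by a checkpoint
op.process(2)          # two more records, not yet checkpointed
before = op.query()    # sees the uncommitted updates
op.fail_and_restore()  # failover rolls the state back
after = op.query()     # the earlier answer is no longer reproducible
op.process(2)          # the source replays the lost records
final = op.query()     # consistent again after replay
print(before, after, final)
```

A client polling this state would see the value go down and then back up,
which is the unstable query result described in question 1.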

Further back to the original topic of this thread, introducing a
QueryableStateProxy: I can see some careful consideration of the query
load on the proxy. However, under heavy load the pressure is not only on
query serving but also on metadata requests, which are handled by the JM
for now. So to relieve the pressure on the JM, we should also extract the
metadata serving task, and my suggestion is to introduce a new component
like *StateMetaServer* that takes over both the query and metadata serving
responsibilities.

Best Regards,
Yu

On Sat, 27 Apr 2019 at 11:58, vino yang <[email protected]> wrote:

> Hi Elias,
>
> I agree with your opinion that "*Flink jobs don't sufficiently meet these
> requirements to work as a replacement for a data store.*" Actually, I
> think that is obviously not Flink's goal. Roughly speaking (this is
> inexact), a database consists of two main parts: data query and data
> store. What Paul and I mean is the former.
>
> Yes, you have mentioned its major value: ad hoc queries and debugging
> (IMO, especially the former). Providing real-time calculation results is
> very important for some scenarios (such as real-time measures for
> real-time OLAP) over the long term (no window or a large window).
>
> So, my opinion: queryable state is not meant to replace data stores.
> However, if we could query state more conveniently, it would make
> streaming jobs behave more like a DB in terms of querying.
>
> Best,
> Vino.
>
> On Sat, 27 Apr 2019 at 1:30 AM, Elias Levy <[email protected]> wrote:
>
> > On Fri, Apr 26, 2019 at 1:41 AM vino yang <[email protected]> wrote:
> >
> > > You are right, currently, the queryable state has few users. And I
> > > totally agree with you, it makes streaming jobs work more like a DB.
> > >
> >
> > Alas, I don't think queryable state will really be used much in
> > production other than for ad hoc queries or debugging.  Real data stores
> > at scale are resilient, replicated, and have very low downtime.  In my
> > opinion, Flink jobs don't sufficiently meet these requirements to work as
> > a replacement for a data store.  Jobs too frequently fail and restart
> > because of checkpoint failures, particularly ones with large state.  And
> > when a job does restart, all too often local restore can't be used (e.g.
> > if you lose a node).  And since there is no fine-grained job recovery and
> > there are no hot replicas of the data, all the state will need to be
> > restored from the DFS, which for something with large state can take a
> > while.  It's a nice idea, just not realistic in practice.
> >
>
