Hi Jark,

In the current implementation, any job submitted via the SQL Gateway has to go 
through a session, because all the operations are grouped under sessions.

Starting from there, if I close a session, that will close the 
"SessionContext", which closes the "OperationManager" [1], and the 
"OperationManager" closes all submitted operations tied to that session [2], 
which results in closing all the jobs executed in the session.
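To illustrate that chain, here is a minimal sketch. These are not the actual 
Flink classes, just simplified stand-ins named after them to show how closing 
the session context cascades down to the submitted operations:

```java
import java.util.ArrayList;
import java.util.List;

public class SessionCloseSketch {

    // Stand-in for an Operation tied to a submitted job.
    static final class Operation {
        boolean closed = false;
        void close() { closed = true; }  // in Flink this also cancels the job
    }

    // Stand-in for OperationManager: owns all operations of one session.
    static final class OperationManager {
        final List<Operation> operations = new ArrayList<>();
        Operation submit() {
            Operation op = new Operation();
            operations.add(op);
            return op;
        }
        void close() {
            // Closing the manager closes every submitted operation.
            for (Operation op : operations) {
                op.close();
            }
        }
    }

    // Stand-in for SessionContext: closing it closes the OperationManager.
    static final class SessionContext {
        final OperationManager operationManager = new OperationManager();
        void close() {
            operationManager.close();
        }
    }

    public static void main(String[] args) {
        SessionContext session = new SessionContext();
        Operation op1 = session.operationManager.submit();
        Operation op2 = session.operationManager.submit();
        session.close();
        // Both operations were closed as a side effect of closing the session.
        System.out.println(op1.closed && op2.closed);
    }
}
```

So, under this reading of the code, nothing in the chain distinguishes 
finished operations from ones backing still-running jobs.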

Maybe I am missing something, but my experience is that the jobs I submit via 
the SQL Gateway get cleaned up when the gateway session is closed.

WDYT?

Cheers,
F

[1] 
https://github.com/apache/flink/blob/149a5e34c1ed8d8943c901a98c65c70693915811/flink-table/flink-sql-gateway/src/main/java/org/apache/flink/table/gateway/service/context/SessionContext.java#L204
[2] 
https://github.com/apache/flink/blob/149a5e34c1ed8d8943c901a98c65c70693915811/flink-table/flink-sql-gateway/src/main/java/org/apache/flink/table/gateway/service/operation/OperationManager.java#L194



------- Original Message -------
On Tuesday, June 27th, 2023 at 04:37, Jark Wu <imj...@gmail.com> wrote:


> 
> 
> Hi Ferenc,
> 
> But the job lifecycle isn't tied to the SQL Gateway session.
> Even if the session is closed, the running jobs are not affected.
> 
> Best,
> Jark
> 
> 
> 
> 
> On Tue, 27 Jun 2023 at 04:14, Ferenc Csaky ferenc.cs...@pm.me.invalid
> 
> wrote:
> 
> > Hi Jark,
> > 
> > Thank you for pointing out FLIP-295 about catalog persistence, I was not
> > aware of its current state. Although as far as I can see, persistent
> > catalogs are necessary, but not sufficient, for achieving a "persistent
> > gateway".
> > 
> > The current implementation ties the job lifecycle to the SQL gateway
> > session, so if it gets closed, all the jobs are cancelled. Decoupling
> > that would be the next step, I think. Is there any work or thought
> > regarding this aspect? We are definitely willing to help out on this
> > front.
> > 
> > Cheers,
> > F
> > 
> > ------- Original Message -------
> > On Sunday, June 25th, 2023 at 06:23, Jark Wu imj...@gmail.com wrote:
> > 
> > > Hi Ferenc,
> > > 
> > > Making the SQL Gateway an easy-to-use platform infrastructure for Flink
> > > SQL is one of the important roadmap items [1].
> > > 
> > > Persistence for the SQL Gateway is a major work item in the 1.18
> > > release. One of the persistence demands is that registered catalogs are
> > > currently kept in memory and lost when the Gateway restarts. There is an
> > > accepted FLIP (FLIP-295) [2] targeting this issue, which will make the
> > > Gateway able to persist the registered catalog information into files
> > > or databases.
> > > 
> > > I'm not sure whether this is something you are looking for?
> > > 
> > > Best,
> > > Jark
> > > 
> > > [2]:
> > 
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-295%3A+Support+lazy+initialization+of+catalogs+and+persistence+of+catalog+configurations
> > 
> > > On Fri, 23 Jun 2023 at 00:25, Ferenc Csaky ferenc.cs...@pm.me.invalid
> > > 
> > > wrote:
> > > 
> > > > Hello devs,
> > > > 
> > > > I would like to open a discussion about persistence possibilities for
> > > > the SQL Gateway. At Cloudera, we are happy to see the work already
> > > > done on this project and are looking for ways to utilize it on our
> > > > platform as well, but currently it lacks some features that would be
> > > > essential in our case, and where we could help out.
> > > > 
> > > > I am not sure if any thought went into gateway persistence specifics
> > > > already, and this feature could be implemented in fundamentally
> > > > different ways, so I think the first step could be to agree on the
> > > > basics.
> > > > 
> > > > First, in my opinion, persistence should be an optional feature of
> > > > the gateway that can be enabled if desired. There are a lot of
> > > > implementation details, but a few major directions to follow:
> > > > 
> > > > - Utilize the Hive catalog: The Hive catalog can already be used to
> > > > have persistent meta-objects, so the crucial thing missing in this
> > > > case is support for other catalogs. Personally, I would not pursue
> > > > this option, because in my opinion it would limit the usability of
> > > > this feature too much.
> > > > - Serialize the session as is: Save the whole session (or its
> > > > context) [1] as is to durable storage, so it can be kept and picked
> > > > up again.
> > > > - Serialize the required elements (catalogs, tables, functions,
> > > > etc.), not necessarily as a whole: The main point here would be to
> > > > serialize a different object, so the persistent data will not be that
> > > > sensitive to changes in the session (or its context). There are
> > > > numerous factors here, like trying to keep the model close to the
> > > > session itself, so the boilerplate required for the mapping can be
> > > > kept minimal, or focusing on saving only what is actually necessary,
> > > > making the persistent storage more portable.
> > > > 
> > > > WDYT?
> > > > 
> > > > Cheers,
> > > > F
> > > > 
> > > > [1]
> > 
> > https://github.com/apache/flink/blob/master/flink-table/flink-sql-gateway/src/main/java/org/apache/flink/table/gateway/service/session/Session.java
