Hi Shammon,

Thank you for your answer and explanation. My latest experiment was a SELECT query and my assumptions were based on that; INSERT works as described.
Regarding the state of FLIP-295, I just checked out the recently created jiras [1], and if I can help out with any part, please let me know.

Cheers,
F

[1] https://issues.apache.org/jira/browse/FLINK-32427

------- Original Message -------
On Tuesday, June 27th, 2023 at 13:39, Shammon FY <zjur...@gmail.com> wrote:

> Hi Ferenc,
>
> If I understand correctly, there will be two types of jobs in sql-gateway:
> `SELECT` and `NON-SELECT` such as `INSERT`.
>
> 1. `SELECT` jobs need to collect results from the Flink cluster in a
> corresponding session of the sql gateway, and when the session is closed,
> the job should be canceled. These jobs are generally short queries similar
> to OLAP and I think it may be acceptable.
>
> 2. `NON-SELECT` jobs may be batch or streaming jobs, and when the jobs are
> submitted successfully, they won't be killed or canceled even if the
> session or sql-gateway is closed. After these jobs are successfully
> submitted, their lifecycle is no longer managed by the SQL gateway.
>
> I don't know if it covers your usage scenario. Could you describe yours for
> us to test and confirm?
>
> Best,
> Shammon FY
>
>
> On Tue, Jun 27, 2023 at 6:43 PM Ferenc Csaky ferenc.cs...@pm.me.invalid
> wrote:
>
> > Hi Jark,
> >
> > In the current implementation, any job submitted via the SQL Gateway has
> > to be done through a session, because all the operations are grouped
> > under sessions.
> >
> > Starting from there, if I close a session, that will close the
> > "SessionContext", which closes the "OperationManager" [1], and the
> > "OperationManager" closes all submitted operations tied to that session
> > [2], which results in closing all the jobs executed in the session.
> >
> > Maybe I am missing something, but my experience is that the jobs I submit
> > via the SQL Gateway are getting cleaned up on gateway session close.
> >
> > WDYT?
> >
> > Cheers,
> > F
> >
> > [1]
> > https://github.com/apache/flink/blob/149a5e34c1ed8d8943c901a98c65c70693915811/flink-table/flink-sql-gateway/src/main/java/org/apache/flink/table/gateway/service/context/SessionContext.java#L204
> > [2]
> > https://github.com/apache/flink/blob/149a5e34c1ed8d8943c901a98c65c70693915811/flink-table/flink-sql-gateway/src/main/java/org/apache/flink/table/gateway/service/operation/OperationManager.java#L194
> >
> > ------- Original Message -------
> > On Tuesday, June 27th, 2023 at 04:37, Jark Wu imj...@gmail.com wrote:
> >
> > > Hi Ferenc,
> > >
> > > But the job lifecycle is not tied to the SQL Gateway session.
> > > Even if the session is closed, the running jobs are not affected.
> > >
> > > Best,
> > > Jark
> > >
> > > On Tue, 27 Jun 2023 at 04:14, Ferenc Csaky ferenc.cs...@pm.me.invalid
> > > wrote:
> > >
> > > > Hi Jark,
> > > >
> > > > Thank you for pointing out FLIP-295 about catalog persistence, I was
> > > > not aware of the current state. Although as far as I see, persistent
> > > > catalogs are necessary, but not sufficient for achieving a
> > > > "persistent gateway".
> > > >
> > > > The current implementation ties the job lifecycle to the SQL gateway
> > > > session, so if it gets closed, it will cancel all the jobs. So that
> > > > would be the next step I think. Any work or thought regarding this
> > > > aspect? We are definitely willing to help out on this front.
> > > >
> > > > Cheers,
> > > > F
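To make the close chain quoted above a bit more concrete, below is a stripped-down sketch of why jobs go away with the session. The class names only mirror SessionContext and OperationManager from the linked sources; the bodies are simplified for illustration and are not the actual Flink code.

import java.util.ArrayList;
import java.util.List;

// Simplified illustration (not the real implementation) of the close chain:
// closing a session closes its SessionContext, which closes the
// OperationManager, which cancels every operation (and thus every job)
// submitted through that session.
public class SessionCloseSketch {

    /** Stand-in for a submitted operation that may be backed by a running job. */
    interface Operation {
        void cancel();
    }

    /** Plays the role of OperationManager: tracks and cancels operations. */
    static class OperationManager {
        private final List<Operation> operations = new ArrayList<>();

        void submit(Operation op) {
            operations.add(op);
        }

        void close() {
            // Every operation tied to the session is cancelled here, which is
            // why jobs are cleaned up when the gateway session is closed.
            operations.forEach(Operation::cancel);
            operations.clear();
        }
    }

    /** Plays the role of SessionContext: owns the OperationManager. */
    static class SessionContext {
        final OperationManager operationManager = new OperationManager();

        void close() {
            operationManager.close();
        }
    }

    public static void main(String[] args) {
        SessionContext session = new SessionContext();
        session.operationManager.submit(() -> System.out.println("job canceled"));
        session.close(); // prints "job canceled": the job dies with the session
    }
}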
> > > >
> > > > ------- Original Message -------
> > > > On Sunday, June 25th, 2023 at 06:23, Jark Wu imj...@gmail.com wrote:
> > > >
> > > > > Hi Ferenc,
> > > > >
> > > > > Making the SQL Gateway an easy-to-use platform infrastructure for
> > > > > Flink SQL is one of the important roadmap items [1].
> > > > >
> > > > > The persistence ability of the SQL Gateway is a major work item in
> > > > > the 1.18 release. One of the persistence demands is that the
> > > > > registered catalogs are currently kept in memory and lost when the
> > > > > Gateway restarts. There is an accepted FLIP (FLIP-295) [2] targeting
> > > > > this issue, so that the Gateway can persist the registered catalog
> > > > > information into files or databases.
> > > > >
> > > > > I'm not sure whether this is something you are looking for?
> > > > >
> > > > > Best,
> > > > > Jark
> > > > >
> > > > > [2]:
> > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-295%3A+Support+lazy+initialization+of+catalogs+and+persistence+of+catalog+configurations
> > > > >
> > > > > On Fri, 23 Jun 2023 at 00:25, Ferenc Csaky ferenc.cs...@pm.me.invalid
> > > > > wrote:
> > > > >
> > > > > > Hello devs,
> > > > > >
> > > > > > I would like to open a discussion about persistence possibilities
> > > > > > for the SQL Gateway. At Cloudera, we are happy to see the work
> > > > > > already done on this project and are looking for ways to utilize
> > > > > > it on our platform as well, but currently it lacks some features
> > > > > > that would be essential in our case, where we could help out.
> > > > > >
> > > > > > I am not sure if any thought went into gateway persistence
> > > > > > specifics already, and this feature could be implemented in
> > > > > > fundamentally different ways, so I think the first step could be
> > > > > > to agree on the basics.
> > > > > >
> > > > > > First, in my opinion, persistence should be an optional feature of
> > > > > > the gateway that can be enabled if desired. There can be a lot of
> > > > > > implementation details, but there are some major directions to
> > > > > > follow:
> > > > > >
> > > > > > - Utilize the Hive catalog: The Hive catalog can already be used
> > > > > > to have persistent meta-objects, so the crucial thing that would
> > > > > > be missing in this case is other catalogs. Personally, I would not
> > > > > > pursue this option, because in my opinion it would limit the
> > > > > > usability of this feature too much.
> > > > > > - Serialize the session as is: Saving the whole session (or its
> > > > > > context) [1] as is to durable storage, so it can be kept and
> > > > > > picked up again.
> > > > > > - Serialize the required elements (catalogs, tables, functions,
> > > > > > etc.), not necessarily as a whole: The main point here would be to
> > > > > > serialize a different object, so the persistent data will not be
> > > > > > that sensitive to changes of the session (or its context).
> > > > > > There can be numerous factors here, like trying to keep the model
> > > > > > close to the session itself, so the boilerplate required for the
> > > > > > mapping can be kept to a minimum, or focusing on saving what is
> > > > > > actually necessary, making the persistent storage more portable.
> > > > > >
> > > > > > WDYT?
> > > > > >
> > > > > > Cheers,
> > > > > > F
> > > > > >
> > > > > > [1]
> > > > > > https://github.com/apache/flink/blob/master/flink-table/flink-sql-gateway/src/main/java/org/apache/flink/table/gateway/service/session/Session.java
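To make the third direction quoted above a bit more tangible, a minimal sketch of persisting only the catalog options (rather than the whole session object) to a properties file could look like the one below. The class name and file layout are made up purely for illustration; the actual design is whatever FLIP-295 and the follow-up jiras end up defining.

import java.io.IOException;
import java.io.Reader;
import java.io.Writer;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical catalog store: keeps "catalogName.optionKey=value" entries in a
// flat properties file so a restarted gateway can re-register its catalogs.
public class CatalogStoreSketch {

    private final Path storeFile;

    public CatalogStoreSketch(Path storeFile) {
        this.storeFile = storeFile;
    }

    /** Persists one catalog's options, e.g. {"type": "hive", "default-database": "db"}. */
    public void save(String catalogName, Map<String, String> options) throws IOException {
        Properties props = new Properties();
        if (Files.exists(storeFile)) {
            try (Reader in = Files.newBufferedReader(storeFile)) {
                props.load(in); // keep already persisted catalogs
            }
        }
        options.forEach((k, v) -> props.setProperty(catalogName + "." + k, v));
        try (Writer out = Files.newBufferedWriter(storeFile)) {
            props.store(out, "registered catalogs");
        }
    }

    /** Loads the options of a previously persisted catalog after a gateway restart. */
    public Map<String, String> load(String catalogName) throws IOException {
        Properties props = new Properties();
        try (Reader in = Files.newBufferedReader(storeFile)) {
            props.load(in);
        }
        Map<String, String> options = new HashMap<>();
        String prefix = catalogName + ".";
        for (String key : props.stringPropertyNames()) {
            if (key.startsWith(prefix)) {
                options.put(key.substring(prefix.length()), props.getProperty(key));
            }
        }
        return options;
    }
}

With something like this, a restarted gateway could call load("my_catalog") and re-register the catalog without the user having to re-run the DDL, while the stored format stays decoupled from the session internals.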