Hi Timo, Thanks for your proposal. I think this is a nice feature for users and I prefer option 3.
I only have one concern about the concept of pseudo-column or system-column, because this is the first time we introduce it in Flink SQL. The confusion is similar to the question of Benchao and Sergey about the propagation of pseudo-column. >From my understanding, a pseudo-column can be get from an arbitrary query, just similar to ROWNUM in Oracle[1], such as : SELECT * FROM (SELECT * FROM employees ORDER BY employee_id) WHERE ROWNUM < 11; However, IIUC, the proposed "$rowtime" pseudo-column can only be got from the physical table and can't be got from queries even if the query propagates the rowtime attribute. There was also a discussion about adding a pseudo-column "_proctime" [2] to make lookup join easier to use which can be got from arbitrary queries. That "_proctime" may conflict with the proposed pseudo-column concept. Did you consider making it as a built-in defined pseudo-column "$rowtime" which returns the time attribute value (if exists) or null (if non-exists) for every table/query, and pseudo-column "$proctime" always returns PROCTIME() value for each table/query. In this way, catalogs only need to provide a default rowtime attribute and users can get it in the same way. And we don't need to introduce the contract interface of "Metadata Key Prefix Constraint" which is still a little complex for users and devs to understand. Best, Jark [1]: https://docs.oracle.com/cd/E11882_01/server.112/e41084/pseudocolumns009.htm#SQLRF00255 [2]: https://lists.apache.org/thread/7ln106qxyw8sp7ljq40hs2p1lb1gdwj5 On Fri, 28 Jul 2023 at 06:18, Alexey Leonov-Vendrovskiy < vendrov...@gmail.com> wrote: > > > > `SELECT * FROM (SELECT $rowtime, * FROM t);` > > Am I right that it will show `$rowtime` in output ? > > > Yes, all explicitly selected columns become a part of the result (and > intermediate) schema, and hence propagate. > > On Thu, Jul 27, 2023 at 2:40 PM Alexey Leonov-Vendrovskiy < > vendrov...@gmail.com> wrote: > > > Thank you, Timo, for starting this FLIP! > > > > I propose the following change: > > > > Remove the requirement that DESCRIBE need to show system columns. > > > > > > Some concrete vendor specific catalog implementations might prefer this > > approach. > > Usually the same system columns are available on all (or family) of > > tables, and it can be easily captured in the documentation. > > > > For example, BigQuery does exactly this: there, pseudo-columns do not > show > > up in the table schema in any place, but can be accessed via reference. > > > > So I propose we: > > a) Either we say that DESCRIBE doesn't show system columns, > > b) Or leave this vendor-specific / or configurable via flag (if needed). > > > > Regards, > > Alexey > > > > On Thu, Jul 27, 2023 at 3:27 AM Sergey Nuyanzin <snuyan...@gmail.com> > > wrote: > > > >> Hi Timo, > >> > >> Thanks for the FLIP. > >> I also tend to think that Option 3 is better. > >> > >> I would be also interested in a question mentioned by Benchao Li. > >> And a similar question about nested queries like > >> `SELECT * FROM (SELECT $rowtime, * FROM t);` > >> Am I right that it will show `$rowtime` in output ? > >> > >> > >> On Thu, Jul 27, 2023 at 6:58 AM Benchao Li <libenc...@apache.org> > wrote: > >> > >> > Hi Timo, > >> > > >> > Thanks for the FLIP, I also like the idea and option 3 sounds good to > >> me. > >> > > >> > I would like to discuss a case which is not mentioned in the current > >> FLIP. > >> > How are the "System column"s expressed in intermediate result, e.g. > >> Join? > >> > E.g. `SELECT * FROM t1 JOIN t2`, I guess it should not include "system > >> > columns" from t1 and t2 as you proposed, and for `SELECT t1.$rowtime, > * > >> > FROM t1 JOIN t2`, it should also be valid. > >> > Then the question is how to you plan to implement the "system > columns", > >> do > >> > we need to add it to `RelNode` level? Or we just need to do it in the > >> > parsing/validating phase? > >> > I'm not sure that Calcite's "system column" feature is fully ready for > >> this > >> > since the code about this part is imported from the earlier project > >> before > >> > it gets into Apache, and has not been considered much in the past > >> > development. > >> > > >> > > >> > Jing Ge <j...@ververica.com.invalid> 于2023年7月26日周三 00:01写道: > >> > > >> > > Hi Timo, > >> > > > >> > > Thanks for your proposal. It is a very pragmatic feature. Among all > >> > options > >> > > in the FLIP, option 3 is one I prefer too and I'd like to ask some > >> > > questions to understand your thoughts. > >> > > > >> > > 1. I did some research on pseudo columns, just out of curiosity, do > >> you > >> > > know why most SQL systems do not need any prefix with their pseudo > >> > column? > >> > > 2. Some platform providers will use ${variable_name} to define their > >> own > >> > > configurations and allow them to be embedded into SQL scripts. Will > >> there > >> > > be any conflict with option 3? > >> > > > >> > > Best regards, > >> > > Jing > >> > > > >> > > On Tue, Jul 25, 2023 at 7:00 PM Konstantin Knauf <kna...@apache.org > > > >> > > wrote: > >> > > > >> > > > Hi Timo, > >> > > > > >> > > > this makes sense to me. Option 3 seems reasonable, too. > >> > > > > >> > > > Cheers, > >> > > > > >> > > > Konstantin > >> > > > > >> > > > Am Di., 25. Juli 2023 um 12:53 Uhr schrieb Timo Walther < > >> > > > twal...@apache.org > >> > > > >: > >> > > > > >> > > > > Hi everyone, > >> > > > > > >> > > > > I would like to start a discussion about introducing the concept > >> of > >> > > > > "System Columns" in SQL and Table API. > >> > > > > > >> > > > > The subject sounds bigger than it actually is. Luckily, Flink > SQL > >> > > > > already exposes the concept of metadata columns. And this > >> proposal is > >> > > > > just a slight adjustment for how metadata columns can be used as > >> > system > >> > > > > columns. > >> > > > > > >> > > > > The biggest problem of metadata columns currently is that a > >> catalog > >> > > > > implementation can't provide them by default because they would > >> > affect > >> > > > > `SELECT *` when adding another one. > >> > > > > > >> > > > > Looking forward to your feedback on FLIP-348: > >> > > > > > >> > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-348%3A+Support+System+Columns+in+SQL+and+Table+API > >> > > > > > >> > > > > Thanks, > >> > > > > Timo > >> > > > > > >> > > > > >> > > > > >> > > > -- > >> > > > https://twitter.com/snntrable > >> > > > https://github.com/knaufk > >> > > > > >> > > > >> > > >> > > >> > -- > >> > > >> > Best, > >> > Benchao Li > >> > > >> > >> > >> -- > >> Best regards, > >> Sergey > >> > > >