Hi Timo,

Thanks for your proposal. I think this is a nice feature for users and I
prefer option 3.

I only have one concern about the concept of pseudo-column or
system-column,
because this is the first time we introduce it in Flink SQL. The
confusion is similar to the
question of Benchao and Sergey about the propagation of pseudo-column.

>From my understanding, a pseudo-column can be get from an arbitrary query,
just similar to
ROWNUM in Oracle[1], such as :

SELECT *
FROM (SELECT * FROM employees ORDER BY employee_id)
WHERE ROWNUM < 11;

However, IIUC, the proposed "$rowtime" pseudo-column can only be got from
the physical table
and can't be got from queries even if the query propagates the rowtime
attribute. There was also
a discussion about adding a pseudo-column "_proctime" [2] to make lookup
join easier to use
which can be got from arbitrary queries. That "_proctime" may conflict with
the proposed
pseudo-column concept.

Did you consider making it as a built-in defined pseudo-column "$rowtime"
which returns the
time attribute value (if exists) or null (if non-exists) for every
table/query, and pseudo-column
 "$proctime" always returns PROCTIME() value for each table/query. In this
way, catalogs only need
 to provide a default rowtime attribute and users can get it in the same
way. And we don't need
to introduce the contract interface of "Metadata Key Prefix Constraint"
which is still a little complex
 for users and devs to understand.

Best,
Jark

[1]:
https://docs.oracle.com/cd/E11882_01/server.112/e41084/pseudocolumns009.htm#SQLRF00255
[2]: https://lists.apache.org/thread/7ln106qxyw8sp7ljq40hs2p1lb1gdwj5




On Fri, 28 Jul 2023 at 06:18, Alexey Leonov-Vendrovskiy <
vendrov...@gmail.com> wrote:

> >
> > `SELECT * FROM (SELECT $rowtime, * FROM t);`
> > Am I right that it will show `$rowtime` in output ?
>
>
> Yes, all explicitly selected columns become a part of the result (and
> intermediate) schema, and hence propagate.
>
> On Thu, Jul 27, 2023 at 2:40 PM Alexey Leonov-Vendrovskiy <
> vendrov...@gmail.com> wrote:
>
> > Thank you, Timo, for starting this FLIP!
> >
> > I propose the following change:
> >
> > Remove the requirement that DESCRIBE need to show system columns.
> >
> >
> > Some concrete vendor specific catalog implementations might prefer this
> > approach.
> > Usually the same system columns are available on all (or family) of
> > tables, and it can be easily captured in the documentation.
> >
> > For example, BigQuery does exactly this: there, pseudo-columns do not
> show
> > up in the table schema in any place, but can be accessed via reference.
> >
> > So I propose we:
> > a) Either we say that DESCRIBE doesn't show system columns,
> > b) Or leave this vendor-specific / or configurable via flag (if needed).
> >
> > Regards,
> > Alexey
> >
> > On Thu, Jul 27, 2023 at 3:27 AM Sergey Nuyanzin <snuyan...@gmail.com>
> > wrote:
> >
> >> Hi Timo,
> >>
> >> Thanks for the FLIP.
> >> I also tend to think that Option 3 is better.
> >>
> >> I would be also interested in a question mentioned by Benchao Li.
> >> And a similar question about nested queries like
> >> `SELECT * FROM (SELECT $rowtime, * FROM t);`
> >> Am I right that it will show `$rowtime` in output ?
> >>
> >>
> >> On Thu, Jul 27, 2023 at 6:58 AM Benchao Li <libenc...@apache.org>
> wrote:
> >>
> >> > Hi Timo,
> >> >
> >> > Thanks for the FLIP, I also like the idea and option 3 sounds good to
> >> me.
> >> >
> >> > I would like to discuss a case which is not mentioned in the current
> >> FLIP.
> >> > How are the "System column"s expressed in intermediate result, e.g.
> >> Join?
> >> > E.g. `SELECT * FROM t1 JOIN t2`, I guess it should not include "system
> >> > columns" from t1 and t2 as you proposed, and for `SELECT t1.$rowtime,
> *
> >> > FROM t1 JOIN t2`, it should also be valid.
> >> > Then the question is how to you plan to implement the "system
> columns",
> >> do
> >> > we need to add it to `RelNode` level? Or we just need to do it in the
> >> > parsing/validating phase?
> >> > I'm not sure that Calcite's "system column" feature is fully ready for
> >> this
> >> > since the code about this part is imported from the earlier project
> >> before
> >> > it gets into Apache, and has not been considered much in the past
> >> > development.
> >> >
> >> >
> >> > Jing Ge <j...@ververica.com.invalid> 于2023年7月26日周三 00:01写道:
> >> >
> >> > > Hi Timo,
> >> > >
> >> > > Thanks for your proposal. It is a very pragmatic feature. Among all
> >> > options
> >> > > in the FLIP, option 3 is one I prefer too and I'd like to ask some
> >> > > questions to understand your thoughts.
> >> > >
> >> > > 1. I did some research on pseudo columns, just out of curiosity, do
> >> you
> >> > > know why most SQL systems do not need any prefix with their pseudo
> >> > column?
> >> > > 2. Some platform providers will use ${variable_name} to define their
> >> own
> >> > > configurations and allow them to be embedded into SQL scripts. Will
> >> there
> >> > > be any conflict with option 3?
> >> > >
> >> > > Best regards,
> >> > > Jing
> >> > >
> >> > > On Tue, Jul 25, 2023 at 7:00 PM Konstantin Knauf <kna...@apache.org
> >
> >> > > wrote:
> >> > >
> >> > > > Hi Timo,
> >> > > >
> >> > > > this makes sense to me. Option 3 seems reasonable, too.
> >> > > >
> >> > > > Cheers,
> >> > > >
> >> > > > Konstantin
> >> > > >
> >> > > > Am Di., 25. Juli 2023 um 12:53 Uhr schrieb Timo Walther <
> >> > > > twal...@apache.org
> >> > > > >:
> >> > > >
> >> > > > > Hi everyone,
> >> > > > >
> >> > > > > I would like to start a discussion about introducing the concept
> >> of
> >> > > > > "System Columns" in SQL and Table API.
> >> > > > >
> >> > > > > The subject sounds bigger than it actually is. Luckily, Flink
> SQL
> >> > > > > already exposes the concept of metadata columns. And this
> >> proposal is
> >> > > > > just a slight adjustment for how metadata columns can be used as
> >> > system
> >> > > > > columns.
> >> > > > >
> >> > > > > The biggest problem of metadata columns currently is that a
> >> catalog
> >> > > > > implementation can't provide them by default because they would
> >> > affect
> >> > > > > `SELECT *` when adding another one.
> >> > > > >
> >> > > > > Looking forward to your feedback on FLIP-348:
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-348%3A+Support+System+Columns+in+SQL+and+Table+API
> >> > > > >
> >> > > > > Thanks,
> >> > > > > Timo
> >> > > > >
> >> > > >
> >> > > >
> >> > > > --
> >> > > > https://twitter.com/snntrable
> >> > > > https://github.com/knaufk
> >> > > >
> >> > >
> >> >
> >> >
> >> > --
> >> >
> >> > Best,
> >> > Benchao Li
> >> >
> >>
> >>
> >> --
> >> Best regards,
> >> Sergey
> >>
> >
>

Reply via email to