Hi Kurt,
Dawid is currently working on making a tableEnv.fromElements/values()
kind of source possible in the future. We can use this to replace some
of the tests. Otherwise I guess we should come up with a better test
infrastructure to make defining source not necessary anymore.
Regards,
Timo
On 07.02.20 11:24, Kurt Young wrote:
Thanks all for your feedback, since no objection has been raised, I've
created
https://issues.apache.org/jira/browse/FLINK-15950 to track this issue.
Since this issue would require lots of tests adjustment before it really
happen,
it won't be done in a short time. Feel free to give feedback anytime here
or in jira
if you have other opinions.
Best,
Kurt
On Wed, Feb 5, 2020 at 8:26 PM Kurt Young <ykt...@gmail.com> wrote:
Hi Zhenghua,
After removing TableSource::getTableSchema, during optimization, I could
imagine
the schema information might come from relational nodes such as TableScan.
Best,
Kurt
On Wed, Feb 5, 2020 at 8:24 PM Kurt Young <ykt...@gmail.com> wrote:
Hi Jingsong,
Yes current TableFactory is not ideal for users to use either. I think we
should
also spend some time in 1.11 to improve the usability of TableEnvironment
when
users trying to read or write something. Automatic scheme inference would
be
one of them. Other from this, we also support convert a DataStream to
Table, which
can serve some flexible requirements to read or write data.
Best,
Kurt
On Wed, Feb 5, 2020 at 7:29 PM Zhenghua Gao <doc...@gmail.com> wrote:
+1 to remove these methods.
One concern about invocations of TableSource::getTableSchema:
By removing such methods, we can stop calling TableSource::getTableSchema
in some place(such
as BatchTableEnvImpl/TableEnvironmentImpl#validateTableSource,
ConnectorCatalogTable, TableSourceQueryOperation).
But in other place we need field types and names of the table source(such
as
BatchExecLookupJoinRule/StreamExecLookupJoinRule,
PushProjectIntoTableSourceScanRule,
CommonLookupJoin). So how should we deal with this?
*Best Regards,*
*Zhenghua Gao*
On Wed, Feb 5, 2020 at 2:36 PM Kurt Young <ykt...@gmail.com> wrote:
Hi all,
I'd like to bring up a discussion about removing registration of
TableSource and
TableSink in TableEnvironment as well as in ConnectTableDescriptor. The
affected
method would be:
TableEnvironment::registerTableSource
TableEnvironment::fromTableSource
TableEnvironment::registerTableSink
ConnectTableDescriptor::registerTableSource
ConnectTableDescriptor::registerTableSink
ConnectTableDescriptor::registerTableSourceAndSink
(Most of them are already deprecated, except for
TableEnvironment::fromTableSource,
which was intended to deprecate but missed by accident).
FLIP-64 [1] already explained why we want to deprecate TableSource &
TableSink from
user's interface. In a short word, these interfaces should only read &
write the physical
representation of the table, and they are not fitting well after we
already
introduced some
logical table fields such as computed column, watermarks.
Another reason is the exposure of registerTableSource in Table Env just
make the whole
SQL protocol opposite. TableSource should be used as a reader of
table, it
should rely on
other metadata information held by framework, which eventually comes
from
DDL or
ConnectDescriptor. But if we register a TableSource to Table Env, we
have
no choice but
have to rely on TableSource::getTableSchema. It will make the design
obscure, sometimes
TableSource should trust the information comes from framework, but
sometimes it should
also generate its own schema information.
Furthermore, if the authority about schema information is not clear, it
will make things much
more complicated if we want to improve the table api usability such as
introducing automatic
schema inference in the near future.
Since this is an API break change, I've also included user mailing
list to
gather more feedbacks.
Best,
Kurt
[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-64%3A+Support+for+Temporary+Objects+in+Table+module