Hi Kurt,

No, there is no JIRA ticket yet. But in any case, I think it is better to have good testing infrastructure that abstracts source generation, sink generation, test data, etc. Even if we introduce tableEnv.values(), it will not solve everything, because time-based operations might need time attributes and so on.

Using DDL in tests should also be avoided because strings are even more difficult to maintain.

Regards,
Timo


On 08.02.20 04:29, Kurt Young wrote:
Hi Timo,

tableEnv.fromElements/values() sounds good. Do we have a JIRA ticket to
track the issue?

Best,
Kurt


On Fri, Feb 7, 2020 at 10:56 PM Timo Walther <twal...@apache.org> wrote:

Hi Kurt,

Dawid is currently working on making a tableEnv.fromElements/values()
kind of source possible in the future. We can use this to replace some
of the tests. Otherwise, I guess we should come up with a better test
infrastructure so that defining a source is no longer necessary.

Regards,
Timo


On 07.02.20 11:24, Kurt Young wrote:
Thanks all for your feedback. Since no objection has been raised, I've
created https://issues.apache.org/jira/browse/FLINK-15950 to track this
issue.

Since this change requires adjusting a lot of tests before it can really
happen, it won't be done in a short time. Feel free to give feedback
anytime, here or in JIRA, if you have other opinions.

Best,
Kurt


On Wed, Feb 5, 2020 at 8:26 PM Kurt Young <ykt...@gmail.com> wrote:

Hi Zhenghua,

After removing TableSource::getTableSchema, I could imagine that during
optimization the schema information would come from relational nodes
such as TableScan.

Best,
Kurt


On Wed, Feb 5, 2020 at 8:24 PM Kurt Young <ykt...@gmail.com> wrote:

Hi Jingsong,

Yes, the current TableFactory is not ideal for users either. I think we
should also spend some time in 1.11 to improve the usability of
TableEnvironment when users try to read or write something. Automatic
schema inference would be one such improvement. Other than this, we also
support converting a DataStream to a Table, which can serve some
flexible requirements for reading or writing data.

Best,
Kurt


On Wed, Feb 5, 2020 at 7:29 PM Zhenghua Gao <doc...@gmail.com> wrote:

+1 to remove these methods.

One concern about invocations of TableSource::getTableSchema:
By removing such methods, we can stop calling TableSource::getTableSchema
in some places (such as
BatchTableEnvImpl/TableEnvironmentImpl#validateTableSource,
ConnectorCatalogTable, TableSourceQueryOperation).

But in other places we need the field types and names of the table source
(such as BatchExecLookupJoinRule/StreamExecLookupJoinRule,
PushProjectIntoTableSourceScanRule, CommonLookupJoin). So how should we
deal with this?

*Best Regards,*
*Zhenghua Gao*


On Wed, Feb 5, 2020 at 2:36 PM Kurt Young <ykt...@gmail.com> wrote:

Hi all,

I'd like to bring up a discussion about removing the registration of
TableSource and TableSink in TableEnvironment as well as in
ConnectTableDescriptor. The affected methods would be:

TableEnvironment::registerTableSource
TableEnvironment::fromTableSource
TableEnvironment::registerTableSink
ConnectTableDescriptor::registerTableSource
ConnectTableDescriptor::registerTableSink
ConnectTableDescriptor::registerTableSourceAndSink

(Most of them are already deprecated, except for
TableEnvironment::fromTableSource, which was intended to be deprecated
but was missed by accident).

FLIP-64 [1] already explained why we want to deprecate TableSource &
TableSink from the user-facing interface. In short, these interfaces
should only read & write the physical representation of the table, and
they no longer fit well now that we have introduced logical table fields
such as computed columns and watermarks.

Another reason is that exposing registerTableSource in the Table Env
inverts the whole SQL protocol. A TableSource should be used as a reader
of a table, and it should rely on other metadata held by the framework,
which eventually comes from DDL or ConnectDescriptor. But if we register
a TableSource to the Table Env, we have no choice but to rely on
TableSource::getTableSchema. This makes the design obscure: sometimes a
TableSource should trust the information that comes from the framework,
but sometimes it should also generate its own schema information.
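
To make the ambiguity concrete, here is a minimal sketch of the two paths.
It uses hypothetical stand-in types (PhysicalSource and both static
methods are made up for illustration and are not Flink's real interfaces);
only the name getTableSchema is borrowed from the discussion. The column
names are assumptions as well.

```java
import java.util.List;

public class SchemaAuthoritySketch {

    /** Hypothetical stand-in for a source that only knows its physical fields. */
    public interface PhysicalSource {
        // Plays the role of TableSource::getTableSchema in the discussion.
        List<String> getTableSchema();
    }

    /** Path 1: the source is registered directly, so it is the only schema authority. */
    public static List<String> schemaWhenRegisteredDirectly(PhysicalSource source) {
        return source.getTableSchema();
    }

    /** Path 2: the table is declared via DDL or a descriptor, so the declared schema wins. */
    public static List<String> schemaWhenDeclaredInFramework(
            List<String> declaredSchema, PhysicalSource source) {
        // The source is only a reader here; its own schema is deliberately ignored.
        return declaredSchema;
    }

    public static void main(String[] args) {
        PhysicalSource source = () -> List.of("id", "name");
        // The declared schema adds a logical (computed) column the source cannot know about.
        List<String> declared = List.of("id", "name", "proc_time");

        System.out.println(schemaWhenRegisteredDirectly(source));
        System.out.println(schemaWhenDeclaredInFramework(declared, source));
    }
}
```

With direct registration, the logical column has no place to live, which is
the mismatch described above.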

Furthermore, if the authority over schema information is not clear, it
will make things much more complicated when we want to improve the
usability of the Table API, for example by introducing automatic schema
inference in the near future.

Since this is an API-breaking change, I've also included the user mailing
list to gather more feedback.

Best,
Kurt

[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-64%3A+Support+for+Temporary+Objects+in+Table+module