Hi Zhenghua, After removing TableSource::getTableSchema, during optimization, I could imagine the schema information might come from relational nodes such as TableScan.
Best, Kurt On Wed, Feb 5, 2020 at 8:24 PM Kurt Young <ykt...@gmail.com> wrote: > Hi Jingsong, > > Yes current TableFactory is not ideal for users to use either. I think we > should > also spend some time in 1.11 to improve the usability of TableEnvironment > when > users trying to read or write something. Automatic scheme inference would > be > one of them. Other from this, we also support convert a DataStream to > Table, which > can serve some flexible requirements to read or write data. > > Best, > Kurt > > > On Wed, Feb 5, 2020 at 7:29 PM Zhenghua Gao <doc...@gmail.com> wrote: > >> +1 to remove these methods. >> >> One concern about invocations of TableSource::getTableSchema: >> By removing such methods, we can stop calling TableSource::getTableSchema >> in some place(such >> as BatchTableEnvImpl/TableEnvironmentImpl#validateTableSource, >> ConnectorCatalogTable, TableSourceQueryOperation). >> >> But in other place we need field types and names of the table source(such >> as >> BatchExecLookupJoinRule/StreamExecLookupJoinRule, >> PushProjectIntoTableSourceScanRule, >> CommonLookupJoin). So how should we deal with this? >> >> *Best Regards,* >> *Zhenghua Gao* >> >> >> On Wed, Feb 5, 2020 at 2:36 PM Kurt Young <ykt...@gmail.com> wrote: >> >> > Hi all, >> > >> > I'd like to bring up a discussion about removing registration of >> > TableSource and >> > TableSink in TableEnvironment as well as in ConnectTableDescriptor. The >> > affected >> > method would be: >> > >> > TableEnvironment::registerTableSource >> > TableEnvironment::fromTableSource >> > TableEnvironment::registerTableSink >> > ConnectTableDescriptor::registerTableSource >> > ConnectTableDescriptor::registerTableSink >> > ConnectTableDescriptor::registerTableSourceAndSink >> > >> > (Most of them are already deprecated, except for >> > TableEnvironment::fromTableSource, >> > which was intended to deprecate but missed by accident). >> > >> > FLIP-64 [1] already explained why we want to deprecate TableSource & >> > TableSink from >> > user's interface. In a short word, these interfaces should only read & >> > write the physical >> > representation of the table, and they are not fitting well after we >> already >> > introduced some >> > logical table fields such as computed column, watermarks. >> > >> > Another reason is the exposure of registerTableSource in Table Env just >> > make the whole >> > SQL protocol opposite. TableSource should be used as a reader of table, >> it >> > should rely on >> > other metadata information held by framework, which eventually comes >> from >> > DDL or >> > ConnectDescriptor. But if we register a TableSource to Table Env, we >> have >> > no choice but >> > have to rely on TableSource::getTableSchema. It will make the design >> > obscure, sometimes >> > TableSource should trust the information comes from framework, but >> > sometimes it should >> > also generate its own schema information. >> > >> > Furthermore, if the authority about schema information is not clear, it >> > will make things much >> > more complicated if we want to improve the table api usability such as >> > introducing automatic >> > schema inference in the near future. >> > >> > Since this is an API break change, I've also included user mailing list >> to >> > gather more feedbacks. >> > >> > Best, >> > Kurt >> > >> > [1] >> > >> > >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-64%3A+Support+for+Temporary+Objects+in+Table+module >> > >> >