Hi Wangsan,
yes, the Hive integration is limited so far. However, we provide an
external catalog feature [0] that allows you to implement custom logic
for retrieving Hive tables. I don't think it is possible to express all
of your operations in Flink's SQL API right now. For now, you need to
combine the DataStream and SQL APIs. E.g. the Hive lookups should happen
in an asynchronous fashion to reduce latency [1]. As far as I know, JDBC
does not easily allow retrieving records in a streaming fashion; that's
why there is only a TableSink but no TableSource. Stream joining is
limited so far: we will support window joins in the upcoming release and
will likely provide full history joins in 1.5. The Table & SQL API is
still a young API, but development happens quickly. If you are
interested in contributing, feel free to write to the dev@ mailing list.
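To illustrate the asynchronous-lookup idea from [1], here is a minimal sketch using only the JDK (no Flink dependencies): the blocking lookup (which in practice would be a JDBC or Hive query) runs on a separate thread pool and completes a future, so the calling thread is never blocked. In Flink you would put this logic inside an AsyncFunction and wire it up with AsyncDataStream; all names here (lookupAttributes, the pool size, etc.) are illustrative, not part of any real API.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class AsyncLookupSketch {

    // Simulated blocking lookup; in practice this would issue something
    // like "SELECT attrs FROM dim_table WHERE id = ?" over JDBC.
    static String lookupAttributes(String key) {
        return "attrs-for-" + key;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);

        // The essence of an async enrichment step: run the blocking call
        // on a pool and combine the result with the stream record once
        // the future completes, instead of blocking the stream thread.
        CompletableFuture<String> enriched =
            CompletableFuture
                .supplyAsync(() -> lookupAttributes("42"), pool)
                .thenApply(attrs -> "42" + "|" + attrs);

        System.out.println(enriched.get());  // 42|attrs-for-42
        pool.shutdown();
    }
}
```

The same shape, with the future's completion handed to Flink's collector instead of get(), is what an AsyncFunction implementation would look like.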
Regards,
Timo
[0]
https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/table/common.html#register-an-external-catalog
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/stream/asyncio.html
On 11/20/17 at 1:27 PM, wangsan wrote:
Hi all,
I am currently learning the Table API and SQL in Flink. I noticed that Flink does
not support Hive tables as a table source, and even a JDBC table source is not
provided. There are cases where we do need to join a stream table with static Hive
or other database tables to get more specific attributes, so how can I implement
this functionality? Do I need to implement my own DataSet connectors that load
data from external tables using JDBC and register the DataSet as a table, or
should I provide an external catalog?
Thanks,
wangsan