Hi Fabian,

thank you for summarizing the most important issues. I already worked on FLINK-3152 / FLINK-3580 but stopped in favor of FLINK-3859. I will open a PR for FLINK-3859 very soon; I just need to rebase it onto the latest validation layer and do some testing. Unfortunately, I'm on vacation next week. I would like to take care of the above issues. I can also help coordinate over the next weeks. What are the plans for the 1.1.0 release so far?

Regards,
Timo

On 20.05.2016 14:49, Yijie Shen wrote:
Hi Fabian,

The priorities seem reasonable to me. I skimmed through your TPC-H branch and
found that data types and related functions would be quite important to enable
most of the TPC-H queries.

I will try out your TPC-H branch and pick up some of the issues you filed over
the last two days for the 1.1.0 release.

Hope this helps :)

Best,
Yijie

On Thu, May 19, 2016 at 11:59 PM, Fabian Hueske <fhue...@gmail.com> wrote:

Hi everybody,

I'd like to start a discussion about blocking issues and outstanding
features of the Table API and SQL for the 1.1.0 release. As you probably
know, the Table API was completely reworked and ported to Apache Calcite.
Moreover, we added initial support for SQL on batch and streaming tables.

We have come quite far, but there are still a couple of issues that need to
be resolved before we can release a new version of Flink. I would like to
start collecting and prioritizing issues so that we can work towards a
feature set that we would like to include in the next release. In order
to prepare this list, I tried to execute the TPC-H query set using the
currently supported SQL feature set. Only one (Q18) out of the 22 queries
could be executed. The others failed due to unsupported features or bugs.

In the following, I list the issues, ordered by priority, that I think need to
be resolved for the release.

     - FLINK-3728: Detect unsupported operators and improve error messages.
While we can effectively prevent unsupported operations in the Table API,
this is not easily possible with SQL queries. At the moment, unsupported
operations are either not detected and translated into invalid plans, or
they throw hard-to-understand exceptions.
     - FLINK-3859: Add support for DECIMAL. Without this feature, it is not
possible to use floating-point literals in SQL queries.
     - FLINK-3152 / FLINK-3580: Add support for date types and date
functions.
     - FLINK-3586: Prevent AVG(LONG) overflow by using BigInteger as
intermediate data type.
     - FLINK-2971: Add support for outer joins (a PR for this issue exists:
#1981).
     - FLINK-3936: Add MIN / MAX aggregation functions for BOOLEAN types.
     - FLINK-3916: Add support for generic types which are handled by
     - FLINK-3723: This is a proposal to split the Table API's select()
method into select() for projections and aggregate() for aggregations. At
the moment, both are handled by select() (as in SQL) and separated
internally by the Table API. We should decide for Flink 1.1.0 whether to
implement the proposal or not.
     - FLINK-3871 / FLINK-3873: Add TableSource and TableSink for Avro-encoded
Kafka sources.
     - FLINK-3872 / FLINK-3874: Add TableSource and TableSink for JSON-encoded
Kafka sources.
     - More TableSource / TableSinks
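
To illustrate the motivation behind FLINK-3586: if AVG over a LONG column keeps its running sum in a plain long, the sum can wrap around even though each input value and the final average fit into a long. A BigInteger intermediate stays exact. A minimal standalone sketch (plain Java, not Flink code; the class and variable names are my own):

```java
import java.math.BigInteger;

public class AvgOverflowDemo {
    public static void main(String[] args) {
        // Two values close to Long.MAX_VALUE; their sum does not fit into a long.
        long a = Long.MAX_VALUE - 1;
        long b = Long.MAX_VALUE - 1;

        // Naive approach: the long sum wraps around, so the average is negative.
        long naiveAvg = (a + b) / 2;

        // Safe approach: accumulate as BigInteger, divide, then narrow back to long.
        BigInteger sum = BigInteger.valueOf(a).add(BigInteger.valueOf(b));
        long safeAvg = sum.divide(BigInteger.valueOf(2)).longValueExact();

        System.out.println("naive: " + naiveAvg); // -2 (wrapped around)
        System.out.println("safe:  " + safeAvg);  // 9223372036854775806
    }
}
```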
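
Regarding FLINK-3723, a rough sketch of the difference (the aggregate() call is the proposed API, which does not exist yet; the expressions are illustrative only):

```
// today: projection and aggregation are both expressed via select()
table.groupBy("a").select("a, b.sum");

// proposed: aggregate() for aggregations, select() only for projection
table.groupBy("a").aggregate("b.sum as bSum").select("a, bSum");
```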

Please review this list, add issues that you think should go in as well,
and discuss the priorities of the features.
Also, if you would like to get involved with improving the Table API / SQL,
drop a mail to the mailing list or add a comment to a JIRA issue.

I think it would be good if somebody coordinated these efforts. I
would be happy to do it. However, I will leave in one month for a
two-month parental leave and I don't know how much I can contribute during
that time. So if somebody would like to step up and help coordinate,
please let me and the others know.

Cheers, Fabian



--
Freundliche Grüße / Kind Regards

Timo Walther

Follow me: @twalthr
https://www.linkedin.com/in/twalthr
