Thanks for driving these efforts, Stephan! Great news that the Blink code base will be available for everyone soon. I already got access to it, and the added functionality and improved architecture are impressive. There will be nice additions to Flink.

I guess the Blink code base will be continuously updated while the Flink community merges chunks of it, right? If yes, I would also be in favor of a separate repository similar to flink-shaded.

Regards,
Timo


On 22.01.19 at 09:20, Kurt Young wrote:
Hi Driesprong,

Glad to hear that you're interested in Blink's code. Actually, Blink
has only one branch of its own, so either a separate repo or a branch in
the Flink repo would work for sharing Blink's code.

Best,
Kurt


On Tue, Jan 22, 2019 at 2:30 PM Driesprong, Fokko <fo...@driesprong.frl>
wrote:

Great news Stephan!

Why not make the code available by having a fork of Flink on Alibaba's
GitHub account? This will allow us to do easy diffs in the GitHub UI and
create PRs of cherry-picked commits if needed. I can imagine that the
Blink codebase has a lot of branches of its own, so just pushing a couple
of branches to the main Flink repo is not ideal. Looking forward to it!

Cheers, Fokko





On Tue, 22 Jan 2019 at 03:48, Shaoxuan Wang <wshaox...@gmail.com> wrote:

Big +1 to contributing the Blink codebase directly to the Apache Flink
project.
Looking forward to the new journey.

Regards,
Shaoxuan

On Tue, Jan 22, 2019 at 3:52 AM Xiaowei Jiang <xiaow...@gmail.com>
wrote:
  Thanks Stephan! We are hoping to make the process as non-disruptive as
possible to the Flink community. Making the Blink codebase public is the
first step that hopefully facilitates further discussions.
Xiaowei

     On Monday, January 21, 2019, 11:46:28 AM PST, Stephan Ewen <se...@apache.org> wrote:

  Dear Flink Community!

Some of you may have heard it already from announcements or from a Flink
Forward talk: Alibaba has decided to open source its in-house improvements
to Flink, called Blink!
First of all, big thanks to the team that developed these improvements and
made this contribution possible!

Blink has some very exciting enhancements, most prominently on the Table
API/SQL side and the unified execution of these programs. For batch
(bounded) data, the SQL execution has full TPC-DS coverage (which is a big
deal), and the execution is more than 10x faster than the current SQL
runtime in Flink. Blink has also added support for catalogs, improved the
failover speed of batch queries and the resource management. It also makes
some good steps in the direction of more deeply unifying the batch and
streaming execution.

The proposal is to merge Blink's enhancements into Flink, to give Flink's
SQL/Table API and execution a big boost in usability and performance.

Just to avoid any confusion: This is not a suggested change of focus to
batch processing, nor would this break with any of the streaming
architecture and vision of Flink.
This contribution follows very much the principle of "batch is a special
case of streaming".
As a special case, batch makes special optimizations possible. In its
current state, Flink does not exploit many of these optimizations. This
contribution adds exactly these optimizations and makes the streaming model
of Flink applicable to harder batch use cases.

Assuming that the community is excited about this as well, and in favor of
these enhancements to Flink's capabilities, below are some thoughts on how
this contribution and integration could work.

--- Making the code available ---

At the moment, the Blink code is in the form of a big Flink fork (rather
than isolated patches on top of Flink), so the integration is unfortunately
not as easy as merging a few patches or pull requests.

To support a non-disruptive merge of such a big contribution, I believe it
makes sense to make the code of the fork available in the Flink project
first.
From there on, we can start to work on the details for merging the
enhancements, including the refactoring of the necessary parts in the Flink
master and the Blink code to make a merge possible without repeatedly
breaking compatibility.

The first question is where do we put the code of the Blink fork during the
merging procedure?
My first thought was to temporarily add a repository (like
"flink-blink-staging"), but we could also put it into a special branch in
the main Flink repository.


I will start a separate thread to discuss a possible strategy to handle and
merge such a big contribution.

Best,
Stephan

