Regarding the content of a `blink-1.5` branch, is it possible to rebase
the big Blink commit on top of the current master or the last Flink release?
I don't mean a full rebase here, just forking the branch from
current Flink, putting the Blink content into the repository, and
committing it. That would let us see in a diff which classes and lines
have changed and which are still the same. I think this would be much
more helpful than a branch with one big commit that has no common origin.
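To illustrate what such a comparison yields, here is a minimal sketch in Python that classifies the files of two source trees. The function name `classify_tree` and its two directory arguments are hypothetical and exist only for this illustration; with a common Git origin, Git's own diff would give the same information plus line-level detail.

```python
import filecmp
import os


def classify_tree(flink_dir, blink_dir):
    """Sort relative paths into same / changed / only_flink / only_blink.

    Both arguments are hypothetical checkout directories, e.g. a Flink
    release tree and the Blink tree that was forked from it.
    """
    report = {"same": [], "changed": [], "only_flink": [], "only_blink": []}

    def visit(cmp, prefix):
        # Compare contents byte by byte instead of trusting size and mtime.
        same, changed, unreadable = filecmp.cmpfiles(
            cmp.left, cmp.right, cmp.common_files, shallow=False
        )
        report["same"] += [os.path.join(prefix, n) for n in same]
        # Files that could not be compared are conservatively counted as changed.
        report["changed"] += [os.path.join(prefix, n) for n in changed + unreadable]
        # Entries that exist on one side only (a whole directory is one entry).
        report["only_flink"] += [os.path.join(prefix, n) for n in cmp.left_only]
        report["only_blink"] += [os.path.join(prefix, n) for n in cmp.right_only]
        # Recurse into directories that exist in both trees.
        for name, sub in cmp.subdirs.items():
            visit(sub, os.path.join(prefix, name))

    visit(filecmp.dircmp(flink_dir, blink_dir), "")
    for paths in report.values():
        paths.sort()
    return report
```

Calling `classify_tree("flink", "blink")` on two checkouts returns four sorted lists of relative paths. This only works at file granularity; seeing which lines changed is exactly what the common-origin branch would add.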
Thanks,
Timo
On 24.01.19 at 02:54, Becket Qin wrote:
Thanks Stephan,
The plan makes sense to me.
Regarding the docs, it seems better to have a separate versioned website
because the changes are spread all over the place. We can add a
banner to remind users that they are looking at the Blink docs, which are
temporary and will eventually be merged into Flink master. (The banner
would be pretty similar to what users see when they visit the docs of old
Flink versions [1].)
Thanks,
Jiangjie (Becket) Qin
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.5/dev/libs/ml/quickstart.html
On Thu, Jan 24, 2019 at 6:21 AM Shaoxuan Wang <wshaox...@gmail.com> wrote:
Thanks Stephan,
The entire plan looks good to me. WRT the "Docs for Blink", a subsection
should be good enough if we only outline what Blink has changed. However,
we have written detailed introductions to Blink that follow the structure
of the current Flink release documentation (those introductions are
spread across the subsections). Would it make sense to create a separate
Blink document under the documentation section, say blink-1.5
(temporary, not a release)?
Regards,
Shaoxuan
On Wed, Jan 23, 2019 at 10:15 PM Stephan Ewen <se...@apache.org> wrote:
Nice to see this lively discussion.
*--- Branch Versus Repository ---*
Looks like this is converging towards pushing a branch.
How about naming the branch simply "blink-1.5"? That would be in line with
the 1.5 version branch of Flink, which is simply called "release-1.5".
*--- SGA ---*
The SGA (Software Grant Agreement) should either be filed already or be in
the process of being filed.
*--- Offering Jars for Blink ---*
As Chesnay and Timo mentioned, we cannot easily offer a "Release" of Blink
(source or binary), because that would require thoroughly checking
licenses and creating/bundling license files. That is a lot of work, as we
recently experienced again in the Flink master.
What we can do is upload compiled jar files and link to them somewhere in
the Blink docs. We need to add a disclaimer that these are
convenience jars and not an official Apache release. I hope that would
work for the users who are curious to try things out.
*--- Docs for Blink ---*
Do we need a versioned website here? If not, can we simply make this a
subsection of the current Flink snapshot docs?
Next to "Flink Development" and "Internals", we could have a section on
"Blink branch".
I think it is crucial, though, to make it clear that this is temporary and
will eventually be subsumed by the main release, just so that users do not
get confused.
Best,
Stephan
On Wed, Jan 23, 2019 at 12:23 PM Becket Qin <becket....@gmail.com> wrote:
Really excited to see Blink joining the Flink community!
My two cents regarding repo vs. branch: I am +1 for a branch in Flink.
Among many things, what's most important at this point is probably to make
the Blink code available to developers so people can discuss the merge
strategy. Creating a branch is probably one of the fastest ways to do
that. We can always create a separate repo later if necessary.
WRT the doc and jar distribution, it is true that we are going to do
some major refactoring of the code. But I can imagine some curious users
may still want to try out something in Blink, and it would be good if we
can do them a favor. Legally, my hunch is that it is probably OK for
someone to just build the jars and docs and host them somewhere for
convenience. But it should be clear that this is just for convenience and
not an official release from Apache (unless we would like to make it
official).
Thanks,
Jiangjie (Becket) Qin
On Wed, Jan 23, 2019 at 6:48 PM Chesnay Schepler <ches...@apache.org> wrote:
From the ASF side, jar files do not require a vote/release process; this
is at the discretion of the PMC.
However, I doubt that we could even create a source release of Blink at
this time, given that we'd have to vet the code base first.
Even without a source release we could still distribute jars, but we would
not be allowed to advertise them to users, as they do not constitute an
official release.
On 23.01.2019 11:41, Timo Walther wrote:
As far as I know, we will not provide any binaries, only the
source code. JAR files on Apache servers would need an official
voting/release process. Interested users can build Blink themselves
using `mvn clean package`.
@Stephan: Please correct me if I'm wrong.
Regards,
Timo
On 23.01.19 at 11:16, Kurt Young wrote:
Hi Timo,
What about the jar files? Will Blink's jars be uploaded to the Apache
repository? If not, I think it will be very inconvenient for users who
want to try Blink and consult the documentation when they need help.
Best,
Kurt
On Wed, Jan 23, 2019 at 6:09 PM Timo Walther <twal...@apache.org> wrote:
Hi Kurt,
I would not make Blink's documentation visible to users or search
engines via a website. Otherwise this would communicate that Blink is an
official release. I would suggest putting the Blink docs into `/docs`, and
people can build them with `./docs/build.sh -pi` if they are interested.
I would not invest time into setting up a docs infrastructure.
Regards,
Timo
On 23.01.19 at 08:56, Kurt Young wrote:
Thanks @Stephan for this exciting announcement!
From my point of view, I would prefer to use a branch. It makes the
message "Blink is part of Flink" more straightforward and clear.
Besides the location of the Blink code, there are some other questions,
like what version we should use and where we put Blink's documentation.
Currently, we use "1.5.1-blink-r0" as Blink's version, since Blink was
forked from Flink 1.5.1. We also added some docs to Blink, just as Flink
did. Can Blink host all of its docs on a website like
"https://ci.apache.org/projects/flink/flink-docs-release-1.7/",
with the URL changed to something like
https://ci.apache.org/projects/flink/flink-docs-blink-r0/ ?
Best,
Kurt
On Wed, Jan 23, 2019 at 10:55 AM Hequn Cheng <chenghe...@gmail.com> wrote:
Hi all,
@Stephan Thanks a lot for driving these efforts. I think a lot of people
are already waiting for this.
+1 for opening the Blink source code.
Either a separate repository or a special branch is OK for me.
Hopefully, this will not last too long.
Best, Hequn
On Tue, Jan 22, 2019 at 11:35 PM Jark Wu <imj...@gmail.com> wrote:
Great news! Looking forward to the new wave of developments.
If Blink needs continuous updates, bug fixes, and versioned releases,
maybe a separate repository is a better idea.
Best,
Jark
On Tue, 22 Jan 2019 at 18:29, Dominik Wosiński <wos...@gmail.com> wrote:
Hey!
I also think that creating a separate branch for Blink in the Flink repo
is a better idea than creating a fork, as IMHO it will allow merging
changes more easily.
Best Regards,
Dom.
On Tue, 22 Jan 2019 at 10:09, Ufuk Celebi <u...@apache.org> wrote:
Hey Stephan and others,
thanks for the summary. I'm very excited about the outlined
improvements. :-)
Separate branch vs. fork: I'm fine with either of the suggestions.
Depending on the expected strategy for merging the changes, the expected
number of additional changes, etc., either one or the other approach
might be better suited.
– Ufuk
On Tue, Jan 22, 2019 at 9:20 AM Kurt Young <ykt...@gmail.com> wrote:
Hi Driesprong,
Glad to hear that you're interested in Blink's code. Actually, Blink
has only one branch of its own, so either a separate repo or a branch in
Flink works for sharing Blink's code.
Best,
Kurt
On Tue, Jan 22, 2019 at 2:30 PM Driesprong, Fokko <fo...@driesprong.frl> wrote:
Great news Stephan!
Why not make the code available by having a fork of Flink on Alibaba's
GitHub account? This would allow us to do easy diffs in the GitHub UI and
create PRs of cherry-picked commits if needed. I can imagine that the
Blink codebase has a lot of branches of its own, so just pushing a couple
of branches to the main Flink repo is not ideal. Looking forward to it!
Cheers, Fokko
On Tue, 22 Jan 2019 at 03:48, Shaoxuan Wang <wshaox...@gmail.com> wrote:
Big +1 to contributing the Blink codebase directly into the Apache Flink
project.
Looking forward to the new journey.
Regards,
Shaoxuan
On Tue, Jan 22, 2019 at 3:52 AM Xiaowei Jiang <xiaow...@gmail.com> wrote:
Thanks Stephan! We are hoping to make the process as non-disruptive as
possible to the Flink community. Making the Blink codebase public is the
first step that hopefully facilitates further discussions.
Xiaowei
On Monday, January 21, 2019, 11:46:28 AM PST, Stephan Ewen <se...@apache.org> wrote:
Dear Flink Community!
Some of you may have heard it already from announcements or from a Flink
Forward talk:
Alibaba has decided to open source its in-house improvements to Flink,
called Blink!
First of all, big thanks to the team that developed these improvements and
made this contribution possible!
Blink has some very exciting enhancements, most prominently on the Table
API/SQL side and the unified execution of these programs. For batch
(bounded) data, the SQL execution has full TPC-DS coverage (which is a big
deal), and the execution is more than 10x faster than the current SQL
runtime in Flink. Blink has also added support for catalogs, improved the
failover speed of batch queries and the resource management. It also
makes some good steps in the direction of more deeply unifying the batch
and streaming execution.
The proposal is to merge Blink's enhancements into Flink, to give Flink's
SQL/Table API and execution a big boost in usability and performance.
Just to avoid any confusion: This is not a suggested change of focus to
batch processing, nor would this break with any of the streaming
architecture and vision of Flink.
This contribution follows very much the principle of "batch is a special
case of streaming".
As a special case, batch makes special optimizations possible. In its
current state, Flink does not exploit many of these optimizations. This
contribution adds exactly these optimizations and makes the streaming
model of Flink applicable to harder batch use cases.
Assuming that the community is excited about this as well, and in favor of
these enhancements to Flink's capabilities, below are some thoughts on how
this contribution and integration could work.
--- Making the code available ---
At the moment, the Blink code is in the form of a big Flink fork (rather
than isolated patches on top of Flink), so the integration is
unfortunately not as easy as merging a few patches or pull requests.
To support a non-disruptive merge of such a big contribution, I believe it
makes sense to make the code of the fork available in the Flink project
first.
From there on, we can start to work on the details for merging the
enhancements, including the refactoring of the necessary parts in the
Flink master and the Blink code to make a merge possible without
repeatedly breaking compatibility.
The first question is: where do we put the code of the Blink fork during
the merging procedure?
My first thought was to temporarily add a repository (like
"flink-blink-staging"), but we could also put it into a special branch in
the main Flink repository.
I will start a separate thread to discuss a possible strategy for handling
and merging such a big contribution.
Best,
Stephan