Re: [DISCUSS] Project build time and possible restructuring

Robert Metzger Wed, 15 Mar 2017 08:24:31 -0700

Thank you. Running on AWS is a good idea!
Let me know if you (or anybody else) want to help me with the
infrastructure work! Any help is much appreciated (as I've said before, I
don't really have time for doing this, but it has to be done :) )


I'm against creating two new repositories. I fear that this introduces too
much complexity and too many repositories.
"flink" and "flink-libraries" are hopefully enough to get the build time
significantly down.
We can also consider putting the connectors into the "flink-libraries" repo
if we need to further reduce the build time.

We should probably move "flink-table" out of "flink-libraries" if we want
to keep "flink-table" in the main repo. (This would eliminate the
"flink-libraries" module from main.)

Also, I agree that "flink-statebackend-rocksdb" is not correctly placed in
contrib anymore.


On Wed, Mar 15, 2017 at 4:07 PM, Greg Hogan <[email protected]> wrote:

> Robert, appreciate your kickstarting this task.
>
> We should compare the verification time with and without the listed
> modules. I’ll try to run this by tomorrow on AWS and on Travis.
>
> Should we maintain separate repos for flink-contrib and flink-libraries?
> Are you intending that we move flink-table out of flink-libraries (and
> perhaps flink-statebackend-rocksdb out of flink-contrib)?
>
> Greg
>
>
> > On Mar 15, 2017, at 9:55 AM, Robert Metzger <[email protected]> wrote:
> >
> > Thank you for looking into this Till.
> >
> > I think we then have to split the repositories.
> > My main motivation for doing this is that it seems to be the only
> > feasible way of scaling the community to allow more committers to work
> > on the libraries.
> >
> > I'll take care of getting things started.
> >
> > As the next steps I propose to:
> > 1. Ask INFRA to rename
> > https://git-wip-us.apache.org/repos/asf?p=flink-connectors.git;a=summary
> > to "flink-libraries"
> > 2. Ask INFRA to set up GitHub and Travis integration for
> > "flink-libraries"
> > 3. Put the code of "flink-ml", "flink-gelly", "flink-python",
> > "flink-cep", "flink-scala-shell", "flink-storm" into the new repository.
> > (I decided against moving flink-contrib there because RocksDB is in the
> > contrib module. For flink-table I'm undecided, but I kept it in the main
> > repo because it's probably going to interact more with the core code in
> > the future.)
> > I'll try to preserve the history of those modules when splitting them
> > into the new repo.
> > 4. I'll close all pull requests against those modules in the main repo.
> > 5. I'll set up a minimal documentation page for the library repository,
> > similar to the main documentation.
> > 6. I'll update the documentation build process to build both sets of
> > documentation & link them to each other
> > 7. I'll update the nightly deployment process to include both
> > repositories
> > 8. I'll update the release script to create the Flink release out of both
> > repositories. In order to put the libraries into the opt/ dir of the
> > release, I'll need to change the build of "flink-dist" so that it first
> > builds flink core, then the libraries and then the core again with the
> > libraries as an additional dependency.
> >
> > The main question for the community is: do you agree with point 3? Would
> > you like to include more or fewer modules?
> >
> > I'll start with 1. and 2. tomorrow morning.
> >
> >
> >
> > On Wed, Mar 15, 2017 at 1:48 PM, Till Rohrmann <[email protected]>
> > wrote:
> >
> >> In theory we could have a merging bot which solves the problem of the
> >> "commit window". Once the PR passes all tests and has enough +1s, the
> >> bot could do the merging and, thus, it effectively linearizes the merge
> >> process.
> >>
> >> I think the second point is actually a disadvantage because there is
> >> not such an immediate incentive/pressure to fix the broken module if it
> >> lives in a separate repository. Furthermore, breaking API changes in the
> >> core will most likely go unnoticed for some time in other modules which
> >> are not developed so actively. In the worst case these things will only
> >> be noticed when we try to make a release.
> >>
> >> But I also agree that we are not Google and we don't have the capacity
> >> to maintain such a smooth build process that we can keep all the code in
> >> a single repository.
> >>
> >> I looked a bit into Gradle and as far as I can tell it offers some nice
> >> features wrt incrementally building projects. This would be beneficial
> >> for local development but it would not solve our build time problems on
> >> Travis. Gradle intends to introduce a task result cache which allows
> >> reusing results across builds. This could help when building on Travis;
> >> however, it is not yet fully implemented. Moreover, migrating from Maven
> >> to Gradle won't come for free (there's simply no free lunch out there)
> >> and we might risk introducing new bugs. Therefore, I would vote to split
> >> the repository in order to mitigate our current problems with Travis and
> >> the build time in general. Whether to use a different build system or
> >> not can then be discussed as an orthogonal question.
> >>
> >> Cheers,
> >> Till
> >>
> >> On Tue, Mar 14, 2017 at 8:05 PM, Stephan Ewen <[email protected]> wrote:
> >>
> >>> Some other thoughts on how a repository split would help. I am not sure
> >>> about all of them, so please comment:
> >>>
> >>>  - There is less competition for a "commit window". It happens a lot
> >>> already that you run all tests and want to commit, but there was a
> >>> commit in the meantime. You rebase, need to re-test, and again there is
> >>> a commit in the meantime.
> >>>    For a "linear" commit history, this may become a bottleneck
> >>> eventually as well.
> >>>
> >>>  - There is less risk of a broken master. If one repository/module
> >>> breaks its master, the others can still continue.
> >>>
> >>> Stephan
> >>>
> >>>
> >>> On Fri, Mar 10, 2017 at 12:20 PM, Till Rohrmann <[email protected]>
> >>> wrote:
> >>>
> >>>> Thanks for all your input. In order to wrap the discussion up I'd like
> >>>> to summarize the mentioned points:
> >>>>
> >>>> The problem of increasing build times and complexity of the project
> >>>> has been acknowledged. Ideally we would have everything in one
> >>>> repository using an incremental build tool. Since Maven does not
> >>>> properly support this we would have to switch our build tool to
> >>>> something like Gradle, for example.
> >>>>
> >>>> Another option is introducing build profiles for different sets of
> >>>> modules as well as separating integration and unit tests. The third
> >>>> alternative would be creating sub-projects with their own
> >>>> repositories. I actually think that these two proposals are not
> >>>> necessarily exclusive and it would also make sense to have a
> >>>> separation between unit and integration tests if we split the
> >>>> repository.
> >>>>
> >>>> The overall consensus seems to be that we don't want to split the
> >>>> community and want to keep everything under the same umbrella. I think
> >>>> this is the right way to go, because otherwise some parts of the
> >>>> project could become second class citizens. Given that and that we
> >>>> continue using Maven, I still think that creating sub-projects for the
> >>>> libraries, for example, could be beneficial. A split could reduce the
> >>>> project's complexity and make it potentially easier for libraries to
> >>>> get actively developed. The main concern is setting up the build
> >>>> infrastructure to aggregate docs from multiple repositories and making
> >>>> them publicly available.
> >>>>
> >>>> Since I started this thread and I would really like to see Flink's ML
> >>>> library being revived again, I'd volunteer to first investigate
> >>>> whether establishing a proper incremental build for Flink is doable.
> >>>> If that should not be possible, I will look into splitting the
> >>>> repository, first only for the libraries. I'll share my results with
> >>>> the community once I'm done with the investigation.
> >>>>
> >>>> Cheers,
> >>>> Till
> >>>>
> >>>> On Fri, Feb 24, 2017 at 3:50 PM, Robert Metzger <[email protected]>
> >>>> wrote:
> >>>>
> >>>>> @Jin Mingjian: You cannot use the paid Travis version for open
> >>>>> source projects. It only works for private repositories (at least
> >>>>> that was the case back when we asked them about it).
> >>>>>
> >>>>> @Stephan: I don't think that incremental builds will be available
> >>>>> with Maven anytime soon.
> >>>>>
> >>>>> I agree that we need to fix the build time issue on Travis. I've
> >>>>> recently pushed a commit to use three test groups instead of two.
> >>>>> But I don't think that this is a feasible long-term solution.
> >>>>>
> >>>>> If this discussion is only about reducing the build and test time,
> >>>>> introducing build profiles for different components as Aljoscha
> >>>>> suggested would solve the problem Till mentioned.
> >>>>> Also, if we decide that Travis is not a good tool anymore for the
> >>>>> testing, I guess we can find a different solution. There are now
> >>>>> competitors to Travis that might be willing to offer a paid plan for
> >>>>> an open source project, or we set up our own infra on a server
> >>>>> sponsored by one of the contributing companies.
> >>>>> If we want to solve "community issues" with the change as well, then
> >>>>> I think it's worth the effort of splitting up Flink into different
> >>>>> repositories.
> >>>>>
> >>>>> Splitting up repositories is not a trivial task in my opinion. As
> >>>>> others have mentioned before, we need to consider the following
> >>>>> things:
> >>>>> - How are we going to build the documentation? Ideally every repo
> >>>>> should contain its docs, so we would need to pull them together when
> >>>>> building the main docs.
> >>>>> - How do we organize the dependencies? If we have a library
> >>>>> repository depend on snapshot Flink versions, we need to make sure
> >>>>> that the snapshot deployment always works. This also means that
> >>>>> people working on a library repository will pull from snapshot OR
> >>>>> need to build first locally.
> >>>>> - We need to update the release scripts
> >>>>>
> >>>>> If we commit to doing these changes, we need to assign at least one
> >>>>> committer (yes, in this case we need somebody who can commit, for
> >>>>> example for updating the buildbot stuff) who volunteers to do the
> >>>>> change.
> >>>>> I've done a lot of infrastructure work in the past, but I'm currently
> >>>>> pretty booked with many other things, so I don't realistically see
> >>>>> myself doing that. Max, who used to work on these things, is taking
> >>>>> some time off.
> >>>>> I think we need, best case 3 days for the change, worst case 5 days.
> >>>>> The problem is that there are no "unit tests" for the infra stuff, so
> >>>>> many things are "trial and error" (like Apache's buildbot, our
> >>>>> release scripts, the doc scripts, Maven stuff, nightly builds).
> >>>>>
> >>>>>
> >>>>>
> >>>>> On Thu, Feb 23, 2017 at 1:33 PM, Stephan Ewen <[email protected]>
> >>>>> wrote:
> >>>>>
> >>>>>> If we can get incremental builds to work, that would actually be the
> >>>>>> preferred solution in my opinion.
> >>>>>>
> >>>>>> Many companies have invested heavily in making a "single repository"
> >>>>>> code base work, because it has the advantage of not having to
> >>>>>> update/publish several repositories first.
> >>>>>> However, the strong prerequisite for that is an incremental build
> >>>>>> system that builds only (fine grained) what it has to build. I am
> >>>>>> not sure how we could make that work with Maven and Travis...
> >>>>>>
> >>>>>> On Wed, Feb 22, 2017 at 10:42 PM, Greg Hogan <[email protected]>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> An additional option for reducing time to build and test is
> >>>>>>> parallel execution. This would help users more than on TravisCI
> >>>>>>> since we're generally running on multi-core machines rather than VM
> >>>>>>> slices.
> >>>>>>>
> >>>>>>> Is the idea that each user would only check out the modules that he
> >>>>>>> or she is developing with? For example, if a developer is not
> >>>>>>> working on flink-mesos or flink-yarn then the "flink-deploy" module
> >>>>>>> would not be cloned to their filesystem?
> >>>>>>>
> >>>>>>> We can run a TravisCI nightly build on each repo to validate
> >>>>>>> against API changes.
> >>>>>>>
> >>>>>>> Greg
> >>>>>>>
> >>>>>>> On Wed, Feb 22, 2017 at 12:24 PM, Fabian Hueske <[email protected]>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>> Hi everybody,
> >>>>>>>>
> >>>>>>>> I think this should be a discussion about the benefits and
> >>>>>>>> drawbacks of separating the code into distinct repositories from a
> >>>>>>>> development point of view.
> >>>>>>>> So I agree with Stephan that we should not divide the community by
> >>>>>>>> creating separate groups of committers.
> >>>>>>>> Also the discussion about independent releases is not strictly
> >>>>>>>> related to the decision, IMO.
> >>>>>>>>
> >>>>>>>> I see a few pros and cons for splitting the code base into
> >>>>>>>> separate repositories which (I think) haven't been mentioned
> >>>>>>>> before:
> >>>>>>>> pros:
> >>>>>>>> - IDE setup will be leaner. It is not necessary to compile the
> >>>>>>>> whole code base to run a test after switching a branch.
> >>>>>>>> cons:
> >>>>>>>> - developing library features that require changes in the core /
> >>>>>>>> APIs becomes more time consuming due to back-and-forth between
> >>>>>>>> code bases. However, I think this is not very often the case.
> >>>>>>>>
> >>>>>>>> Aljoscha has good points as well. Many of the build issues could
> >>>>>>>> be solved by different build profiles and configurations.
> >>>>>>>>
> >>>>>>>> Best, Fabian
> >>>>>>>>
> >>>>>>>> 2017-02-22 14:59 GMT+01:00 Gábor Hermann <[email protected]>:
> >>>>>>>>
> >>>>>>>>> @Stephan:
> >>>>>>>>>
> >>>>>>>>> Although I tried to raise some issues about splitting committers,
> >>>>>>>>> I'm still strongly in favor of some kind of restructuring. We
> >>>>>>>>> just have to be conscious about the disadvantages.
> >>>>>>>>>
> >>>>>>>>> Not splitting the committers could leave the libraries in the
> >>>>>>>>> same stalling status, described by Till. Of course, dedicating
> >>>>>>>>> current committers as shepherds of the libraries could easily
> >>>>>>>>> resolve the issue. But that requires time from current
> >>>>>>>>> committers. It seems like trade-offs between code quality, speed
> >>>>>>>>> of development, and committer efforts.
> >>>>>>>>>
> >>>>>>>>> From what I see in the discussion about ML, there are many people
> >>>>>>>>> willing to contribute as well as production use-cases. This means
> >>>>>>>>> we could and should move forward. However, the development speed
> >>>>>>>>> is significantly slowed down by stalling PRs. The proposal for
> >>>>>>>>> contributors helping the review process did not really work out
> >>>>>>>>> so far. In my opinion, either code quality (by more easily
> >>>>>>>>> accepting new committers) or some committer time
> >>>>>>>>> (reviewing/merging) should be sacrificed to move forward. As Till
> >>>>>>>>> has indicated, it would be shameful if we let this contribution
> >>>>>>>>> effort die.
> >>>>>>>>>
> >>>>>>>>> Cheers,
> >>>>>>>>> Gabor
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>
>
>
