Thank you for looking into the build times.

I didn't know that the build time situation is so bad. Even with yarn,
mesos, connectors and libraries removed, we are still running into the
build timeout :(

Aljoscha told me that the Beam community is using Jenkins for running the
tests, and they are planning to completely move away from Travis. I wonder
whether we should do the same, as having our own Jenkins servers would
allow us to run tests for more than 50 minutes.

I agree with Stephan that we should keep the yarn and mesos tests in the
core for stability / testing quality purposes.


On Mon, Mar 20, 2017 at 11:27 AM, Stephan Ewen <se...@apache.org> wrote:

> @Greg
>
> I am personally in favor of splitting "connectors" and "contrib" out as
> well. I know that @rmetzger has some reservations about the connectors, but
> we may be able to convince him.
>
> For the cluster tests (yarn / mesos) - in the past there were many cases
> where these tests caught cases that other tests did not, because they are
> the only tests that actually use the "flink-dist.jar" and thus discover
> many dependency and configuration issues. For that reason, my feeling would
> be that they are valuable in the core repository.
>
> I would actually suggest to do only the library split initially, to see
> what the challenges are in setting up the multi-repo build and release
> tooling. Once we gathered experience there, we can probably easily see what
> else we can split out.
>
> Stephan
>
>
> On Fri, Mar 17, 2017 at 8:37 PM, Greg Hogan <c...@greghogan.com> wrote:
>
> > I’d like to use this refactoring opportunity to unsplit the Travis tests.
> > With 51 builds queued up for the weekend (some of which may fail or have
> > been force pushed) we are at the limit of the number of contributions we
> > can process. Fixing this requires 1) splitting the project, 2)
> > investigating speedups for long-running tests, and 3) staying cognizant
> > of test performance when accepting new code.
> >
> > I’d like to add one to Stephan’s list of module groups. I like that the
> > modules are generic (“libraries”) so that no one module is alone and
> > independent.
> >
> > Flink has three “libraries”: cep, ml, and gelly.
> >
> > “connectors” is a hotspot due to the long-running Kafka tests (and
> > connectors for three Kafka versions).
> >
> > Both flink-storm and flink-python have a modest number of tests
> > and could live with the miscellaneous modules in “contrib”.
> >
> > The YARN tests are long-running and problematic (I am unable to
> > successfully run these locally). A “cluster” module could host
> > flink-mesos, flink-yarn, and flink-yarn-tests.
> >
> > That gets us close to running all tests in a single Travis build.
> >   https://travis-ci.org/greghogan/flink/builds/212122590
> >
> > I also tested (https://github.com/greghogan/flink/commits/core_build)
> > with a maven parallelism of 2 and 4, with the latter a 6.4% drop in
> > build time.
> >   https://travis-ci.org/greghogan/flink/builds/212137659
> >   https://travis-ci.org/greghogan/flink/builds/212154470
> >
> > We can run Travis CI builds nightly to guard against breaking changes.
> >
> > I also wanted to get an idea of how disruptive it would be to developers
> > to divide the project into multiple git repos. I wrote a simple python
> > script and configured it with the module partitions listed above. The
> > usage string from the top of the file lists commits with files from
> > multiple partitions as well as the modified files.
> >   https://gist.github.com/greghogan/f38a8efe6b6dd5a162a6b43335ac4897
> >
> > Accounting for the merging of the batch and streaming connector modules,
> > and assuming that the project structure has not changed much over the
> > past 15 months, for the following date ranges the listed number of
> > commits would have been split across repositories.
> >
> > since "2017-01-01"
> > 36 of 571 commits were mixed
> >
> > since "2016-07-01"
> > 155 of 1607 commits were mixed
> >
> > since "2016-01-01"
> > 272 of 2561 commits were mixed
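
To make the counting above concrete, here is a minimal sketch of how a commit could be classified as "mixed". It is not the script from the gist: the prefix-to-partition mapping and the names `PARTITIONS`, `partition_of`, `is_mixed` and `mixed_share` are assumptions made up for illustration, and the real partitions may differ.

```python
# Hypothetical partitions: path prefix -> group name (illustrative only,
# not the configuration used in the gist linked above).
PARTITIONS = {
    "flink-libraries/": "libraries",
    "flink-connectors/": "connectors",
    "flink-contrib/": "contrib",
    "flink-yarn/": "cluster",
    "flink-yarn-tests/": "cluster",
    "flink-mesos/": "cluster",
}


def partition_of(path):
    """Return the partition a file belongs to; unmatched paths count as 'core'."""
    for prefix, name in PARTITIONS.items():
        if path.startswith(prefix):
            return name
    return "core"


def is_mixed(changed_files):
    """A commit is 'mixed' if its files fall into more than one partition."""
    return len({partition_of(path) for path in changed_files}) > 1


def mixed_share(mixed, total):
    """Percentage of commits that would have been split across repositories."""
    return round(100.0 * mixed / total, 1)
```

With the counts quoted above, `mixed_share` gives roughly 6.3% since 2017-01-01, 9.6% since 2016-07-01 and 10.6% since 2016-01-01.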
> >
> > Greg
> >
> >
> > > On Mar 15, 2017, at 1:13 PM, Stephan Ewen <se...@apache.org> wrote:
> > >
> > > @Robert - I think once we know that a separate git repo works well, and
> > > that it actually solves problems, I see no reason to not create a
> > > connectors repository later. The infrastructure changes should be
> > identical
> > > for two or more repositories.
> > >
> > > On Wed, Mar 15, 2017 at 5:22 PM, Till Rohrmann <trohrm...@apache.org>
> > wrote:
> > >
> > >> I think it should not be at least the flink-dist but exactly the
> > remaining
> > >> flink-dist module. Otherwise we do redundant work.
> > >>
> > >> On Wed, Mar 15, 2017 at 5:03 PM, Robert Metzger <rmetz...@apache.org>
> > >> wrote:
> > >>
> > >>> "flink-core" means the main repository, not the "flink-core" module.
> > >>>
> > >>> When doing a release, we need to build the flink main code first,
> > >>> because the flink-libraries depend on that.
> > >>> Once the "flink-libraries" are built, we need to run the main build
> > >>> again (at least the flink-dist module), so that it is pulling the
> > >>> artifacts from the flink-libraries to put them into the opt/ folder of
> > >>> the final artifact.
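
The build order described above can be sketched as a list of Maven invocations. This is only an illustration, not the project's release script: the checkout directory names `flink` and `flink-libraries`, the use of `-DskipTests`, and the helper name `release_build_steps` are assumptions. The last step rebuilds only flink-dist, which is the narrower variant Till asks for.

```python
def release_build_steps(main_repo="flink", libraries_repo="flink-libraries"):
    """Return (working directory, command) pairs in the order described above.

    The directory names and the -DskipTests flag are assumptions for this sketch.
    """
    return [
        # 1. Build and install the main code so the libraries can resolve it.
        (main_repo, ["mvn", "clean", "install", "-DskipTests"]),
        # 2. Build and install the libraries against those artifacts.
        (libraries_repo, ["mvn", "clean", "install", "-DskipTests"]),
        # 3. Re-run only flink-dist so it can pick up the library jars for opt/.
        (main_repo, ["mvn", "install", "-DskipTests", "-pl", "flink-dist"]),
    ]


if __name__ == "__main__":
    for workdir, command in release_build_steps():
        print(f"(cd {workdir} && {' '.join(command)})")
```

The function only returns the commands; running them (for example via `subprocess.run(command, cwd=workdir, check=True)`) is left out so the sketch has no side effects.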
> > >>>
> > >>>
> > >>>
> > >>>
> > >>> On Wed, Mar 15, 2017 at 4:44 PM, Till Rohrmann <trohrm...@apache.org
> >
> > >>> wrote:
> > >>>
> > >>>> I'm ok with point 3.
> > >>>>
> > >>>> Concerning point 8: Why do we have to build flink-core twice after
> > >> having
> > >>>> it built as a dependency for flink-libraries? This seems wrong to
> me.
> > >>>>
> > >>>> Cheers,
> > >>>> Till
> > >>>>
> > >>>> On Wed, Mar 15, 2017 at 4:23 PM, Robert Metzger <
> rmetz...@apache.org>
> > >>>> wrote:
> > >>>>
> > >>>>> Thank you. Running on AWS is a good idea!
> > >>>>> Let me know if you (or anybody else) wants to help me with the
> > >>>>> infrastructure work! Any help is much appreciated (as I've said
> > >>> before, I
> > >>>>> don't really have time for doing this, but it has to be done :) )
> > >>>>>
> > >>>>> I'm against creating two new repositories. I fear that this
> > >> introduces
> > >>>> too
> > >>>>> much complexity and too many repositories.
> > >>>>> "flink" and "flink-libraries" are hopefully enough to get the build
> > >>> time
> > >>>>> significantly down.
> > >>>>> We can also consider putting the connectors into the
> > >> "flink-libraries"
> > >>>> repo
> > >>>>> if we need to further reduce the build time.
> > >>>>>
> > >>>>> We should probably move "flink-table" out of "flink-libraries" if we
> > >>>>> want to keep "flink-table" in the main repo. (This would eliminate the
> > >>>>> "flink-libraries" module from main.)
> > >>>>>
> > >>>>> Also, I agree that "flink-statebackend-rocksdb" is not correctly
> > >> placed
> > >>>> in
> > >>>>> contrib anymore.
> > >>>>>
> > >>>>>
> > >>>>> On Wed, Mar 15, 2017 at 4:07 PM, Greg Hogan <c...@greghogan.com>
> > >>> wrote:
> > >>>>>
> > >>>>>> Robert, appreciate your kickstarting this task.
> > >>>>>>
> > >>>>>> We should compare the verification time with and without the
> listed
> > >>>>>> modules. I’ll try to run this by tomorrow on AWS and on Travis.
> > >>>>>>
> > >>>>>> Should we maintain separate repos for flink-contrib and
> > >>>> flink-libraries?
> > >>>>>> Are you intending that we move flink-table out of flink-libraries
> > >>> (and
> > >>>>>> perhaps flink-statebackend-rocksdb out of flink-contrib)?
> > >>>>>>
> > >>>>>> Greg
> > >>>>>>
> > >>>>>>
> > >>>>>>> On Mar 15, 2017, at 9:55 AM, Robert Metzger <rmetz...@apache.org
> > >>>
> > >>>>> wrote:
> > >>>>>>>
> > >>>>>>> Thank you for looking into this Till.
> > >>>>>>>
> > >>>>>>> I think we then have to split the repositories.
> > >>>>>>> My main motivation for doing this is that it seems to be the only
> > >>>>>> feasible
> > >>>>>>> way of scaling the community to allow more committers working on
> > >>> the
> > >>>>>>> libraries.
> > >>>>>>>
> > >>>>>>> I'll take care of getting things started.
> > >>>>>>>
> > >>>>>>> As the next steps I propose to:
> > > >>>>>>> 1. Ask INFRA to rename
> > > >>>>>>> https://git-wip-us.apache.org/repos/asf?p=flink-connectors.git;a=summary
> > > >>>>>>> to "flink-libraries"
> > >>>>>>> 2. Ask INFRA to set up GitHub and travis integration for
> > >>>>>> "flink-libraries"
> > >>>>>>> 3. Put the code of "flink-ml", "flink-gelly", "flink-python",
> > >>>>>> "flink-cep",
> > >>>>>>> "flink-scala-shell", "flink-storm" into the new repository. (I
> > >>>> decided
> > >>>>>>> against moving flink-contrib there, because rocksdb is in the
> > >>> contrib
> > >>>>>>> module, for flink-table, I'm undecided, but I kept it in the main
> > >>>> repo
> > > >>>>>>> because it's probably going to interact more with the core code in
> > >>> the
> > >>>>>>> future)
> > > >>>>>>> I'll try to preserve the history of those modules when splitting
> > > >>>>>>> them into the new repo.
> > >>>>>>> 4. I'll close all pull requests against those modules in the main
> > >>>> repo.
> > >>>>>>> 5. I'll set up a minimal documentation page for the library
> > >>>> repository,
> > >>>>>>> similar to the main documentation.
> > >>>>>>> 6. I'll update the documentation build process to build both
> > >>>>>> documentations
> > >>>>>>> & link them to each other
> > >>>>>>> 7. I'll update the nightly deployment process to include both
> > >>>>>> repositories
> > >>>>>>> 8. I'll update the release script to create the Flink release out
> > >>> of
> > >>>>> both
> > >>>>>>> repositories. In order to put the libraries into the opt/ dir of
> > >>> the
> > >>>>>>> release, I'll need to change the build of "flink-dist" so that it
> > >>>> first
> > >>>>>>> builds flink core, then the libraries and then the core again
> > >> with
> > >>>> the
> > >>>>>>> libraries as an additional dependency.
> > >>>>>>>
> > >>>>>>> The main question for the community is: do you agree with point
> > >> 3 ?
> > >>>>> Would
> > >>>>>>> you like to include more or less?
> > >>>>>>>
> > >>>>>>> I'll start with 1. and 2. tomorrow morning.
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> On Wed, Mar 15, 2017 at 1:48 PM, Till Rohrmann <
> > >>> trohrm...@apache.org
> > >>>>>
> > >>>>>> wrote:
> > >>>>>>>
> > >>>>>>>> In theory we could have a merging bot which solves the problem
> > >> of
> > >>>> the
> > >>>>>>>> "commit window". Once the PR passes all tests and has enough
> > >> +1s,
> > >>>> the
> > >>>>>> bot
> > >>>>>>>> could do the merging and, thus, it effectively linearizes the
> > >>> merge
> > >>>>>>>> process.
> > >>>>>>>>
> > >>>>>>>> I think the second point is actually a disadvantage because
> > >> there
> > >>> is
> > >>>>> not
> > >>>>>>>> such an immediate incentive/pressure to fix the broken module if
> > >>> it
> > >>>>>> lives
> > >>>>>>>> in a separate repository. Furthermore, breaking API changes in
> > >> the
> > >>>>> core
> > >>>>>>>> will most likely go unnoticed for some time in other modules
> > >> which
> > >>>> are
> > >>>>>> not
> > >>>>>>>> developed so actively. In the worst case these things will only
> > >> be
> > >>>>>> noticed
> > >>>>>>>> when we try to make a release.
> > >>>>>>>>
> > >>>>>>>> But I also agree that we are not Google and we don't have the
> > >>>>>> capacities to
> > >>>>>>>> maintain such a smooth build process that we can keep all the
> > >>> code
> > >>>>> in
> > >>>>>> a
> > >>>>>>>> single repository.
> > >>>>>>>>
> > >>>>>>>> I looked a bit into Gradle and as far as I can tell it offers
> > >> some
> > >>>>> nice
> > >>>>>>>> features wrt incrementally building projects. This would be
> > >>>> beneficial
> > >>>>>> for
> > >>>>>>>> local development but it would not solve our build time problems
> > >>> on
> > >>>>>> Travis.
> > >>>>>>>> Gradle intends to introduce a task result cache which allows to
> > >>>> reuse
> > >>>>>>>> results across builds. This could help when building on Travis,
> > >>>>>> however, it
> > >>>>>>>> is not yet fully implemented. Moreover, migrating from Maven to
> > >>>> Gradle
> > >>>>>>>> won't come for free (there's simply no free lunch out there) and
> > >>> we
> > >>>>>> might
> > >>>>>>>> risk to introduce new bugs. Therefore, I would vote to split the
> > >>>>>> repository
> > >>>>>>>> in order to mitigate our current problems with Travis and the
> > >>> build
> > >>>>>> time in
> > >>>>>>>> general. Whether to use a different build system or not can then
> > >>> be
> > >>>>>>>> discussed as an orthogonal question.
> > >>>>>>>>
> > >>>>>>>> Cheers,
> > >>>>>>>> Till
> > >>>>>>>>
> > >>>>>>>> On Tue, Mar 14, 2017 at 8:05 PM, Stephan Ewen <se...@apache.org
> > >>>
> > >>>>> wrote:
> > >>>>>>>>
> > >>>>>>>>> Some other thoughts on how repository split would help. I am
> > >> not
> > >>>> sure
> > >>>>>> for
> > >>>>>>>>> all of them, so please comment:
> > >>>>>>>>>
> > >>>>>>>>> - There is less competition for a "commit window". It happens
> > >> a
> > >>>> lot
> > >>>>>>>>> already that you run all tests and want to commit, but there
> > >> was
> > >>> a
> > >>>>>> commit
> > >>>>>>>>> in the meantime. You rebase, need to re-test, again commit in
> > >> the
> > >>>>>>>> meantime.
> > >>>>>>>>>   For a "linear" commit history, this may become a bottleneck
> > >>>>>>>> eventually
> > >>>>>>>>> as well.
> > >>>>>>>>>
> > >>>>>>>>> - There is less risk of broken master. If one
> > >> repository/modules
> > >>>>>> breaks
> > >>>>>>>>> its master, the others can still continue.
> > >>>>>>>>>
> > >>>>>>>>> Stephan
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> On Fri, Mar 10, 2017 at 12:20 PM, Till Rohrmann <
> > >>>>> trohrm...@apache.org>
> > >>>>>>>>> wrote:
> > >>>>>>>>>
> > >>>>>>>>>> Thanks for all your input. In order to wrap the discussion up
> > >>> I'd
> > >>>>> like
> > >>>>>>>> to
> > >>>>>>>>>> summarize the mentioned points:
> > >>>>>>>>>>
> > >>>>>>>>>> The problem of increasing build times and complexity of the
> > >>>> project
> > >>>>>> has
> > >>>>>>>>>> been acknowledged. Ideally we would have everything in one
> > >>>>> repository
> > >>>>>>>>> using
> > >>>>>>>>>> an incremental build tool. Since Maven does not properly
> > >> support
> > >>>>> this
> > >>>>>>>> we
> > >>>>>>>>>> would have to switch our build tool to something like Gradle,
> > >>> for
> > >>>>>>>>> example.
> > >>>>>>>>>>
> > >>>>>>>>>> Another option is introducing build profiles for different
> > >> sets
> > >>> of
> > >>>>>>>>> modules
> > >>>>>>>>>> as well as separating integration and unit tests. The third
> > >>>>>> alternative
> > >>>>>>>>>> would be creating sub-projects with their own repositories. I
> > >>>>> actually
> > >>>>>>>>>> think that these two proposals are not necessarily exclusive
> > >> and
> > >>> it
> > >>>>>>>> would
> > >>>>>>>>>> also make sense to have a separation between unit and
> > >>> integration
> > >>>>>> tests
> > >>>>>>>>> if
> > >>>>>>>>>> we split the repository.
> > >>>>>>>>>>
> > >>>>>>>>>> The overall consensus seems to be that we don't want to split
> > >>> the
> > >>>>>>>>> community
> > >>>>>>>>>> and want to keep everything under the same umbrella. I think
> > >>> this
> > >>>> is
> > >>>>>>>> the
> > >>>>>>>>>> right way to go, because otherwise some parts of the project
> > >>> could
> > >>>>>>>> become
> > >>>>>>>>>> second class citizens. Given that and that we continue using
> > >>>> Maven,
> > >>>>> I
> > >>>>>>>>> still
> > >>>>>>>>>> think that creating sub-projects for the libraries, for
> > >> example,
> > >>>>> could
> > >>>>>>>> be
> > >>>>>>>>>> beneficial. A split could reduce the project's complexity and
> > >>> make
> > >>>>> it
> > >>>>>>>>>> potentially easier for libraries to get actively developed.
> > >> The
> > >>>> main
> > >>>>>>>>>> concern is setting up the build infrastructure to aggregate
> > >> docs
> > >>>>> from
> > >>>>>>>>>> multiple repositories and making them publicly available.
> > >>>>>>>>>>
> > >>>>>>>>>> Since I started this thread and I would really like to see
> > >>> Flink's
> > >>>>> ML
> > >>>>>>>>>> library being revived again, I'd volunteer investigating first
> > >>>>> whether
> > >>>>>>>> it
> > >>>>>>>>>> is doable establishing a proper incremental build for Flink.
> > >> If
> > >>>> that
> > >>>>>>>>> should
> > >>>>>>>>>> not be possible, I will look into splitting the repository,
> > >>> first
> > >>>>> only
> > >>>>>>>>> for
> > >>>>>>>>>> the libraries. I'll share my results with the community once
> > >> I'm
> > >>>>> done
> > >>>>>>>>> with
> > >>>>>>>>>> the investigation.
> > >>>>>>>>>>
> > >>>>>>>>>> Cheers,
> > >>>>>>>>>> Till
> > >>>>>>>>>>
> > >>>>>>>>>> On Fri, Feb 24, 2017 at 3:50 PM, Robert Metzger <
> > >>>>> rmetz...@apache.org>
> > >>>>>>>>>> wrote:
> > >>>>>>>>>>
> > >>>>>>>>>>> @Jin Mingjian: You can not use the paid travis version for
> > >> open
> > >>>>>>>> source
> > >>>>>>>>>>> projects. It only works for private repositories (at least
> > >> back
> > >>>>> then
> > >>>>>>>>> when
> > >>>>>>>>>>> we've asked them about that).
> > >>>>>>>>>>>
> > >>>>>>>>>>> @Stephan: I don't think that incremental builds will be
> > >>> available
> > >>>>>>>> with
> > >>>>>>>>>>> Maven anytime soon.
> > >>>>>>>>>>>
> > >>>>>>>>>>> I agree that we need to fix the build time issue on Travis.
> > >>> I've
> > >>>>>>>>> recently
> > >>>>>>>>>>> pushed a commit to use now three instead of two test groups.
> > >>>>>>>>>>> But I don't think that this is a feasible long-term solution.
> > >>>>>>>>>>>
> > >>>>>>>>>>> If this discussion is only about reducing the build and test
> > >>>> time,
> > >>>>>>>>>>> introducing build profiles for different components as
> > >> Aljoscha
> > >>>>>>>>> suggested
> > >>>>>>>>>>> would solve the problem Till mentioned.
> > >>>>>>>>>>> Also, if we decide that travis is not a good tool anymore for
> > >>> the
> > >>>>>>>>>> testing,
> > >>>>>>>>>>> I guess we can find a different solution. There are now
> > >>>> competitors
> > >>>>>>>> to
> > >>>>>>>>>>> Travis that might be willing to offer a paid plan for an open
> > >>>>> source
> > >>>>>>>>>>> project, or we set up our own infra on a server sponsored by
> > >>> one
> > >>>> of
> > >>>>>>>> the
> > >>>>>>>>>>> contributing companies.
> > >>>>>>>>>>> If we want to solve "community issues" with the change as
> > >> well,
> > >>>>> then
> > >>>>>>>> I
> > >>>>>>>>>>> think it's worth the effort of splitting up Flink into
> > >> different
> > >>>>>>>>>>> repositories.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Splitting up repositories is not a trivial task in my
> > >> opinion.
> > >>> As
> > >>>>>>>>> others
> > >>>>>>>>>>> have mentioned before, we need to consider the following
> > >>> things:
> > >>>>>>>>>>> - How are we going to build the documentation? Ideally every
> > >>> repo
> > >>>>>>>>> should
> > >>>>>>>>>>> contain its docs, so we would need to pull them together when
> > >>>>>>>> building
> > >>>>>>>>>> the
> > >>>>>>>>>>> main docs.
> > >>>>>>>>>>> - How do we organize the dependencies? If we have a library
> > >>>>>>>>>>> repository depend on snapshot Flink versions, we need to make
> > >>>>>>>>>>> sure that the snapshot deployment always works. This also
> > >>>>>>>>>>> means that people working on a library repository will pull
> > >>>>>>>>>>> from snapshot OR need to build first locally.
> > >>>>>>>>>>> - We need to update the release scripts
> > >>>>>>>>>>>
> > >>>>>>>>>>> If we commit to do these changes, we need to assign at least
> > >>> one
> > >>>>>>>>>> committer
> > >>>>>>>>>>> (yes, in this case we need somebody who can commit, for
> > >> example
> > >>>> for
> > >>>>>>>>>>> updating the buildbot stuff) who volunteers to do the change.
> > >>>>>>>>>>> I've done a lot of infrastructure work in the past, but I'm
> > >>>>> currently
> > >>>>>>>>>>> pretty booked with many other things, so I don't
> > >> realistically
> > >>>> see
> > >>>>>>>>> myself
> > >>>>>>>>>>> doing that. Max who used to work on these things is taking
> > >> some
> > >>>>> time
> > >>>>>>>>> off.
> > >>>>>>>>>>> I think we need, best case 3 days for the change, worst case
> > >> 5
> > >>>>> days.
> > >>>>>>>>> The
> > >>>>>>>>>>> problem is that there are no "unit tests" for the infra
> > >> stuff,
> > >>> so
> > >>>>>>>> many
> > >>>>>>>>>>> things are "trial and error" (like Apache's buildbot, our
> > >>> release
> > >>>>>>>>>> scripts,
> > >>>>>>>>>>> the doc scripts, maven stuff, nightly builds).
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>> On Thu, Feb 23, 2017 at 1:33 PM, Stephan Ewen <
> > >>> se...@apache.org>
> > >>>>>>>>> wrote:
> > >>>>>>>>>>>
> > >>>>>>>>>>>> If we can get incremental builds to work, that would
> > >>> actually
> > >>>> be
> > >>>>>>>>> the
> > >>>>>>>>>>>> preferred solution in my opinion.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Many companies have invested heavily in making a "single
> > >>>>>>>> repository"
> > >>>>>>>>>> code
> > >>>>>>>>>>>> base work, because it has the advantage of not having to
> > >>>>>>>>> update/publish
> > >>>>>>>>>>>> several repositories first.
> > >>>>>>>>>>>> However, the strong prerequisite for that is an incremental
> > >>>> build
> > >>>>>>>>>> system
> > >>>>>>>>>>>> that builds only (fine grained) what it has to build. I am
> > >> not
> > >>>>> sure
> > >>>>>>>>> how
> > >>>>>>>>>>> we
> > >>>>>>>>>>>> could make that work
> > >>>>>>>>>>>> with Maven and Travis...
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> On Wed, Feb 22, 2017 at 10:42 PM, Greg Hogan <
> > >>>> c...@greghogan.com>
> > >>>>>>>>>> wrote:
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>> An additional option for reducing time to build and test is
> > >>>>>>>>> parallel
> > >>>>>>>>>>>>> execution. This would help users more than on TravisCI
> > >> since
> > >>>>>>>> we're
> > >>>>>>>>>>>>> generally running on multi-core machines rather than VM
> > >>> slices.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Is the idea that each user would only check out the modules
> > >>>> that
> > >>>>>>>> he
> > >>>>>>>>>> or
> > >>>>>>>>>>>> she
> > >>>>>>>>>>>>> is developing with? For example, if a developer is not
> > >>> working
> > >>>> on
> > >>>>>>>>>>>>> flink-mesos or flink-yarn then the "flink-deploy" module
> > >>> would
> > >>>>>>>> not
> > >>>>>>>>> be
> > >>>>>>>>>>>> cloned
> > >>>>>>>>>>>>> to their filesystem?
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> We can run a TravisCI nightly build on each repo to
> > >> validate
> > >>>>>>>>> against
> > >>>>>>>>>>> API
> > >>>>>>>>>>>>> changes.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Greg
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> On Wed, Feb 22, 2017 at 12:24 PM, Fabian Hueske <
> > >>>>>>>> fhue...@gmail.com
> > >>>>>>>>>>
> > >>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Hi everybody,
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> I think this should be a discussion about the benefits and
> > >>>>>>>>>> drawbacks
> > >>>>>>>>>>> of
> > >>>>>>>>>>>>>> separating the code into distinct repositories from a
> > >>>>>>>> development
> > >>>>>>>>>>> point
> > >>>>>>>>>>>>> of
> > >>>>>>>>>>>>>> view.
> > >>>>>>>>>>>>>> So I agree with Stephan that we should not divide the
> > >>>> community
> > >>>>>>>>> by
> > >>>>>>>>>>>>> creating
> > >>>>>>>>>>>>>> separate groups of committers.
> > >>>>>>>>>>>>>> Also the discussion about independent releases is not
> > >>>>>>>> strictly
> > >>>>>>>>>>>> related
> > >>>>>>>>>>>>>> to the decision, IMO.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> I see a few pros and cons for splitting the code base into
> > >>>>>>>>> separate
> > >>>>>>>>>>>>>> repositories which (I think) haven't been mentioned
> > >> before:
> > >>>>>>>>>>>>>> pros:
> > >>>>>>>>>>>>>> - IDE setup will be leaner. It is not necessary to compile
> > >>> the
> > >>>>>>>>>> whole
> > >>>>>>>>>>>> code
> > >>>>>>>>>>>>>> base to run a test after switching a branch.
> > >>>>>>>>>>>>>> cons:
> > >>>>>>>>>>>>>> - developing library features that require changes in the
> > >>>>>>>>>>>>>> core / APIs becomes more time consuming due to back-and-forth
> > >>>>>>>>>>>>>> between code bases.
> > >>>>>>>>>>>>>> However, I think this is not very often the case.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Aljoscha has good points as well. Many of the build issues
> > >>>>>>>> could
> > >>>>>>>>> be
> > >>>>>>>>>>>>> solved
> > >>>>>>>>>>>>>> by different build profiles and configurations.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Best, Fabian
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> 2017-02-22 14:59 GMT+01:00 Gábor Hermann <
> > >>>>>>>> m...@gaborhermann.com
> > >>>>>>>>>> :
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> @Stephan:
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Although I tried to raise some issues about splitting
> > >>>>>>>>> committers,
> > >>>>>>>>>>> I'm
> > >>>>>>>>>>>>>>> still strongly in favor of some kind of restructuring. We
> > >>>>>>>> just
> > >>>>>>>>>> have
> > >>>>>>>>>>>> to
> > >>>>>>>>>>>>> be
> > >>>>>>>>>>>>>>> conscious about the disadvantages.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Not splitting the committers could leave the libraries in
> > >>> the
> > >>>>>>>>>> same
> > >>>>>>>>>>>>>>> stalling status, described by Till. Of course, dedicating
> > >>>>>>>>> current
> > >>>>>>>>>>>>>>> committers as shepherds of the libraries could easily
> > >>> resolve
> > >>>>>>>>> the
> > >>>>>>>>>>>>> issue.
> > >>>>>>>>>>>>>>> But that requires time from current committers. It seems
> > >>> like
> > >>>>>>>>>>>>> trade-offs
> > >>>>>>>>>>>>>>> between code quality, speed of development, and committer
> > >>>>>>>>>> efforts.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> From what I see in the discussion about ML, there are
> > >> many
> > >>>>>>>>> people
> > >>>>>>>>>>>>> willing
> > >>>>>>>>>>>>>>> to contribute as well as production use-cases. This means
> > >>> we
> > >>>>>>>>>> could
> > >>>>>>>>>>>> and
> > >>>>>>>>>>>>>>> should move forward. However, the development speed is
> > >>>>>>>>>>> significantly
> > >>>>>>>>>>>>>> slowed
> > >>>>>>>>>>>>>>> down by stalling PRs. The proposal for contributors
> > >> helping
> > >>>>>>>> the
> > >>>>>>>>>>>> review
> > >>>>>>>>>>>>>>> process did not really work out so far. In my opinion,
> > >>> either
> > >>>>>>>>>> code
> > >>>>>>>>>>>>>> quality
> > >>>>>>>>>>>>>>> (by more easily accepting new committers) or some
> > >> committer
> > >>>>>>>>> time
> > >>>>>>>>>>>>>>> (reviewing/merging) should be sacrificed to move forward.
> > >>> As
> > >>>>>>>>> Till
> > >>>>>>>>>>> has
> > >>>>>>>>>>>>>>> indicated, it would be shameful if we let this
> > >> contribution
> > >>>>>>>>>> effort
> > >>>>>>>>>>>> die.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Cheers,
> > >>>>>>>>>>>>>>> Gabor
