It looks like JetBrains TeamCity supports something in that direction:
https://blog.jetbrains.com/teamcity/2012/03/incremental-building-with-maven-and-teamcity/


On Mon, Mar 20, 2017 at 2:40 PM, Timo Walther <twal...@apache.org> wrote:

> Another solution would be to make the Travis builds more efficient. For
> example, we could write a script that determines the modified Maven modules
> and only runs the tests for those modules (and maybe transitive
> dependencies). PRs for libraries such as Gelly, Table, CEP or connectors
> would no longer trigger a compilation of the entire stack. Of course this
> would not solve all problems, but many of them.
>
> What do you think about this?
>
>
>
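A minimal sketch of the script Timo describes above, in Python. It assumes the list of changed files comes from something like `git diff --name-only`, and that the set of module directories (those containing a `pom.xml`) is collected beforehand. The function names are hypothetical. It uses Maven's `-pl` (project list) and `-amd` (also make dependents) options, so modules that depend on the changed ones are tested too; whether that matches the intended "transitive dependencies" is an assumption.

```python
def owning_module(path, module_dirs):
    """Return the deepest module directory containing `path`, or "" for the root."""
    parts = path.strip("/").split("/")[:-1]  # drop the file name
    while parts:
        candidate = "/".join(parts)
        if candidate in module_dirs:
            return candidate
        parts.pop()  # walk one directory up
    return ""


def modified_modules(changed_files, module_dirs):
    """Map changed files to the sorted set of modules that own them."""
    return sorted({owning_module(f, module_dirs) for f in changed_files})


def maven_command(modules):
    """Build a Maven invocation restricted to `modules` and their dependents."""
    if not modules or "" in modules:
        # A change outside any known module (e.g. the root pom.xml):
        # fall back to building everything.
        return ["mvn", "verify"]
    return ["mvn", "verify", "-pl", ",".join(modules), "-amd"]


if __name__ == "__main__":
    # Hypothetical module layout and change set, for illustration only.
    modules = {"flink-core", "flink-libraries", "flink-libraries/flink-gelly"}
    changed = ["flink-libraries/flink-gelly/src/main/java/Foo.java"]
    print(" ".join(maven_command(modified_modules(changed, modules))))
```

Falling back to a full build for changes outside any module is a conservative choice: a change to the root `pom.xml` can affect every module.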
> On 20/03/17 at 14:02, Robert Metzger wrote:
>
>> Aljoscha, do you know how to configure Jenkins?
>> Is Apache INFRA doing that, or are the Beam people doing that themselves?
>>
>> One downside of Jenkins is that we probably need some machines that execute
>> the tests. A Travis container has 2 CPU cores and 4 GB main memory. We
>> currently have 10 such containers available on Travis concurrently. I think
>> we would need at least the same amount on Jenkins.
>>
>>
>> On Mon, Mar 20, 2017 at 1:48 PM, Timo Walther <twal...@apache.org> wrote:
>>
>>> I agree with Aljoscha that we might consider moving from Travis to
>>> Jenkins. Is there any disadvantage in using Jenkins?
>>>
>>> I think we should structure the project according to release management
>>> (e.g. more frequent releases of libraries) or other criteria (e.g. core and
>>> non-core) instead of build time. What would happen if the build of another
>>> submodule became too long? Would we split/restructure again and again? If
>>> Jenkins solves all our problems we should use it.
>>>
>>> Regards,
>>> Timo
>>>
>>>
>>>
>>> On 20/03/17 at 12:21, Aljoscha Krettek wrote:
>>>
>>>> I prefer Jenkins to Travis by far. Working on Beam, where we have good
>>>> Jenkins integration, has opened my eyes to what is possible with good CI
>>>> integration.
>>>>
>>>> For example, look at this recent Beam PR:
>>>> https://github.com/apache/beam/pull/2263
>>>> The Jenkins-GitHub integration will tell you exactly which tests failed,
>>>> and if you click on the links you can look at the log output/stdout of the
>>>> tests in question.
>>>>
>>>> This is the overview page of one of the Jenkins jobs that we have in Beam:
>>>> https://builds.apache.org/job/beam_PostCommit_Java_RunnableOnService_Flink/
>>>> This is an example of a stable build:
>>>> https://builds.apache.org/job/beam_PostCommit_Java_RunnableOnService_Flink/lastStableBuild/
>>>> Notice how it gives you fine-grained information about the Maven run.
>>>> This is an unstable run:
>>>> https://builds.apache.org/job/beam_PostCommit_Java_RunnableOnService_Flink/lastUnstableBuild/
>>>> There you can see which tests failed and you can easily drill down.
>>>>
>>>> Best,
>>>> Aljoscha
>>>>
>>>> On 20 Mar 2017, at 11:46, Robert Metzger <rmetz...@apache.org> wrote:
>>>>
>>>>> Thank you for looking into the build times.
>>>>>
>>>>> I didn't know that the build time situation is so bad. Even with yarn,
>>>>> mesos, connectors and libraries removed, we are still running into the
>>>>> build timeout :(
>>>>>
>>>>> Aljoscha told me that the Beam community is using Jenkins for running
>>>>> the tests, and they are planning to completely move away from Travis. I
>>>>> wonder whether we should do the same, as having our own Jenkins servers
>>>>> would allow us to run tests for more than 50 minutes.
>>>>>
>>>>> I agree with Stephan that we should keep the yarn and mesos tests in the
>>>>> core for stability / testing quality purposes.
>>>>>
>>>>>
>>>>> On Mon, Mar 20, 2017 at 11:27 AM, Stephan Ewen <se...@apache.org> wrote:
>>>>>
>>>>> @Greg
>>>>>
>>>>> I am personally in favor of splitting "connectors" and "contrib" out as
>>>>> well. I know that @rmetzger has some reservations about the connectors,
>>>>> but we may be able to convince him.
>>>>>
>>>>> For the cluster tests (yarn / mesos) - in the past there were many cases
>>>>> where these tests caught problems that other tests did not, because they
>>>>> are the only tests that actually use the "flink-dist.jar" and thus
>>>>> discover many dependency and configuration issues. For that reason, my
>>>>> feeling would be that they are valuable in the core repository.
>>>>>
>>>>> I would actually suggest doing only the library split initially, to see
>>>>> what the challenges are in setting up the multi-repo build and release
>>>>> tooling. Once we have gathered experience there, we can probably easily
>>>>> see what else we can split out.
>>>>>
>>>>> Stephan
>>>>>
>>>>>
>>>>> On Fri, Mar 17, 2017 at 8:37 PM, Greg Hogan <c...@greghogan.com> wrote:
>>>>>
>>>>>> I’d like to use this refactoring opportunity to unsplit the Travis tests.
>>>>>>
>>>>>> With 51 builds queued up for the weekend (some of which may fail or have
>>>>>> been force pushed) we are at the limit of the number of contributions we
>>>>>> can process. Fixing this requires 1) splitting the project, 2)
>>>>>> investigating speedups for long-running tests, and 3) staying cognizant
>>>>>> of test performance when accepting new code.
>>>>>>
>>>>>> I’d like to add one to Stephan’s list of module groups. I like that the
>>>>>> modules are generic (“libraries”) so that no one module is alone and
>>>>>> independent.
>>>>>>
>>>>>> Flink has three “libraries”: cep, ml, and gelly.
>>>>>>
>>>>>> “connectors” is a hotspot due to the long-running Kafka tests (and
>>>>>> connectors for three Kafka versions).
>>>>>>
>>>>>> Both flink-storm and flink-python have a modest number of tests and could
>>>>>> live with the miscellaneous modules in “contrib”.
>>>>>>
>>>>>> The YARN tests are long-running and problematic (I am unable to
>>>>>> successfully run these locally). A “cluster” module could host
>>>>>> flink-mesos, flink-yarn, and flink-yarn-tests.
>>>>>>
>>>>>> That gets us close to running all tests in a single Travis build.
>>>>>>     https://travis-ci.org/greghogan/flink/builds/212122590
>>>>>>
>>>>>> I also tested (https://github.com/greghogan/flink/commits/core_build)
>>>>>> with a Maven parallelism of 2 and 4, with the latter giving a 6.4% drop
>>>>>> in build time.
>>>>>>     https://travis-ci.org/greghogan/flink/builds/212137659
>>>>>>     https://travis-ci.org/greghogan/flink/builds/212154470
>>>>>>
>>>>>> We can run Travis CI builds nightly to guard against breaking changes.
>>>>>>
>>>>>> I also wanted to get an idea of how disruptive it would be to developers
>>>>>> to divide the project into multiple git repos. I wrote a simple Python
>>>>>> script and configured it with the module partitions listed above. The
>>>>>> usage string from the top of the file lists commits with files from
>>>>>> multiple partitions as well as the modified files.
>>>>>>     https://gist.github.com/greghogan/f38a8efe6b6dd5a162a6b43335ac4897
>>>>>>
>>>>>> Accounting for the merging of the batch and streaming connector modules,
>>>>>> and assuming that the project structure has not changed much over the
>>>>>> past 15 months, for the following date ranges the listed number of
>>>>>> commits would have been split across repositories.
>>>>>>
>>>>>> since "2017-01-01"
>>>>>> 36 of 571 commits were mixed
>>>>>>
>>>>>> since "2016-07-01"
>>>>>> 155 of 1607 commits were mixed
>>>>>>
>>>>>> since "2016-01-01"
>>>>>> 272 of 2561 commits were mixed
>>>>>>
>>>>>> Greg
>>>>>>
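The "mixed commit" count Greg reports above can be illustrated with a short Python sketch. This is not his gist; it is a simplified stand-in that assumes each commit is already available as a list of modified file paths (for example parsed from `git log --name-only`). The partition names and directory prefixes below are assumptions based on the module groups named in his message.

```python
# Hypothetical partition table: repository name -> top-level directory prefixes.
PARTITIONS = {
    "libraries": ("flink-libraries/",),
    "connectors": ("flink-connectors/",),
    "contrib": ("flink-contrib/",),
    "cluster": ("flink-mesos/", "flink-yarn/", "flink-yarn-tests/"),
}


def partition_of(path):
    """Return the partition owning `path`; anything unlisted counts as core."""
    for name, prefixes in PARTITIONS.items():
        if path.startswith(prefixes):  # str.startswith accepts a tuple
            return name
    return "core"


def is_mixed(files):
    """A commit is mixed if its files fall into more than one partition."""
    return len({partition_of(f) for f in files}) > 1


def count_mixed(commits):
    """Return (number of mixed commits, total number of commits)."""
    mixed = sum(1 for files in commits if is_mixed(files))
    return mixed, len(commits)


if __name__ == "__main__":
    # Two made-up commits: the second touches both core and contrib.
    commits = [
        ["flink-runtime/src/A.java"],
        ["flink-runtime/src/A.java", "flink-contrib/src/B.java"],
    ]
    mixed, total = count_mixed(commits)
    print(f"{mixed} of {total} commits were mixed")
```

For scale, the figures quoted above work out to roughly 6.3% (36 of 571), 9.6% (155 of 1607) and 10.6% (272 of 2561) of commits touching more than one partition.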
>>>>>>
>>>>>> On Mar 15, 2017, at 1:13 PM, Stephan Ewen <se...@apache.org> wrote:
>>>>>>
>>>>>>> @Robert - I think once we know that a separate git repo works well, and
>>>>>>> that it actually solves problems, I see no reason to not create a
>>>>>>> connectors repository later. The infrastructure changes should be
>>>>>>> identical for two or more repositories.
>>>>>>>
>>>>>>> On Wed, Mar 15, 2017 at 5:22 PM, Till Rohrmann <trohrm...@apache.org>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> I think it should not be at least the flink-dist but exactly the
>>>>>>>> remaining flink-dist module. Otherwise we do redundant work.
>>>>>>>
>>>>>>>> On Wed, Mar 15, 2017 at 5:03 PM, Robert Metzger <rmetz...@apache.org>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> "flink-core" means the main repository, not the "flink-core" module.
>>>>>>>>>
>>>>>>>>> When doing a release, we need to build the flink main code first,
>>>>>>>>> because the flink-libraries depend on that.
>>>>>>>>> Once the "flink-libraries" are built, we need to run the main build
>>>>>>>>> again (at least the flink-dist module), so that it pulls the
>>>>>>>>> artifacts from the flink-libraries to put them into the opt/ folder
>>>>>>>>> of the final artifact.
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Mar 15, 2017 at 4:44 PM, Till Rohrmann <
>>>>>>>>> trohrm...@apache.org
>>>>>>>>> <mailto:trohrm...@apache.org>>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> I'm ok with point 3.
>>>>>>>>>>
>>>>>>>>>> Concerning point 8: Why do we have to build flink-core twice after
>>>>>>>>>> having it built as a dependency for flink-libraries? This seems
>>>>>>>>>> wrong to me.
>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>> Till
>>>>>>>>>>
>>>>>>>>>> On Wed, Mar 15, 2017 at 4:23 PM, Robert Metzger <
>>>>>>>>>> rmetz...@apache.org <mailto:rmetz...@apache.org>>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Thank you. Running on AWS is a good idea!
>>>>>>>>>>>
>>>>>>>>>>> Let me know if you (or anybody else) want to help me with the
>>>>>>>>>>> infrastructure work! Any help is much appreciated (as I've said
>>>>>>>>>>> before, I don't really have time for doing this, but it has to be
>>>>>>>>>>> done :) )
>>>>>>>>>>>
>>>>>>>>>>> I'm against creating two new repositories. I fear that this
>>>>>>>>>>> introduces too much complexity and too many repositories.
>>>>>>>>>>> "flink" and "flink-libraries" are hopefully enough to get the build
>>>>>>>>>>> time significantly down.
>>>>>>>>>>>
>>>>>>>>>>> We can also consider putting the connectors into the
>>>>>>>>>>> "flink-libraries" repo if we need to further reduce the build time.
>>>>>>>>>>>
>>>>>>>>>>> We should probably move "flink-table" out of "flink-libraries" if we
>>>>>>>>>>> want to keep "flink-table" in the main repo. (This would eliminate
>>>>>>>>>>> the "flink-libraries" module from main.)
>>>>>>>>>>>
>>>>>>>>>>> Also, I agree that "flink-statebackend-rocksdb" is not correctly
>>>>>>>>>>> placed in contrib anymore.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Mar 15, 2017 at 4:07 PM, Greg Hogan <c...@greghogan.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Robert, appreciate your kickstarting this task.
>>>>>>>>>>>>
>>>>>>>>>>>> We should compare the verification time with and without the listed
>>>>>>>>>>>> modules. I’ll try to run this by tomorrow on AWS and on Travis.
>>>>>>>>>>>>
>>>>>>>>>>>> Should we maintain separate repos for flink-contrib and
>>>>>>>>>>>> flink-libraries? Are you intending that we move flink-table out of
>>>>>>>>>>>> flink-libraries (and perhaps flink-statebackend-rocksdb out of
>>>>>>>>>>>> flink-contrib)?
>>>>>>>>>>>>
>>>>>>>>>>>> Greg
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Mar 15, 2017, at 9:55 AM, Robert Metzger <rmetz...@apache.org>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Thank you for looking into this Till.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I think we then have to split the repositories.
>>>>>>>>>>>>> My main motivation for doing this is that it seems to be the only
>>>>>>>>>>>>> feasible way of scaling the community to allow more committers
>>>>>>>>>>>>> working on the libraries.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'll take care of getting things started.
>>>>>>>>>>>>>
>>>>>>>>>>>>> As the next steps I propose to:
>>>>>>>>>>>>> 1. Ask INFRA to rename
>>>>>>>>>>>>> https://git-wip-us.apache.org/repos/asf?p=flink-connectors.git;a=summary
>>>>>>>>>>>>> to "flink-libraries"
>>>>>>>>>>>>> 2. Ask INFRA to set up GitHub and Travis integration for
>>>>>>>>>>>>> "flink-libraries"
>>>>>>>>>>>>> 3. Put the code of "flink-ml", "flink-gelly", "flink-python",
>>>>>>>>>>>>> "flink-cep", "flink-scala-shell", "flink-storm" into the new
>>>>>>>>>>>>> repository. (I decided against moving flink-contrib there, because
>>>>>>>>>>>>> rocksdb is in the contrib module. For flink-table, I'm undecided,
>>>>>>>>>>>>> but I kept it in the main repo because it's probably going to
>>>>>>>>>>>>> interact more with the core code in the future.)
>>>>>>>>>>>>> I will try to preserve the history of those modules when splitting
>>>>>>>>>>>>> them into the new repo.
>>>>>>>>>>>>> 4. I'll close all pull requests against those modules in the main
>>>>>>>>>>>>> repo.
>>>>>>>>>>>>> 5. I'll set up a minimal documentation page for the library
>>>>>>>>>>>>> repository, similar to the main documentation.
>>>>>>>>>>>>> 6. I'll update the documentation build process to build both
>>>>>>>>>>>>> documentations & link them to each other
>>>>>>>>>>>>> 7. I'll update the nightly deployment process to include both
>>>>>>>>>>>>> repositories
>>>>>>>>>>>>> 8. I'll update the release script to create the Flink release out
>>>>>>>>>>>>> of both repositories. In order to put the libraries into the opt/
>>>>>>>>>>>>> dir of the release, I'll need to change the build of "flink-dist"
>>>>>>>>>>>>> so that it first builds flink core, then the libraries and then
>>>>>>>>>>>>> the core again with the libraries as an additional dependency.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The main question for the community is: do you agree with point 3?
>>>>>>>>>>>>> Would you like to include more or less?
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'll start with 1. and 2. tomorrow morning.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Mar 15, 2017 at 1:48 PM, Till Rohrmann
>>>>>>>>>>>>> <trohrm...@apache.org> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> In theory we could have a merging bot which solves the problem of
>>>>>>>>>>>>>> the "commit window". Once the PR passes all tests and has enough
>>>>>>>>>>>>>> +1s, the bot could do the merging and, thus, it effectively
>>>>>>>>>>>>>> linearizes the merge process.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I think the second point is actually a disadvantage because there
>>>>>>>>>>>>>> is not such an immediate incentive/pressure to fix the broken
>>>>>>>>>>>>>> module if it lives in a separate repository. Furthermore, breaking
>>>>>>>>>>>>>> API changes in the core will most likely go unnoticed for some
>>>>>>>>>>>>>> time in other modules which are not developed so actively. In the
>>>>>>>>>>>>>> worst case these things will only be noticed when we try to make
>>>>>>>>>>>>>> a release.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> But I also agree that we are not Google and we don't have the
>>>>>>>>>>>>>> capacities to maintain such a smooth build process that we can
>>>>>>>>>>>>>> keep all the code in a single repository.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I looked a bit into Gradle and as far as I can tell it offers some
>>>>>>>>>>>>>> nice features wrt incrementally building projects. This would be
>>>>>>>>>>>>>> beneficial for local development but it would not solve our build
>>>>>>>>>>>>>> time problems on Travis. Gradle intends to introduce a task result
>>>>>>>>>>>>>> cache which allows reusing results across builds. This could help
>>>>>>>>>>>>>> when building on Travis; however, it is not yet fully implemented.
>>>>>>>>>>>>>> Moreover, migrating from Maven to Gradle won't come for free
>>>>>>>>>>>>>> (there's simply no free lunch out there) and we might risk
>>>>>>>>>>>>>> introducing new bugs. Therefore, I would vote to split the
>>>>>>>>>>>>>> repository in order to mitigate our current problems with Travis
>>>>>>>>>>>>>> and the build time in general. Whether to use a different build
>>>>>>>>>>>>>> system or not can then be discussed as an orthogonal question.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>> Till
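The merging bot Till mentions at the start of his message could be sketched roughly as follows, in Python. This is an illustration under stated assumptions, not an existing tool: pull requests are plain dictionaries, `passes_tests` is a hypothetical callback standing in for a CI run, and the idea that each PR is tested on top of the current master before merging is an assumption about how such a bot would linearize merges.

```python
def run_merge_bot(pull_requests, passes_tests, required_approvals=1):
    """Merge pull requests one at a time, in queue order.

    Each PR is tested against the master as it exists at that moment,
    including everything the bot merged before it. Because the bot is the
    only committer, no commit can land between the test run and the merge.
    """
    master = []    # linear history of merged PR ids
    rejected = []  # PRs that lacked approvals or failed on current master
    for pr in pull_requests:
        approved = pr["approvals"] >= required_approvals
        if approved and passes_tests(master, pr):
            master.append(pr["id"])
        else:
            rejected.append(pr["id"])
    return master, rejected


if __name__ == "__main__":
    # Made-up queue: "B" breaks once "A" is merged, "C" has no +1 yet.
    queue = [
        {"id": "A", "approvals": 1},
        {"id": "B", "approvals": 2},
        {"id": "C", "approvals": 0},
    ]

    def fake_ci(master, pr):
        return not (pr["id"] == "B" and "A" in master)

    print(run_merge_bot(queue, fake_ci))
```

The trade-off is throughput: every merge costs one full test run, so long build times limit how many PRs such a bot could merge per day.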
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Tue, Mar 14, 2017 at 8:05 PM, Stephan Ewen <se...@apache.org>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Some other thoughts on how a repository split would help. I am
>>>>>>>>>>>>>>> not sure for all of them, so please comment:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> - There is less competition for a "commit window". It happens a
>>>>>>>>>>>>>>>   lot already that you run all tests and want to commit, but
>>>>>>>>>>>>>>>   there was a commit in the meantime. You rebase, need to
>>>>>>>>>>>>>>>   re-test, again commit in the meantime.
>>>>>>>>>>>>>>>   For a "linear" commit history, this may become a bottleneck
>>>>>>>>>>>>>>>   eventually as well.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> - There is less risk of broken master. If one repository/module
>>>>>>>>>>>>>>>   breaks its master, the others can still continue.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Stephan
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Fri, Mar 10, 2017 at 12:20 PM, Till Rohrmann
>>>>>>>>>>>>>>> <trohrm...@apache.org> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks for all your input. In order to wrap the discussion up
>>>>>>>>>>>>>>>> I'd like to summarize the mentioned points:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The problem of increasing build times and complexity of the
>>>>>>>>>>>>>>>> project has been acknowledged. Ideally we would have everything
>>>>>>>>>>>>>>>> in one repository using an incremental build tool. Since Maven
>>>>>>>>>>>>>>>> does not properly support this we would have to switch our
>>>>>>>>>>>>>>>> build tool to something like Gradle, for example. Another
>>>>>>>>>>>>>>>> option is introducing build profiles for different sets of
>>>>>>>>>>>>>>>> modules as well as separating integration and unit tests. The
>>>>>>>>>>>>>>>> third alternative would be creating sub-projects with their own
>>>>>>>>>>>>>>>> repositories. I actually think that these two proposals are not
>>>>>>>>>>>>>>>> necessarily exclusive and it would also make sense to have a
>>>>>>>>>>>>>>>> separation between unit and integration tests if we split the
>>>>>>>>>>>>>>>> repository.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The overall consensus seems to be that we don't want to split
>>>>>>>>>>>>>>>> the community and want to keep everything under the same
>>>>>>>>>>>>>>>> umbrella. I think this is the right way to go, because
>>>>>>>>>>>>>>>> otherwise some parts of the project could become second class
>>>>>>>>>>>>>>>> citizens. Given that and that we continue using Maven, I still
>>>>>>>>>>>>>>>> think that creating sub-projects for the libraries, for
>>>>>>>>>>>>>>>> example, could be beneficial. A split could reduce the
>>>>>>>>>>>>>>>> project's complexity and make it potentially easier for
>>>>>>>>>>>>>>>> libraries to get actively developed. The main concern is
>>>>>>>>>>>>>>>> setting up the build infrastructure to aggregate docs from
>>>>>>>>>>>>>>>> multiple repositories and making them publicly available.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Since I started this thread and I would really like to see
>>>>>>>>>>>>>>>> Flink's ML library being revived again, I'd volunteer to first
>>>>>>>>>>>>>>>> investigate whether establishing a proper incremental build for
>>>>>>>>>>>>>>>> Flink is doable. If that should not be possible, I will look
>>>>>>>>>>>>>>>> into splitting the repository, first only for the libraries.
>>>>>>>>>>>>>>>> I'll share my results with the community once I'm done with the
>>>>>>>>>>>>>>>> investigation.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>> Till
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Fri, Feb 24, 2017 at 3:50 PM, Robert Metzger
>>>>>>>>>>>>>>>> <rmetz...@apache.org> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> @Jin Mingjian: You cannot use the paid Travis version for open
>>>>>>>>>>>>>>>>> source projects. It only works for private repositories (at
>>>>>>>>>>>>>>>>> least back then when we asked them about that).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> @Stephan: I don't think that incremental builds will be
>>>>>>>>>>>>>>>>> available with Maven anytime soon.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I agree that we need to fix the build time issue on Travis.
>>>>>>>>>>>>>>>>> I've recently pushed a commit to now use three instead of two
>>>>>>>>>>>>>>>>> test groups.
>>>>>>>>>>>>>>>>> But I don't think that this is a feasible long-term solution.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> If this discussion is only about reducing the build and test
>>>>>>>>>>>>>>>>> time, introducing build profiles for different components as
>>>>>>>>>>>>>>>>> Aljoscha suggested would solve the problem Till mentioned.
>>>>>>>>>>>>>>>>> Also, if we decide that Travis is not a good tool anymore for
>>>>>>>>>>>>>>>>> the testing, I guess we can find a different solution. There
>>>>>>>>>>>>>>>>> are now competitors to Travis that might be willing to offer a
>>>>>>>>>>>>>>>>> paid plan for an open source project, or we set up our own
>>>>>>>>>>>>>>>>> infra on a server sponsored by one of the contributing
>>>>>>>>>>>>>>>>> companies.
>>>>>>>>>>>>>>>>> If we want to solve "community issues" with the change as
>>>>>>>>>>>>>>>>> well, then I think it's worth the effort of splitting up Flink
>>>>>>>>>>>>>>>>> into different repositories.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Splitting up repositories is not a trivial task in my opinion.
>>>>>>>>>>>>>>>>> As others have mentioned before, we need to consider the
>>>>>>>>>>>>>>>>> following things:
>>>>>>>>>>>>>>>>> - How are we going to build the documentation? Ideally every
>>>>>>>>>>>>>>>>> repo should contain its docs, so we would need to pull them
>>>>>>>>>>>>>>>>> together when building the main docs.
>>>>>>>>>>>>>>>>> - How do we organize the dependencies? If we have library
>>>>>>>>>>>>>>>>> repositories depend on snapshot Flink versions, we need to
>>>>>>>>>>>>>>>>> make sure that the snapshot deployment always works. This also
>>>>>>>>>>>>>>>>> means that people working on a library repository will pull
>>>>>>>>>>>>>>>>> from snapshot OR need to build first locally.
>>>>>>>>>>>>>>>>> - We need to update the release scripts
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> If we commit to do these changes, we need to assign at least
>>>>>>>>>>>>>>>>> one committer (yes, in this case we need somebody who can
>>>>>>>>>>>>>>>>> commit, for example for updating the buildbot stuff) who
>>>>>>>>>>>>>>>>> volunteers to do the change.
>>>>>>>>>>>>>>>>> I've done a lot of infrastructure work in the past, but I'm
>>>>>>>>>>>>>>>>> currently pretty booked with many other things, so I don't
>>>>>>>>>>>>>>>>> realistically see myself doing that. Max who used to work on
>>>>>>>>>>>>>>>>> these things is taking some time off.
>>>>>>>>>>>>>>>>> I think we need, best case 3 days for the change, worst case 5
>>>>>>>>>>>>>>>>> days. The problem is that there are no "unit tests" for the
>>>>>>>>>>>>>>>>> infra stuff,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>
>
