It looks like JetBrains TeamCity supports something in that direction: https://blog.jetbrains.com/teamcity/2012/03/incremental-building-with-maven-and-teamcity/
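For the script idea quoted below (only build and test the Maven modules a PR touches), here is a minimal sketch of the path-to-module mapping. It is an illustration under assumptions, not an existing Flink script: the function names are hypothetical, the list of changed files and module directories is expected to come from elsewhere (for example from git and the reactor POM), and it relies only on Maven's `-pl` (select projects) and `-amd` (also make dependents) options.

```python
from pathlib import PurePosixPath


def owning_module(path, module_dirs):
    """Return the deepest module directory that contains `path`, or None."""
    parts = PurePosixPath(path).parts
    best, best_len = None, -1
    for module in module_dirs:
        mparts = PurePosixPath(module).parts
        # Compare whole path components so "flink-core-x" never matches "flink-core".
        if parts[:len(mparts)] == mparts and len(mparts) > best_len:
            best, best_len = module, len(mparts)
    return best


def modules_for_changes(changed_files, module_dirs):
    """Map changed files to the sorted list of touched modules.

    Returns None when a file lies outside every known module
    (e.g. the root pom.xml), which is treated as "build everything".
    """
    touched = set()
    for path in changed_files:
        module = owning_module(path, module_dirs)
        if module is None:
            return None
        touched.add(module)
    return sorted(touched)


def maven_command(modules):
    """Build a Maven invocation; fall back to a full build if nothing is selected."""
    if not modules:
        return ["mvn", "verify"]
    # -pl selects the projects, -amd also builds the modules that depend on them.
    return ["mvn", "verify", "-pl", ",".join(modules), "-amd"]
```

Falling back to a full build whenever a change cannot be attributed to a module is a deliberately conservative choice; a change to the root POM can affect every module.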
On Mon, Mar 20, 2017 at 2:40 PM, Timo Walther <twal...@apache.org> wrote:

> Another solution would be to make the Travis builds more efficient. For
> example, we could write a script that determines the modified Maven module
> and only runs the tests for this module (and maybe its transitive
> dependencies). PRs for libraries such as Gelly, Table, CEP or connectors
> would not trigger a compilation of the entire stack anymore. Of course this
> would not solve all problems, but many of them.
>
> What do you think about this?
>
> On 20/03/17 at 14:02, Robert Metzger wrote:
>
>> Aljoscha, do you know how to configure Jenkins?
>> Is Apache INFRA doing that, or are the Beam people doing that themselves?
>>
>> One downside of Jenkins is that we probably need some machines that
>> execute the tests. A Travis container has 2 CPU cores and 4 GB main
>> memory. We currently have 10 such containers available on Travis
>> concurrently. I think we would need at least the same amount on Jenkins.
>>
>> On Mon, Mar 20, 2017 at 1:48 PM, Timo Walther <twal...@apache.org> wrote:
>>
>>> I agree with Aljoscha that we might consider moving from Travis to
>>> Jenkins. Is there any disadvantage in using Jenkins?
>>>
>>> I think we should structure the project according to release management
>>> (e.g. more frequent releases of libraries) or other criteria (e.g. core
>>> and non-core) instead of build time. What would happen if the build of
>>> another submodule became too long? Would we split/restructure again and
>>> again? If Jenkins solves all our problems we should use it.
>>>
>>> Regards,
>>> Timo
>>>
>>> On 20/03/17 at 12:21, Aljoscha Krettek wrote:
>>>
>>>> I prefer Jenkins to Travis by far. Working on Beam, where we have good
>>>> Jenkins integration, has opened my eyes to what is possible with good CI
>>>> integration.
>>>>
>>>> For example, look at this recent Beam PR:
>>>> https://github.com/apache/beam/pull/2263
>>>> The Jenkins-GitHub integration will tell you exactly which tests failed,
>>>> and if you click on the links you can look at the log output/stdout of
>>>> the tests in question.
>>>>
>>>> This is the overview page of one of the Jenkins jobs that we have in Beam:
>>>> https://builds.apache.org/job/beam_PostCommit_Java_RunnableOnService_Flink/
>>>> This is an example of a stable build:
>>>> https://builds.apache.org/job/beam_PostCommit_Java_RunnableOnService_Flink/lastStableBuild/
>>>> Notice how it gives you fine-grained information about the Maven run.
>>>> This is an unstable run:
>>>> https://builds.apache.org/job/beam_PostCommit_Java_RunnableOnService_Flink/lastUnstableBuild/
>>>> There you can see which tests failed and you can easily drill down.
>>>>
>>>> Best,
>>>> Aljoscha
>>>>
>>>> On 20 Mar 2017, at 11:46, Robert Metzger <rmetz...@apache.org> wrote:
>>>>
>>>>> Thank you for looking into the build times.
>>>>>
>>>>> I didn't know that the build time situation is so bad. Even with yarn,
>>>>> mesos, connectors and libraries removed, we are still running into the
>>>>> build timeout :(
>>>>>
>>>>> Aljoscha told me that the Beam community is using Jenkins for running
>>>>> the tests, and they are planning to completely move away from Travis. I
>>>>> wonder whether we should do the same, as having our own Jenkins servers
>>>>> would allow us to run tests for more than 50 minutes.
>>>>>
>>>>> I agree with Stephan that we should keep the yarn and mesos tests in
>>>>> the core for stability / testing quality purposes.
>>>>>
>>>>> On Mon, Mar 20, 2017 at 11:27 AM, Stephan Ewen <se...@apache.org> wrote:
>>>>>
>>>>> @Greg
>>>>>
>>>>> I am personally in favor of splitting "connectors" and "contrib" out as
>>>>> well. I know that @rmetzger has some reservations about the connectors,
>>>>> but we may be able to convince him.
>>>>>
>>>>> For the cluster tests (yarn / mesos): in the past there were many cases
>>>>> where these tests caught problems that other tests did not, because they
>>>>> are the only tests that actually use the "flink-dist.jar" and thus
>>>>> discover many dependency and configuration issues. For that reason, my
>>>>> feeling would be that they are valuable in the core repository.
>>>>>
>>>>> I would actually suggest doing only the library split initially, to see
>>>>> what the challenges are in setting up the multi-repo build and release
>>>>> tooling. Once we have gathered experience there, we can probably easily
>>>>> see what else we can split out.
>>>>>
>>>>> Stephan
>>>>>
>>>>> On Fri, Mar 17, 2017 at 8:37 PM, Greg Hogan <c...@greghogan.com> wrote:
>>>>>
>>>>>> I’d like to use this refactoring opportunity to unsplit the Travis
>>>>>> tests.
>>>>>>
>>>>>> With 51 builds queued up for the weekend (some of which may fail or
>>>>>> have been force pushed) we are at the limit of the number of
>>>>>> contributions we can process. Fixing this requires 1) splitting the
>>>>>> project, 2) investigating speedups for long-running tests, and 3)
>>>>>> staying cognizant of test performance when accepting new code.
>>>>>>
>>>>>> I’d like to add one to Stephan’s list of module groups. I like that the
>>>>>> modules are generic (“libraries”) so that no one module is alone and
>>>>>> independent.
>>>>>>
>>>>>> Flink has three “libraries”: cep, ml, and gelly.
>>>>>>
>>>>>> “connectors” is a hotspot due to the long-running Kafka tests (and
>>>>>> connectors for three Kafka versions).
>>>>>>
>>>>>> Both flink-storm and flink-python have a modest number of tests and
>>>>>> could live with the miscellaneous modules in “contrib”.
>>>>>>
>>>>>> The YARN tests are long-running and problematic (I am unable to
>>>>>> successfully run these locally). A “cluster” module could host
>>>>>> flink-mesos, flink-yarn, and flink-yarn-tests.
>>>>>>
>>>>>> That gets us close to running all tests in a single Travis build.
>>>>>> https://travis-ci.org/greghogan/flink/builds/212122590
>>>>>>
>>>>>> I also tested (https://github.com/greghogan/flink/commits/core_build)
>>>>>> with a Maven parallelism of 2 and 4, with the latter giving a 6.4% drop
>>>>>> in build time.
>>>>>> https://travis-ci.org/greghogan/flink/builds/212137659
>>>>>> https://travis-ci.org/greghogan/flink/builds/212154470
>>>>>>
>>>>>> We can run Travis CI builds nightly to guard against breaking changes.
>>>>>>
>>>>>> I also wanted to get an idea of how disruptive it would be to
>>>>>> developers to divide the project into multiple git repos. I wrote a
>>>>>> simple Python script and configured it with the module partitions
>>>>>> listed above. The usage string from the top of the file lists commits
>>>>>> with files from multiple partitions as well as the modified files.
>>>>>> https://gist.github.com/greghogan/f38a8efe6b6dd5a162a6b43335ac4897
>>>>>>
>>>>>> Accounting for the merging of the batch and streaming connector
>>>>>> modules, and assuming that the project structure has not changed much
>>>>>> over the past 15 months, for the following date ranges the listed
>>>>>> number of commits would have been split across repositories.
>>>>>>
>>>>>> since "2017-01-01"
>>>>>> 36 of 571 commits were mixed
>>>>>>
>>>>>> since "2016-07-01"
>>>>>> 155 of 1607 commits were mixed
>>>>>>
>>>>>> since "2016-01-01"
>>>>>> 272 of 2561 commits were mixed
>>>>>>
>>>>>> Greg
>>>>>>
>>>>>> On Mar 15, 2017, at 1:13 PM, Stephan Ewen <se...@apache.org> wrote:
>>>>>>
>>>>>>> @Robert - I think once we know that a separate git repo works well,
>>>>>>> and that it actually solves problems, I see no reason not to create a
>>>>>>> connectors repository later. The infrastructure changes should be
>>>>>>> identical for two or more repositories.
>>>>>>>
>>>>>>> On Wed, Mar 15, 2017 at 5:22 PM, Till Rohrmann <trohrm...@apache.org>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> I think it should not be "at least" the flink-dist module but exactly
>>>>>>>> the remaining flink-dist module. Otherwise we do redundant work.
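The "mixed commit" count in Greg's message above can be illustrated with a small sketch. This is not the linked gist; it is a hypothetical reconstruction of the idea, assuming each commit is given as a list of modified file paths and each partition as a list of top-level directory prefixes (the partition names and prefixes used in the usage note are illustrative).

```python
def partition_of(path, partitions, default="core"):
    """Return the name of the partition whose directory prefix contains `path`."""
    for name, prefixes in partitions.items():
        for prefix in prefixes:
            # Match whole directory names only, not arbitrary string prefixes.
            if path == prefix or path.startswith(prefix + "/"):
                return name
    return default


def count_mixed(commits, partitions):
    """Return (mixed, total): how many commits touch more than one partition."""
    mixed = 0
    for files in commits:
        touched = {partition_of(f, partitions) for f in files}
        if len(touched) > 1:
            mixed += 1
    return mixed, len(commits)
```

With `partitions = {"libraries": ["flink-libraries"]}`, a commit that changes one file under `flink-libraries/` and one under `flink-runtime/` counts as mixed, because the second file falls into the default "core" partition.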
>>>>>>>>
>>>>>>>> On Wed, Mar 15, 2017 at 5:03 PM, Robert Metzger <rmetz...@apache.org>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> "flink-core" means the main repository, not the "flink-core" module.
>>>>>>>>>
>>>>>>>>> When doing a release, we need to build the flink main code first,
>>>>>>>>> because the flink-libraries depend on that.
>>>>>>>>> Once the "flink-libraries" are built, we need to run the main build
>>>>>>>>> again (at least the flink-dist module), so that it pulls the
>>>>>>>>> artifacts from the flink-libraries to put them into the opt/ folder
>>>>>>>>> of the final artifact.
>>>>>>>>>
>>>>>>>>> On Wed, Mar 15, 2017 at 4:44 PM, Till Rohrmann <trohrm...@apache.org>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> I'm ok with point 3.
>>>>>>>>>>
>>>>>>>>>> Concerning point 8: Why do we have to build flink-core twice after
>>>>>>>>>> having it built as a dependency for flink-libraries? This seems
>>>>>>>>>> wrong to me.
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>> Till
>>>>>>>>>>
>>>>>>>>>> On Wed, Mar 15, 2017 at 4:23 PM, Robert Metzger
>>>>>>>>>> <rmetz...@apache.org> wrote:
>>>>>>>>>>
>>>>>>>>>>> Thank you. Running on AWS is a good idea!
>>>>>>>>>>>
>>>>>>>>>>> Let me know if you (or anybody else) want to help me with the
>>>>>>>>>>> infrastructure work! Any help is much appreciated (as I've said
>>>>>>>>>>> before, I don't really have time for doing this, but it has to be
>>>>>>>>>>> done :) )
>>>>>>>>>>>
>>>>>>>>>>> I'm against creating two new repositories. I fear that this
>>>>>>>>>>> introduces too much complexity and too many repositories.
>>>>>>>>>>> "flink" and "flink-libraries" are hopefully enough to get the
>>>>>>>>>>> build time significantly down.
>>>>>>>>>>> We can also consider putting the connectors into the
>>>>>>>>>>> "flink-libraries" repo if we need to further reduce the build time.
>>>>>>>>>>>
>>>>>>>>>>> We should probably move "flink-table" out of "flink-libraries" if
>>>>>>>>>>> we want to keep "flink-table" in the main repo. (This would
>>>>>>>>>>> eliminate the "flink-libraries" module from main.)
>>>>>>>>>>>
>>>>>>>>>>> Also, I agree that "flink-statebackend-rocksdb" is not correctly
>>>>>>>>>>> placed in contrib anymore.
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Mar 15, 2017 at 4:07 PM, Greg Hogan <c...@greghogan.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Robert, appreciate your kickstarting this task.
>>>>>>>>>>>>
>>>>>>>>>>>> We should compare the verification time with and without the
>>>>>>>>>>>> listed modules. I’ll try to run this by tomorrow on AWS and on
>>>>>>>>>>>> Travis.
>>>>>>>>>>>>
>>>>>>>>>>>> Should we maintain separate repos for flink-contrib and
>>>>>>>>>>>> flink-libraries? Are you intending that we move flink-table out
>>>>>>>>>>>> of flink-libraries (and perhaps flink-statebackend-rocksdb out of
>>>>>>>>>>>> flink-contrib)?
>>>>>>>>>>>>
>>>>>>>>>>>> Greg
>>>>>>>>>>>>
>>>>>>>>>>>> On Mar 15, 2017, at 9:55 AM, Robert Metzger <rmetz...@apache.org>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Thank you for looking into this, Till.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I think we then have to split the repositories.
>>>>>>>>>>>>> My main motivation for doing this is that it seems to be the
>>>>>>>>>>>>> only feasible way of scaling the community to allow more
>>>>>>>>>>>>> committers working on the libraries.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'll take care of getting things started.
>>>>>>>>>>>>>
>>>>>>>>>>>>> As the next steps I propose to:
>>>>>>>>>>>>> 1. Ask INFRA to rename
>>>>>>>>>>>>> https://git-wip-us.apache.org/repos/asf?p=flink-connectors.git;a=summary
>>>>>>>>>>>>> to "flink-libraries"
>>>>>>>>>>>>> 2. Ask INFRA to set up GitHub and Travis integration for
>>>>>>>>>>>>> "flink-libraries"
>>>>>>>>>>>>> 3. Put the code of "flink-ml", "flink-gelly", "flink-python",
>>>>>>>>>>>>> "flink-cep", "flink-scala-shell", "flink-storm" into the new
>>>>>>>>>>>>> repository. (I decided against moving flink-contrib there,
>>>>>>>>>>>>> because rocksdb is in the contrib module. For flink-table I'm
>>>>>>>>>>>>> undecided, but I kept it in the main repo because it's probably
>>>>>>>>>>>>> going to interact more with the core code in the future.)
>>>>>>>>>>>>> I'll try to preserve the history of those modules when
>>>>>>>>>>>>> splitting them into the new repo
>>>>>>>>>>>>> 4. I'll close all pull requests against those modules in the
>>>>>>>>>>>>> main repo.
>>>>>>>>>>>>> 5. I'll set up a minimal documentation page for the library
>>>>>>>>>>>>> repository, similar to the main documentation.
>>>>>>>>>>>>> 6. I'll update the documentation build process to build both
>>>>>>>>>>>>> documentations & link them to each other
>>>>>>>>>>>>> 7. I'll update the nightly deployment process to include both
>>>>>>>>>>>>> repositories
>>>>>>>>>>>>> 8. I'll update the release script to create the Flink release
>>>>>>>>>>>>> out of both repositories. In order to put the libraries into the
>>>>>>>>>>>>> opt/ dir of the release, I'll need to change the build of
>>>>>>>>>>>>> "flink-dist" so that it first builds flink core, then the
>>>>>>>>>>>>> libraries and then the core again with the libraries as an
>>>>>>>>>>>>> additional dependency.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The main question for the community is: do you agree with point
>>>>>>>>>>>>> 3? Would you like to include more or less?
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'll start with 1. and 2. tomorrow morning.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Mar 15, 2017 at 1:48 PM, Till Rohrmann
>>>>>>>>>>>>> <trohrm...@apache.org> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> In theory we could have a merging bot which solves the problem
>>>>>>>>>>>>>> of the "commit window". Once the PR passes all tests and has
>>>>>>>>>>>>>> enough +1s, the bot could do the merging and, thus, it
>>>>>>>>>>>>>> effectively linearizes the merge process.
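The merging bot described in Till's message above can be pictured as a simple queue. The sketch below is a toy model under assumptions, not a real bot: pull requests are plain dictionaries with hypothetical keys (`number`, `approvals`, `tests_pass`), and `tests_pass` stands in for "CI passed when the PR was rebased onto the current master".

```python
from collections import deque


def run_merge_queue(pull_requests, required_approvals=2):
    """Process PRs strictly one at a time, in arrival order.

    Handling one PR at a time is what linearizes the merges: nobody else
    can commit between the test run and the merge of a given PR.
    Returns (merged, rejected) lists of PR numbers.
    """
    queue = deque(pull_requests)
    merged, rejected = [], []
    while queue:
        pr = queue.popleft()
        if pr["approvals"] >= required_approvals and pr["tests_pass"]:
            merged.append(pr["number"])
        else:
            rejected.append(pr["number"])
    return merged, rejected
```

A real bot would rerun the tests for each PR against the state of master left by the previous merge; the boolean flag only marks where that check would go.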
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I think the second point is actually a disadvantage because
>>>>>>>>>>>>>> there is not such an immediate incentive/pressure to fix the
>>>>>>>>>>>>>> broken module if it lives in a separate repository.
>>>>>>>>>>>>>> Furthermore, breaking API changes in the core will most likely
>>>>>>>>>>>>>> go unnoticed for some time in other modules which are not
>>>>>>>>>>>>>> developed so actively. In the worst case these things will only
>>>>>>>>>>>>>> be noticed when we try to make a release.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> But I also agree that we are not Google and we don't have the
>>>>>>>>>>>>>> capacity to maintain such a smooth build process that we can
>>>>>>>>>>>>>> keep all the code in a single repository.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I looked a bit into Gradle and as far as I can tell it offers
>>>>>>>>>>>>>> some nice features wrt incrementally building projects. This
>>>>>>>>>>>>>> would be beneficial for local development but it would not
>>>>>>>>>>>>>> solve our build time problems on Travis. Gradle intends to
>>>>>>>>>>>>>> introduce a task result cache which allows reusing results
>>>>>>>>>>>>>> across builds. This could help when building on Travis;
>>>>>>>>>>>>>> however, it is not yet fully implemented.
>>>>>>>>>>>>>> Moreover, migrating from Maven to Gradle won't come for free
>>>>>>>>>>>>>> (there's simply no free lunch out there) and we might risk
>>>>>>>>>>>>>> introducing new bugs. Therefore, I would vote to split the
>>>>>>>>>>>>>> repository in order to mitigate our current problems with
>>>>>>>>>>>>>> Travis and the build time in general. Whether to use a
>>>>>>>>>>>>>> different build system or not can then be discussed as an
>>>>>>>>>>>>>> orthogonal question.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>> Till
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Tue, Mar 14, 2017 at 8:05 PM, Stephan Ewen <se...@apache.org>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Some other thoughts on how a repository split would help. I am
>>>>>>>>>>>>>>> not sure about all of them, so please comment:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> - There is less competition for a "commit window". It happens
>>>>>>>>>>>>>>> a lot already that you run all tests and want to commit, but
>>>>>>>>>>>>>>> there was a commit in the meantime. You rebase, need to
>>>>>>>>>>>>>>> re-test, and again there is a commit in the meantime.
>>>>>>>>>>>>>>> For a "linear" commit history, this may become a bottleneck
>>>>>>>>>>>>>>> eventually as well.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> - There is less risk of a broken master. If one
>>>>>>>>>>>>>>> repository/module breaks its master, the others can still
>>>>>>>>>>>>>>> continue.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Stephan
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Fri, Mar 10, 2017 at 12:20 PM, Till Rohrmann
>>>>>>>>>>>>>>> <trohrm...@apache.org> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks for all your input. In order to wrap the discussion up
>>>>>>>>>>>>>>>> I'd like to summarize the mentioned points:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The problem of increasing build times and complexity of the
>>>>>>>>>>>>>>>> project has been acknowledged. Ideally we would have
>>>>>>>>>>>>>>>> everything in one repository using an incremental build tool.
>>>>>>>>>>>>>>>> Since Maven does not properly support this we would have to
>>>>>>>>>>>>>>>> switch our build tool to something like Gradle, for example.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Another option is introducing build profiles for different
>>>>>>>>>>>>>>>> sets of modules as well as separating integration and unit
>>>>>>>>>>>>>>>> tests. The third alternative would be creating sub-projects
>>>>>>>>>>>>>>>> with their own repositories. I actually think that these two
>>>>>>>>>>>>>>>> proposals are not necessarily exclusive and it would also
>>>>>>>>>>>>>>>> make sense to have a separation between unit and integration
>>>>>>>>>>>>>>>> tests if we split the repository.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The overall consensus seems to be that we don't want to split
>>>>>>>>>>>>>>>> the community and want to keep everything under the same
>>>>>>>>>>>>>>>> umbrella. I think this is the right way to go, because
>>>>>>>>>>>>>>>> otherwise some parts of the project could become second class
>>>>>>>>>>>>>>>> citizens. Given that and that we continue using Maven, I
>>>>>>>>>>>>>>>> still think that creating sub-projects for the libraries, for
>>>>>>>>>>>>>>>> example, could be beneficial. A split could reduce the
>>>>>>>>>>>>>>>> project's complexity and make it potentially easier for
>>>>>>>>>>>>>>>> libraries to get actively developed. The main concern is
>>>>>>>>>>>>>>>> setting up the build infrastructure to aggregate docs from
>>>>>>>>>>>>>>>> multiple repositories and making them publicly available.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Since I started this thread and I would really like to see
>>>>>>>>>>>>>>>> Flink's ML library being revived again, I'd volunteer
>>>>>>>>>>>>>>>> investigating first whether it is doable establishing a
>>>>>>>>>>>>>>>> proper incremental build for Flink.
>>>>>>>>>>>>>>>> If that should not be possible, I will look into splitting
>>>>>>>>>>>>>>>> the repository, first only for the libraries. I'll share my
>>>>>>>>>>>>>>>> results with the community once I'm done with the
>>>>>>>>>>>>>>>> investigation.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>> Till
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Fri, Feb 24, 2017 at 3:50 PM, Robert Metzger
>>>>>>>>>>>>>>>> <rmetz...@apache.org> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> @Jin Mingjian: You cannot use the paid Travis version for
>>>>>>>>>>>>>>>>> open source projects. It only works for private repositories
>>>>>>>>>>>>>>>>> (at least that was the case back when we asked them about
>>>>>>>>>>>>>>>>> it).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> @Stephan: I don't think that incremental builds will be
>>>>>>>>>>>>>>>>> available with Maven anytime soon.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I agree that we need to fix the build time issue on Travis.
>>>>>>>>>>>>>>>>> I've recently pushed a commit to now use three instead of
>>>>>>>>>>>>>>>>> two test groups.
>>>>>>>>>>>>>>>>> But I don't think that this is a feasible long-term solution.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> If this discussion is only about reducing the build and test
>>>>>>>>>>>>>>>>> time, introducing build profiles for different components as
>>>>>>>>>>>>>>>>> Aljoscha suggested would solve the problem Till mentioned.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Also, if we decide that Travis is not a good tool anymore
>>>>>>>>>>>>>>>>> for the testing, I guess we can find a different solution.
>>>>>>>>>>>>>>>>> There are now competitors to Travis that might be willing to
>>>>>>>>>>>>>>>>> offer a paid plan for an open source project, or we set up
>>>>>>>>>>>>>>>>> our own infra on a server sponsored by one of the
>>>>>>>>>>>>>>>>> contributing companies.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> If we want to solve "community issues" with the change as
>>>>>>>>>>>>>>>>> well, then I think it's worth the effort of splitting up
>>>>>>>>>>>>>>>>> Flink into different repositories.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Splitting up repositories is not a trivial task in my
>>>>>>>>>>>>>>>>> opinion. As others have mentioned before, we need to
>>>>>>>>>>>>>>>>> consider the following things:
>>>>>>>>>>>>>>>>> - How are we going to build the documentation? Ideally every
>>>>>>>>>>>>>>>>> repo should contain its docs, so we would need to pull them
>>>>>>>>>>>>>>>>> together when building the main docs.
>>>>>>>>>>>>>>>>> - How do we organize the dependencies? If we have the
>>>>>>>>>>>>>>>>> library repository depend on snapshot Flink versions, we
>>>>>>>>>>>>>>>>> need to make sure that the snapshot deployment always works.
>>>>>>>>>>>>>>>>> This also means that people working on a library repository
>>>>>>>>>>>>>>>>> will pull from snapshot OR need to build first locally.
>>>>>>>>>>>>>>>>> - We need to update the release scripts
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> If we commit to doing these changes, we need to assign at
>>>>>>>>>>>>>>>>> least one committer (yes, in this case we need somebody who
>>>>>>>>>>>>>>>>> can commit, for example for updating the buildbot stuff) who
>>>>>>>>>>>>>>>>> volunteers to do the change.
>>>>>>>>>>>>>>>>> I've done a lot of infrastructure work in the past, but I'm
>>>>>>>>>>>>>>>>> currently pretty booked with many other things, so I don't
>>>>>>>>>>>>>>>>> realistically see myself doing that. Max, who used to work
>>>>>>>>>>>>>>>>> on these things, is taking some time off.
>>>>>>>>>>>>>>>>> I think we need, best case, 3 days for the change; worst
>>>>>>>>>>>>>>>>> case, 5 days. The problem is that there are no "unit tests"
>>>>>>>>>>>>>>>>> for the infra stuff,