The Beam Jenkins jobs are configured inside the Beam source repository itself. For example:
https://github.com/apache/beam/blob/master/.jenkins/job_beam_PostCommit_Java_RunnableOnService_Flink.groovy
For the initial setup of the seed job you need admin rights on Jenkins, as
described here:
https://cwiki.apache.org/confluence/display/INFRA/Jenkins

The somewhat annoying part is setting up and maintaining our own “flink” build
slaves. There are some general-purpose build slaves, but high-throughput
projects usually have their own build slaves to ensure speedy processing of
Jenkins jobs:
https://cwiki.apache.org/confluence/display/INFRA/Jenkins+node+labels

> On 20 Mar 2017, at 14:40, Timo Walther <twal...@apache.org> wrote:
>
> Another solution would be to make the Travis builds more efficient. For
> example, we could write a script that determines the modified Maven module
> and only runs the tests for this module (and maybe its transitive
> dependencies). PRs for libraries such as Gelly, Table, CEP or connectors
> would no longer trigger a compilation of the entire stack. Of course this
> would not solve all problems, but many of them.
>
> What do you think about this?
>
> On 20/03/17 at 14:02, Robert Metzger wrote:
>> Aljoscha, do you know how to configure Jenkins?
>> Is Apache INFRA doing that, or are the Beam people doing that themselves?
>>
>> One downside of Jenkins is that we probably need some machines that execute
>> the tests. A Travis container has 2 CPU cores and 4 GB main memory. We
>> currently have 10 such containers available on Travis concurrently. I think
>> we would need at least the same amount on Jenkins.
>>
>> On Mon, Mar 20, 2017 at 1:48 PM, Timo Walther <twal...@apache.org> wrote:
>>
>>> I agree with Aljoscha that we might consider moving from Travis to
>>> Jenkins. Is there any disadvantage in using Jenkins?
>>>
>>> I think we should structure the project according to release management
>>> (e.g. more frequent releases of libraries) or other criteria (e.g. core and
>>> non-core) instead of build time. What would happen if the build of another
>>> submodule became too long? Would we split/restructure again and again?
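One way to avoid restructuring again and again is the idea Timo raises further up: work out which Maven modules a change touches and test only those. A minimal sketch, assuming the changed paths come from something like `git diff --name-only` and that the module directories are known; the function names, the module layout and the file name in the example are hypothetical and not taken from Flink's build scripts.

```python
from pathlib import PurePosixPath


def modules_for_changes(changed_files, module_dirs):
    """Map changed file paths to the Maven module directories that contain them.

    Returns a sorted list of modules, or None if some file lies outside every
    known module (for example the root pom.xml), where a full build is safer.
    """
    # Check deeper module directories first so that nested modules win.
    ordered = sorted(module_dirs,
                     key=lambda m: len(PurePosixPath(m).parts),
                     reverse=True)
    touched = set()
    for path in changed_files:
        parts = PurePosixPath(path).parts
        for module in ordered:
            module_parts = PurePosixPath(module).parts
            if parts[:len(module_parts)] == module_parts:
                touched.add(module)
                break
        else:
            return None  # change outside the known modules: build everything
    return sorted(touched)


def maven_command(modules):
    """Build a Maven invocation restricted to the given modules.

    -pl selects the modules; -amd also builds the modules that depend on them.
    """
    if not modules:
        return ["mvn", "verify"]
    return ["mvn", "verify", "-pl", ",".join(modules), "-amd"]


if __name__ == "__main__":
    # Hypothetical module layout and change set, for illustration only.
    modules = ["flink-libraries/flink-gelly", "flink-libraries/flink-cep",
               "flink-runtime"]
    changed = ["flink-libraries/flink-gelly/src/main/java/Foo.java"]
    print(" ".join(maven_command(modules_for_changes(changed, modules))))
```

With `-amd` alone, the upstream modules have to be resolvable from a local or snapshot repository; adding `-am` would build them as well, at the cost of a longer run.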
>>> If Jenkins solves all our problems we should use it.
>>>
>>> Regards,
>>> Timo
>>>
>>> On 20/03/17 at 12:21, Aljoscha Krettek wrote:
>>>
>>>> I prefer Jenkins to Travis by far. Working on Beam, where we have good
>>>> Jenkins integration, has opened my eyes to what is possible with good CI
>>>> integration.
>>>>
>>>> For example, look at this recent Beam PR:
>>>> https://github.com/apache/beam/pull/2263
>>>> The Jenkins-GitHub integration will tell you exactly which tests failed,
>>>> and if you click on the links you can look at the log output/stdout of
>>>> the tests in question.
>>>>
>>>> This is the overview page of one of the Jenkins jobs that we have in
>>>> Beam:
>>>> https://builds.apache.org/job/beam_PostCommit_Java_RunnableOnService_Flink/
>>>>
>>>> This is an example of a stable build:
>>>> https://builds.apache.org/job/beam_PostCommit_Java_RunnableOnService_Flink/lastStableBuild/
>>>> Notice how it gives you fine-grained information about the Maven run.
>>>>
>>>> This is an unstable run:
>>>> https://builds.apache.org/job/beam_PostCommit_Java_RunnableOnService_Flink/lastUnstableBuild/
>>>> There you can see which tests failed and you can easily drill down.
>>>>
>>>> Best,
>>>> Aljoscha
>>>>
>>>> On 20 Mar 2017, at 11:46, Robert Metzger <rmetz...@apache.org> wrote:
>>>>> Thank you for looking into the build times.
>>>>>
>>>>> I didn't know that the build time situation is so bad.
>>>>> Even with yarn, mesos, connectors and libraries removed, we are still
>>>>> running into the build timeout :(
>>>>>
>>>>> Aljoscha told me that the Beam community is using Jenkins for running
>>>>> the tests, and they are planning to completely move away from Travis. I
>>>>> wonder whether we should do the same, as having our own Jenkins servers
>>>>> would allow us to run tests for more than 50 minutes.
>>>>>
>>>>> I agree with Stephan that we should keep the yarn and mesos tests in the
>>>>> core for stability / testing quality purposes.
>>>>>
>>>>> On Mon, Mar 20, 2017 at 11:27 AM, Stephan Ewen <se...@apache.org> wrote:
>>>>> @Greg
>>>>>
>>>>> I am personally in favor of splitting "connectors" and "contrib" out as
>>>>> well. I know that @rmetzger has some reservations about the connectors,
>>>>> but we may be able to convince him.
>>>>>
>>>>> For the cluster tests (yarn / mesos) - in the past there were many cases
>>>>> where these tests caught cases that other tests did not, because they are
>>>>> the only tests that actually use the "flink-dist.jar" and thus discover
>>>>> many dependency and configuration issues. For that reason, my feeling
>>>>> would be that they are valuable in the core repository.
>>>>>
>>>>> I would actually suggest doing only the library split initially, to see
>>>>> what the challenges are in setting up the multi-repo build and release
>>>>> tooling. Once we have gathered experience there, we can probably easily
>>>>> see what else we can split out.
>>>>>
>>>>> Stephan
>>>>>
>>>>> On Fri, Mar 17, 2017 at 8:37 PM, Greg Hogan <c...@greghogan.com> wrote:
>>>>>
>>>>>> I’d like to use this refactoring opportunity to unsplit the Travis tests.
>>>>>> With 51 builds queued up for the weekend (some of which may fail or have
>>>>>> been force pushed) we are at the limit of the number of contributions we
>>>>>> can process.
>>>>>> Fixing this requires 1) splitting the project, 2) investigating
>>>>>> speedups for long-running tests, and 3) staying cognizant of test
>>>>>> performance when accepting new code.
>>>>>>
>>>>>> I’d like to add one to Stephan’s list of module groups. I like that the
>>>>>> modules are generic (“libraries”) so that no one module is alone and
>>>>>> independent.
>>>>>>
>>>>>> Flink has three “libraries”: cep, ml, and gelly.
>>>>>>
>>>>>> “connectors” is a hotspot due to the long-running Kafka tests (and
>>>>>> connectors for three Kafka versions).
>>>>>>
>>>>>> Both flink-storm and flink-python have a modest number of tests and
>>>>>> could live with the miscellaneous modules in “contrib”.
>>>>>>
>>>>>> The YARN tests are long-running and problematic (I am unable to
>>>>>> successfully run these locally). A “cluster” module could host
>>>>>> flink-mesos, flink-yarn, and flink-yarn-tests.
>>>>>>
>>>>>> That gets us close to running all tests in a single Travis build.
>>>>>> https://travis-ci.org/greghogan/flink/builds/212122590
>>>>>>
>>>>>> I also tested (https://github.com/greghogan/flink/commits/core_build)
>>>>>> with a Maven parallelism of 2 and 4, with the latter a 6.4% drop in
>>>>>> build time.
>>>>>> https://travis-ci.org/greghogan/flink/builds/212137659
>>>>>> https://travis-ci.org/greghogan/flink/builds/212154470
>>>>>>
>>>>>> We can run Travis CI builds nightly to guard against breaking changes.
>>>>>>
>>>>>> I also wanted to get an idea of how disruptive it would be to developers
>>>>>> to divide the project into multiple git repos. I wrote a simple Python
>>>>>> script and configured it with the module partitions listed above. The
>>>>>> usage string from the top of the file lists commits with files from
>>>>>> multiple partitions as well as the modified files.
>>>>>> https://gist.github.com/greghogan/f38a8efe6b6dd5a162a6b43335ac4897
>>>>>>
>>>>>> Accounting for the merging of the batch and streaming connector modules,
>>>>>> and assuming that the project structure has not changed much over the
>>>>>> past 15 months, for the following date ranges the listed number of
>>>>>> commits would have been split across repositories.
>>>>>>
>>>>>> since "2017-01-01"
>>>>>> 36 of 571 commits were mixed
>>>>>>
>>>>>> since "2016-07-01"
>>>>>> 155 of 1607 commits were mixed
>>>>>>
>>>>>> since "2016-01-01"
>>>>>> 272 of 2561 commits were mixed
>>>>>>
>>>>>> Greg
>>>>>>
>>>>>> On Mar 15, 2017, at 1:13 PM, Stephan Ewen <se...@apache.org> wrote:
>>>>>>>
>>>>>>> @Robert - I think once we know that a separate git repo works well, and
>>>>>>> that it actually solves problems, I see no reason to not create a
>>>>>>> connectors repository later. The infrastructure changes should be
>>>>>>> identical for two or more repositories.
>>>>>>>
>>>>>>> On Wed, Mar 15, 2017 at 5:22 PM, Till Rohrmann <trohrm...@apache.org>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> I think it should not be at least the flink-dist but exactly the
>>>>>>>> remaining flink-dist module. Otherwise we do redundant work.
>>>>>>>>
>>>>>>>> On Wed, Mar 15, 2017 at 5:03 PM, Robert Metzger <rmetz...@apache.org>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> "flink-core" means the main repository, not the "flink-core" module.
>>>>>>>>> When doing a release, we need to build the flink main code first,
>>>>>>>>> because the flink-libraries depend on that.
>>>>>>>>> Once the "flink-libraries" are built, we need to run the main build
>>>>>>>>> again (at least the flink-dist module), so that it is pulling the
>>>>>>>>> artifacts from the flink-libraries to put them into the opt/ folder
>>>>>>>>> of the final artifact.
>>>>>>>>>
>>>>>>>>> On Wed, Mar 15, 2017 at 4:44 PM, Till Rohrmann <trohrm...@apache.org>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> I'm ok with point 3.
>>>>>>>>>> Concerning point 8: Why do we have to build flink-core twice after
>>>>>>>>>> having it built as a dependency for flink-libraries? This seems
>>>>>>>>>> wrong to me.
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>> Till
>>>>>>>>>>
>>>>>>>>>> On Wed, Mar 15, 2017 at 4:23 PM, Robert Metzger <rmetz...@apache.org>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Thank you. Running on AWS is a good idea!
>>>>>>>>>>> Let me know if you (or anybody else) wants to help me with the
>>>>>>>>>>> infrastructure work! Any help is much appreciated (as I've said
>>>>>>>>>>> before, I don't really have time for doing this, but it has to be
>>>>>>>>>>> done :) )
>>>>>>>>>>>
>>>>>>>>>>> I'm against creating two new repositories. I fear that this
>>>>>>>>>>> introduces too much complexity and too many repositories.
>>>>>>>>>>> "flink" and "flink-libraries" are hopefully enough to get the build
>>>>>>>>>>> time significantly down.
>>>>>>>>>>> We can also consider putting the connectors into the
>>>>>>>>>>> "flink-libraries" repo if we need to further reduce the build time.
>>>>>>>>>>>
>>>>>>>>>>> We should probably move "flink-table" out of "flink-libraries" if
>>>>>>>>>>> we want to keep "flink-table" in the main repo. (This would
>>>>>>>>>>> eliminate the "flink-libraries" module from main.)
>>>>>>>>>>>
>>>>>>>>>>> Also, I agree that "flink-statebackend-rocksdb" is not correctly
>>>>>>>>>>> placed in contrib anymore.
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Mar 15, 2017 at 4:07 PM, Greg Hogan <c...@greghogan.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Robert, appreciate your kickstarting this task.
>>>>>>>>>>>>
>>>>>>>>>>>> We should compare the verification time with and without the
>>>>>>>>>>>> listed modules.
>>>>>>>>>>>> I’ll try to run this by tomorrow on AWS and on Travis.
>>>>>>>>>>>>
>>>>>>>>>>>> Should we maintain separate repos for flink-contrib and
>>>>>>>>>>>> flink-libraries? Are you intending that we move flink-table out
>>>>>>>>>>>> of flink-libraries (and perhaps flink-statebackend-rocksdb out of
>>>>>>>>>>>> flink-contrib)?
>>>>>>>>>>>>
>>>>>>>>>>>> Greg
>>>>>>>>>>>>
>>>>>>>>>>>> On Mar 15, 2017, at 9:55 AM, Robert Metzger <rmetz...@apache.org>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thank you for looking into this Till.
>>>>>>>>>>>>> I think we then have to split the repositories.
>>>>>>>>>>>>> My main motivation for doing this is that it seems to be the only
>>>>>>>>>>>>> feasible way of scaling the community to allow more committers
>>>>>>>>>>>>> working on the libraries.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'll take care of getting things started.
>>>>>>>>>>>>>
>>>>>>>>>>>>> As the next steps I propose to:
>>>>>>>>>>>>> 1. Ask INFRA to rename
>>>>>>>>>>>>> https://git-wip-us.apache.org/repos/asf?p=flink-connectors.git;a=summary
>>>>>>>>>>>>> to "flink-libraries"
>>>>>>>>>>>>> 2. Ask INFRA to set up GitHub and Travis integration for
>>>>>>>>>>>>> "flink-libraries"
>>>>>>>>>>>>> 3. Put the code of "flink-ml", "flink-gelly", "flink-python",
>>>>>>>>>>>>> "flink-cep", "flink-scala-shell", "flink-storm" into the new
>>>>>>>>>>>>> repository.
>>>>>>>>>>>>> (I decided against moving flink-contrib there, because rocksdb is
>>>>>>>>>>>>> in the contrib module. For flink-table, I'm undecided, but I kept
>>>>>>>>>>>>> it in the main repo because it's probably going to interact more
>>>>>>>>>>>>> with the core code in the future.)
>>>>>>>>>>>>> I'll try to preserve the history of those modules when splitting
>>>>>>>>>>>>> them into the new repo.
>>>>>>>>>>>>> 4. I'll close all pull requests against those modules in the main
>>>>>>>>>>>>> repo.
>>>>>>>>>>>>> 5. I'll set up a minimal documentation page for the library
>>>>>>>>>>>>> repository, similar to the main documentation.
>>>>>>>>>>>>> 6. I'll update the documentation build process to build both
>>>>>>>>>>>>> documentations & link them to each other
>>>>>>>>>>>>> 7. I'll update the nightly deployment process to include both
>>>>>>>>>>>>> repositories
>>>>>>>>>>>>> 8. I'll update the release script to create the Flink release out
>>>>>>>>>>>>> of both repositories. In order to put the libraries into the opt/
>>>>>>>>>>>>> dir of the release, I'll need to change the build of "flink-dist"
>>>>>>>>>>>>> so that it first builds flink core, then the libraries and then
>>>>>>>>>>>>> the core again with the libraries as an additional dependency.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The main question for the community is: do you agree with point
>>>>>>>>>>>>> 3? Would you like to include more or less?
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'll start with 1. and 2. tomorrow morning.
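The build order from point 8 above, with the second core pass narrowed to flink-dist as Till suggests in his reply further up, could be sketched roughly as follows. This is only an illustration under assumptions: the checkout directory names, the use of `-DskipTests`, and the idea that flink-dist's pom already declares the library artifacts are not taken from the actual release script.

```python
import subprocess


def release_build_plan(main_repo, libraries_repo):
    """Return (working directory, command) pairs in the order of point 8."""
    install = ["mvn", "clean", "install", "-DskipTests"]
    return [
        # 1. Build and install the main repository so the libraries can
        #    resolve the core artifacts.
        (main_repo, install),
        # 2. Build and install the libraries against the installed core.
        (libraries_repo, install),
        # 3. Rebuild only flink-dist so that it picks up the library
        #    artifacts for the opt/ folder of the final artifact.
        (main_repo, install + ["-pl", "flink-dist"]),
    ]


def run_plan(plan):
    """Execute each step in order and stop at the first failing build."""
    for workdir, command in plan:
        subprocess.run(command, cwd=workdir, check=True)


if __name__ == "__main__":
    # Dry run: print the steps instead of invoking Maven.
    # "flink" and "flink-libraries" are assumed local checkout directories.
    for workdir, command in release_build_plan("flink", "flink-libraries"):
        print(workdir + ": " + " ".join(command))
```

Keeping the plan as plain data separates the ordering, which is the part under discussion, from the execution, so the sequence can be reviewed or printed without running a build.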
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Mar 15, 2017 at 1:48 PM, Till Rohrmann
>>>>>>>>>>>>> <trohrm...@apache.org> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> In theory we could have a merging bot which solves the problem
>>>>>>>>>>>>>> of the "commit window". Once the PR passes all tests and has
>>>>>>>>>>>>>> enough +1s, the bot could do the merging and, thus, it
>>>>>>>>>>>>>> effectively linearizes the merge process.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I think the second point is actually a disadvantage because
>>>>>>>>>>>>>> there is not such an immediate incentive/pressure to fix the
>>>>>>>>>>>>>> broken module if it lives in a separate repository. Furthermore,
>>>>>>>>>>>>>> breaking API changes in the core will most likely go unnoticed
>>>>>>>>>>>>>> for some time in other modules which are not developed so
>>>>>>>>>>>>>> actively. In the worst case these things will only be noticed
>>>>>>>>>>>>>> when we try to make a release.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> But I also agree that we are not Google and we don't have the
>>>>>>>>>>>>>> capacities to maintain such a smooth build process that we can
>>>>>>>>>>>>>> keep all the code in a single repository.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I looked a bit into Gradle and as far as I can tell it offers
>>>>>>>>>>>>>> some nice features wrt incrementally building projects. This
>>>>>>>>>>>>>> would be beneficial for local development but it would not solve
>>>>>>>>>>>>>> our build time problems on Travis. Gradle intends to introduce a
>>>>>>>>>>>>>> task result cache which allows reusing results across builds.
>>>>>>>>>>>>>> This could help when building on Travis; however, it is not yet
>>>>>>>>>>>>>> fully implemented. Moreover, migrating from Maven to Gradle
>>>>>>>>>>>>>> won't come for free (there's simply no free lunch out there) and
>>>>>>>>>>>>>> we might risk introducing new bugs. Therefore, I would vote to
>>>>>>>>>>>>>> split the repository in order to mitigate our current problems
>>>>>>>>>>>>>> with Travis and the build time in general. Whether to use a
>>>>>>>>>>>>>> different build system or not can then be discussed as an
>>>>>>>>>>>>>> orthogonal question.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>> Till
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Tue, Mar 14, 2017 at 8:05 PM, Stephan Ewen <se...@apache.org>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Some other thoughts on how a repository split would help. I am
>>>>>>>>>>>>>>> not sure for all of them, so please comment:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> - There is less competition for a "commit window". It happens
>>>>>>>>>>>>>>> a lot already that you run all tests and want to commit, but
>>>>>>>>>>>>>>> there was a commit in the meantime. You rebase, need to
>>>>>>>>>>>>>>> re-test, again commit in the meantime. For a "linear" commit
>>>>>>>>>>>>>>> history, this may become a bottleneck eventually as well.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> - There is less risk of broken master. If one
>>>>>>>>>>>>>>> repository/module breaks its master, the others can still
>>>>>>>>>>>>>>> continue.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Stephan
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Fri, Mar 10, 2017 at 12:20 PM, Till Rohrmann
>>>>>>>>>>>>>>> <trohrm...@apache.org> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks for all your input. In order to wrap the discussion up
>>>>>>>>>>>>>>>> I'd like to summarize the mentioned points:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The problem of increasing build times and complexity of the
>>>>>>>>>>>>>>>> project has been acknowledged. Ideally we would have
>>>>>>>>>>>>>>>> everything in one repository using an incremental build tool.
>>>>>>>>>>>>>>>> Since Maven does not properly support this we would have to
>>>>>>>>>>>>>>>> switch our build tool to something like Gradle, for example.
>>>>>>>>>>>>>>>> Another option is introducing build profiles for different
>>>>>>>>>>>>>>>> sets of modules as well as separating integration and unit
>>>>>>>>>>>>>>>> tests. The third alternative would be creating sub-projects
>>>>>>>>>>>>>>>> with their own repositories. I actually think that these two
>>>>>>>>>>>>>>>> proposals are not necessarily exclusive and it would also make
>>>>>>>>>>>>>>>> sense to have a separation between unit and integration tests
>>>>>>>>>>>>>>>> if we split the repository.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The overall consensus seems to be that we don't want to split
>>>>>>>>>>>>>>>> the community and want to keep everything under the same
>>>>>>>>>>>>>>>> umbrella.
>>>>>>>>>>>>>>>> I think this is the right way to go, because otherwise some
>>>>>>>>>>>>>>>> parts of the project could become second class citizens. Given
>>>>>>>>>>>>>>>> that and that we continue using Maven, I still think that
>>>>>>>>>>>>>>>> creating sub-projects for the libraries, for example, could be
>>>>>>>>>>>>>>>> beneficial. A split could reduce the project's complexity and
>>>>>>>>>>>>>>>> make it potentially easier for libraries to get actively
>>>>>>>>>>>>>>>> developed. The main concern is setting up the build
>>>>>>>>>>>>>>>> infrastructure to aggregate docs from multiple repositories
>>>>>>>>>>>>>>>> and making them publicly available.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Since I started this thread and I would really like to see
>>>>>>>>>>>>>>>> Flink's ML library revived, I'd volunteer to first investigate
>>>>>>>>>>>>>>>> whether it is feasible to establish a proper incremental build
>>>>>>>>>>>>>>>> for Flink. If that should not be possible, I will look into
>>>>>>>>>>>>>>>> splitting the repository, first only for the libraries. I'll
>>>>>>>>>>>>>>>> share my results with the community once I'm done with the
>>>>>>>>>>>>>>>> investigation.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>> Till
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Fri, Feb 24, 2017 at 3:50 PM, Robert Metzger
>>>>>>>>>>>>>>>> <rmetz...@apache.org> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> @Jin Mingjian: You cannot use the paid Travis version for
>>>>>>>>>>>>>>>>> open source projects.
>>>>>>>>>>>>>>>>> It only works for private repositories (at least back then
>>>>>>>>>>>>>>>>> when we asked them about that).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> @Stephan: I don't think that incremental builds will be
>>>>>>>>>>>>>>>>> available with Maven anytime soon.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I agree that we need to fix the build time issue on Travis.
>>>>>>>>>>>>>>>>> I've recently pushed a commit to use now three instead of two
>>>>>>>>>>>>>>>>> test groups. But I don't think that this is a feasible
>>>>>>>>>>>>>>>>> long-term solution.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> If this discussion is only about reducing the build and test
>>>>>>>>>>>>>>>>> time, introducing build profiles for different components as
>>>>>>>>>>>>>>>>> Aljoscha suggested would solve the problem Till mentioned.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Also, if we decide that Travis is not a good tool anymore for
>>>>>>>>>>>>>>>>> the testing, I guess we can find a different solution. There
>>>>>>>>>>>>>>>>> are now competitors to Travis that might be willing to offer
>>>>>>>>>>>>>>>>> a paid plan for an open source project, or we set up our own
>>>>>>>>>>>>>>>>> infra on a server sponsored by one of the contributing
>>>>>>>>>>>>>>>>> companies.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> If we want to solve "community issues" with the change as
>>>>>>>>>>>>>>>>> well, then I think it's worth the effort of splitting up
>>>>>>>>>>>>>>>>> Flink into different repositories.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Splitting up repositories is not a trivial task in my
>>>>>>>>>>>>>>>>> opinion.
>>>>>>>>>>>>>>>>> As others have mentioned before, we need to consider the
>>>>>>>>>>>>>>>>> following things:
>>>>>>>>>>>>>>>>> - How are we going to build the documentation? Ideally every
>>>>>>>>>>>>>>>>> repo should contain its docs, so we would need to pull them
>>>>>>>>>>>>>>>>> together when building the main docs.
>>>>>>>>>>>>>>>>> - How do we organize the dependencies? If we have the library
>>>>>>>>>>>>>>>>> repository depend on snapshot Flink versions, we need to make
>>>>>>>>>>>>>>>>> sure that the snapshot deployment always works. This also
>>>>>>>>>>>>>>>>> means that people working on a library repository will pull
>>>>>>>>>>>>>>>>> from snapshot OR need to build first locally.
>>>>>>>>>>>>>>>>> - We need to update the release scripts
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> If we commit to do these changes, we need to assign at least
>>>>>>>>>>>>>>>>> one committer (yes, in this case we need somebody who can
>>>>>>>>>>>>>>>>> commit, for example for updating the buildbot stuff) who
>>>>>>>>>>>>>>>>> volunteers to do the change.
>>>>>>>>>>>>>>>>> I've done a lot of infrastructure work in the past, but I'm
>>>>>>>>>>>>>>>>> currently pretty booked with many other things, so I don't
>>>>>>>>>>>>>>>>> realistically see myself doing that. Max who used to work on
>>>>>>>>>>>>>>>>> these things is taking some time off.
>>>>>>>>>>>>>>>>> I think we need, best case 3 days for the change, worst case
>>>>>>>>>>>>>>>>> 5 days. The problem is that there are no "unit tests" for the
>>>>>>>>>>>>>>>>> infra stuff,