I think a combination of techniques would be effective: - identifying focus areas for the next release (e.g. see Robert's thread about 1.4) - emphasizing design discussion in advance of a PR - assigning reviewers and a steward in a structured way - using a label or assignment field to 'pass the baton' - closing rejected PRs decisively
I don't mean to co-opt this thread for a broader process question, but figured that the PR template could provide additional process guidance. On Mon, Jul 24, 2017 at 11:55 AM, Stephan Ewen <se...@apache.org> wrote: > @Eron Review timeliness would be great to improve. > > Some observation from the past year: > There were periods where some components in Flink were making slow progress > because all committers knowledgeable in those components were busy handling > pull requests that were opened against those components, but were not in > good shape, were adding not discussed designs, etc. > > I think the only way to ensure timely handling of pull requests is to be > very strict in the handling. For example any non-trivial change needs prior > discussion, agreement that this should be fixed now, and an agreed upon > design doc. Otherwise the PR is not considered and simply rejected. Same > for presence of docs, proper tests, ... > > But, I fear that introducing such strictness will scare off many in the > community. So I would be very reluctant to do this. > After all, many pull requests do bring in a good piece of perspective, at > least, even if the code is not immediately suited for contribution... > > > On Mon, Jul 24, 2017 at 8:18 PM, Eron Wright <eronwri...@gmail.com> wrote: > > > This seems like a good step in establishing a better PR process. I > believe > > the process could be improved to ensure timely and targeted review by > > component experts and committers. > > > > On Mon, Jul 17, 2017 at 9:36 AM, Stephan Ewen <se...@apache.org> wrote: > > > > > Hi all! > > > > > > I have reflected a bit on the pull requests and on some of the recent > > > changes to Flink and some of the introduced bugs / regressions that we > > have > > > fixed. > > > > > > One thing that I think would have helped is to have more explicit > > > information about what the pull request does and how the contributor > > would > > > suggest to verify it. I have seen this when contributing to some other > > > project and really liked the approach. > > > > > > It requires that a contributor takes a minute to reflect on what was > > > touched, and what would be ways to verify that the changes work > properly. > > > Besides being a help to the reviewer, it also makes contributors aware > of > > > what is important during the review process. > > > > > > > > > I suggest a new pull request template, as attached below, with a > preview > > > here: > > > https://github.com/StephanEwen/incubator-flink/ > > > blob/pr_template/.github/PULL_REQUEST_TEMPLATE.md > > > > > > Don't be scared, it looks long, but a big part is the introductory text > > > (only relevant for new contributors) and the examples contents for the > > > description. > > > > > > Filling this out for code that is in shape should be a quick thing: > > Remove > > > the into and checklist, write a few sentences on what the PR does (one > > > should do that anyways) and then pick some yes/no in the classification > > > section. > > > > > > Curious to hear what you think! > > > > > > Best, > > > Stephan > > > > > > > > > ============================ > > > > > > Full suggested pull request template: > > > > > > > > > > > > *Thank you very much for contributing to Apache Flink - we are happy > that > > > you want to help us improve Flink. To help the community review you > > > contribution in the best possible way, please go through the checklist > > > below, which will get the contribution into a shape in which it can be > > best > > > reviewed.* > > > > > > *Please understand that we do not do this to make contributions to > Flink > > a > > > hassle. In order to uphold a high standard of quality for code > > > contributions, while at the same time managing a large number of > > > contributions, we need contributors to prepare the contributions well, > > and > > > give reviewers enough contextual information for the review. Please > also > > > understand that contributions that do not follow this guide will take > > > longer to review and thus typically be picked up with lower priority by > > the > > > community.* > > > > > > ## Contribution Checklist > > > > > > - Make sure that the pull request corresponds to a [JIRA issue]( > > > https://issues.apache.org/jira/projects/FLINK/issues). Exceptions are > > made > > > for typos in JavaDoc or documentation files, which need no JIRA issue. > > > > > > - Name the pull request in the form "[FLINK-1234] [component] Title > of > > > the pull request", where *FLINK-1234* should be replaced by the actual > > > issue number. Skip *component* if you are unsure about which is the > best > > > component. > > > Typo fixes that have no associated JIRA issue should be named > following > > > this pattern: `[hotfix] [docs] Fix typo in event time introduction` or > > > `[hotfix] [javadocs] Expand JavaDoc for PuncuatedWatermarkGenerator`. > > > > > > - Fill out the template below to describe the changes contributed by > > the > > > pull request. That will give reviewers the context they need to do the > > > review. > > > > > > - Make sure that the change passes the automated tests, i.e., `mvn > > clean > > > verify` > > > > > > - Each pull request should address only one issue, not mix up code > from > > > multiple issues. > > > > > > - Each commit in the pull request has a meaningful commit message > > > (including the JIRA id) > > > > > > - Once all items of the checklist are addressed, remove the above > text > > > and this checklist, leaving only the filled out template below. > > > > > > > > > **(The sections below can be removed for hotfixes of typos)** > > > > > > ## What is the purpose of the change > > > > > > *(For example: This pull request makes task deployment go through the > > blob > > > server, rather than through RPC. That way we avoid re-transferring them > > on > > > each deployment (during recovery).)* > > > > > > > > > ## Brief change log > > > > > > *(for example:)* > > > - *The TaskInfo is stored in the blob store on job creation time as a > > > persistent artifact* > > > - *Deployments RPC transmits only the blob storage reference* > > > - *TaskManagers retrieve the TaskInfo from the blob cache* > > > > > > > > > ## Verifying this change > > > > > > *(Please pick either of the following options)* > > > > > > This change is a trivial rework / code cleanup without any test > coverage. > > > > > > *(or)* > > > > > > This change is already covered by existing tests, such as *(please > > describe > > > tests)*. > > > > > > *(or)* > > > > > > This change added tests and can be verified as follows: > > > > > > *(example:)* > > > - *Added integration tests for end-to-end deployment with large > > payloads > > > (100MB)* > > > - *Extended integration test for recovery after master (JobManager) > > > failure* > > > - *Added test that validates that TaskInfo is transferred only once > > > across recoveries* > > > - *Manually verified the change by running a 4 node cluser with 2 > > > JobManagers and 4 TaskManagers, a stateful streaming program, and > killing > > > one JobManager and to TaskManagers during the execution, verifying that > > > recovery happens correctly.* > > > > > > ## Does this pull request potentially affect one of the following > parts: > > > > > > - Dependencies (does it add or upgrade a dependency): **(yes / no)** > > > - The public API, i.e., is any changed class annotated with > > > `@Public(Evolving)`: **(yes / no)** > > > - The serializers: **(yes / no / don't know)** > > > - The runtime per-record code paths (performance sensitive): **(yes / > > no > > > / don't know)** > > > - Anything that affects deployment or recovery: JobManager (and its > > > components), Checkpointing, Yarn/Mesos, ZooKeeper: **(yes / no / don't > > > know)**: > > > > > > ## Documentation > > > > > > - Does this pull request introduce a new feature? **(yes / no)** > > > - If yes, how is the feature documented? **(not applicable / docs / > > > JavaDocs / not documented)** > > > > > >