Great! :-) If Fabian is also OK with trying it out, I would ask Stephan to open a PR for this against Flink.
On Tue, Jul 25, 2017 at 8:50 AM, Chesnay Schepler <ches...@apache.org> wrote: > I'm still apprehensive about it, but don't mind trying it out. It's not like > it can break something. > > > On 24.07.2017 18:52, Ufuk Celebi wrote: >> >> What's the conclusion of last weeks discussion here? >> >> Fabian and Chesnay raised concerns about the introductory text. Are >> you still concerned? >> >> On Wed, Jul 19, 2017 at 10:04 AM, Stephan Ewen <se...@apache.org> wrote: >>> >>> @Chesnay: >>> >>> Put text into template => contributor will have to read it >>> Put link to text into template => most contributors will ignore the link >>> >>> Yes, that is pretty much what my observation from the past is. >>> >>> >>> >>> On Tue, Jul 18, 2017 at 11:03 PM, Chesnay Schepler <ches...@apache.org> >>> wrote: >>> >>>> I'm sorry but i can't follow your logic. >>>> >>>> Put text into template => contributor will definitely read it >>>> Put link to text into template => contributor will completely ignore the >>>> link >>>> >>>> The advantage of the link is we don't duplicate the contribution guide >>>> in >>>> the docs and in the template. >>>> Furthermore, you don't even need to remove something from the template, >>>> since it's just a single line. >>>> >>>> >>>> On 18.07.2017 19:25, Stephan Ewen wrote: >>>> >>>>> Concerning moving text to the contributors guide: >>>>> >>>>> I can only say it again: I believe the contribution guide is almost >>>>> dead >>>>> text. Very few people read it. >>>>> Before the current template was introduced, new contributors rarely >>>>> gave >>>>> the pull request a name with Jira number. That is a good indicator >>>>> about >>>>> how many read this guide. >>>>> Putting the test in the template is a way that every one reads it. >>>>> >>>>> >>>>> I am also wondering what the concern is. >>>>> A new contributor should clearly read through a bit of text, to learn >>>>> what >>>>> we look for in contributions. >>>>> A recurring contributor will not have to read it again, simply remove >>>>> the >>>>> text from the pull request message and go on. >>>>> >>>>> Where is the disadvantage? >>>>> >>>>> >>>>> On Tue, Jul 18, 2017 at 5:35 PM, Nico Kruber <n...@data-artisans.com> >>>>> wrote: >>>>> >>>>> I like the new template but also agree with the text being too long and >>>>>> >>>>>> would >>>>>> move the intro to the contributors guide with a link in the PR >>>>>> template. >>>>>> >>>>>> Regarding the questions to fill out - I'd like the headings to be >>>>>> short >>>>>> and >>>>>> have the affected components last so that documentation is not lost >>>>>> (although >>>>>> being more important than this checklist), e.g.: >>>>>> >>>>>> * Purpose of the change >>>>>> * Brief change log >>>>>> * Verifying the change >>>>>> * Documentation >>>>>> * Affected components >>>>>> >>>>>> The verification options in the original template look a bit too large >>>>>> but >>>>>> it >>>>>> stresses what tests should be added, especially for bigger changes. >>>>>> Can't >>>>>> think of a way to make it shorter though. >>>>>> >>>>>> >>>>>> Nico >>>>>> >>>>>> On Tuesday, 18 July 2017 11:20:41 CEST Chesnay Schepler wrote: >>>>>> >>>>>>> I fully agree with Fabian. >>>>>>> >>>>>>> Multiple-choice questions provide little value to the reviewer, since >>>>>>> the >>>>>>> validity has to be verified in any case. While text answers have to >>>>>>> be >>>>>>> validated as well, >>>>>>> they give some hint to the reviewer as to how it can be verified and >>>>>>> which steps the >>>>>>> contributor did to do so. >>>>>>> >>>>>>> I also agree that it is too long; IMO this is really intimidating to >>>>>>> new >>>>>>> contributors to be greeted with this. >>>>>>> >>>>>>> Ideally we only link to the contributors guide and ask 3 questions: >>>>>>> >>>>>>> * What is the problem? >>>>>>> * How was it fixed? >>>>>>> * How can the fix be verified? >>>>>>> >>>>>>> On 18.07.2017 10:47, Fabian Hueske wrote: >>>>>>> >>>>>>>> I like the sections about purpose, change log, and verification of >>>>>>>> the >>>>>>>> changes. >>>>>>>> >>>>>>>> However, I think the proposed template is too much text. This is >>>>>>>> >>>>>>> probably >>>>>>> the reason why the first attempt to establish a PR template failed. >>>>>>>> >>>>>>>> I would move most of the introduction and explanations incl. >>>>>>>> examples >>>>>>>> >>>>>>> to >>>>>>> the "Contribution Guidelines" and only pass a link. >>>>>>>> >>>>>>>> IMO, the template should be rather shorter than the current one and >>>>>>>> >>>>>>> only >>>>>>> have the link, the sections to fill out, and checkboxes. >>>>>>>> >>>>>>>> I'm also not sure how much the detailed questions will help. >>>>>>>> For example even if the question about changed dependencies is >>>>>>>> answered >>>>>>>> with "no", the reviewer still has to check that. >>>>>>>> >>>>>>>> I think the questions of the current template work differently. >>>>>>>> A question "Does the PR include tests?" suggests to the contributor >>>>>>>> >>>>>>> that >>>>>>> those should be included. Same for documentation. >>>>>>>> >>>>>>>> Cheers, >>>>>>>> Fabian >>>>>>>> >>>>>>>> 2017-07-18 10:05 GMT+02:00 Tzu-Li (Gordon) Tai >>>>>>>> <tzuli...@apache.org>: >>>>>>>> >>>>>>>>> +1, I like this a lot. >>>>>>>>> With the previous template, it doesn’t really resonate with what we >>>>>>>>> should >>>>>>>>> care about, and therefore most of the time I think contributors >>>>>>>>> just >>>>>>>>> delete >>>>>>>>> that template and write down something on their own. >>>>>>>>> >>>>>>>>> I would also like to add: “Savepoint / checkpoint binary formats” >>>>>>>>> to >>>>>>>>> >>>>>>>> the >>>>>>> >>>>>>> potential affect scope check list. >>>>>>>>> >>>>>>>>> I think changes to those deserves an independent yes / no check of >>>>>>>>> its >>>>>>>>> own. >>>>>>>>> >>>>>>>>> Cheers, >>>>>>>>> Gordon >>>>>>>>> >>>>>>>>> On 18 July 2017 at 3:49:42 PM, Ufuk Celebi (u...@apache.org) wrote: >>>>>>>>> >>>>>>>>> I really like this and vote to change our current template. >>>>>>>>> >>>>>>>>> The simple yes/no/... options are a really good idea. I would add >>>>>>>>> to >>>>>>>>> your email that the questions will equally help reviewers to >>>>>>>>> remember >>>>>>>>> to look at these things, which is just as important. >>>>>>>>> >>>>>>>>> When we merge this, we should make sure to strictly follow the >>>>>>>>> guide. >>>>>>>>> Ideally, in the long term we can even automate some of the >>>>>>>>> yes/no/... >>>>>>>>> questions via a bot... but let's not get ahead of ourselves here >>>>>>>>> ;-) >>>>>>>>> >>>>>>>>> On Mon, Jul 17, 2017 at 6:36 PM, Stephan Ewen <se...@apache.org> >>>>>>>>> >>>>>>>> wrote: >>>>>>> >>>>>>> Hi all! >>>>>>>>>> >>>>>>>>>> I have reflected a bit on the pull requests and on some of the >>>>>>>>>> recent >>>>>>>>>> changes to Flink and some of the introduced bugs / regressions >>>>>>>>>> that >>>>>>>>>> >>>>>>>>> we >>>>>>> >>>>>>> have >>>>>>>>> >>>>>>>>> fixed. >>>>>>>>>> >>>>>>>>>> One thing that I think would have helped is to have more explicit >>>>>>>>>> information about what the pull request does and how the >>>>>>>>>> contributor >>>>>>>>>> >>>>>>>>> would >>>>>>>>> >>>>>>>>> suggest to verify it. I have seen this when contributing to some >>>>>>>>> other >>>>>>> >>>>>>> project and really liked the approach. >>>>>>>>>> >>>>>>>>>> It requires that a contributor takes a minute to reflect on what >>>>>>>>>> was >>>>>>>>>> touched, and what would be ways to verify that the changes work >>>>>>>>>> properly. >>>>>>>>>> Besides being a help to the reviewer, it also makes contributors >>>>>>>>>> >>>>>>>>> aware >>>>>>> >>>>>>> of >>>>>>>>>> >>>>>>>>>> what is important during the review process. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> I suggest a new pull request template, as attached below, with a >>>>>>>>>> >>>>>>>>> preview >>>>>>> >>>>>>> here: >>>>>>>>>> >>>>>>>>>> https://github.com/StephanEwen/incubator-flink/ >>>>>>>>>> >>>>>>>>> blob/pr_template/.github/PULL_REQUEST_TEMPLATE.md >>>>>>>>> >>>>>>>>> Don't be scared, it looks long, but a big part is the introductory >>>>>>>>> text >>>>>>> >>>>>>> (only relevant for new contributors) and the examples contents for >>>>>>>>> >>>>>>>>> the >>>>>>> >>>>>>> description. >>>>>>>>>> >>>>>>>>>> Filling this out for code that is in shape should be a quick >>>>>>>>>> thing: >>>>>>>>>> >>>>>>>>> Remove >>>>>>>>> >>>>>>>>> the into and checklist, write a few sentences on what the PR does >>>>>>>>> (one >>>>>>> >>>>>>> should do that anyways) and then pick some yes/no in the >>>>>>>>> >>>>>>>>> classification >>>>>>> >>>>>>> section. >>>>>>>>>> >>>>>>>>>> Curious to hear what you think! >>>>>>>>>> >>>>>>>>>> Best, >>>>>>>>>> Stephan >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> ============================ >>>>>>>>>> >>>>>>>>>> Full suggested pull request template: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> *Thank you very much for contributing to Apache Flink - we are >>>>>>>>>> happy >>>>>>>>>> that >>>>>>>>>> you want to help us improve Flink. To help the community review >>>>>>>>>> you >>>>>>>>>> contribution in the best possible way, please go through the >>>>>>>>>> >>>>>>>>> checklist >>>>>>> >>>>>>> below, which will get the contribution into a shape in which it can >>>>>>>>> >>>>>>>>> be >>>>>>> >>>>>>> best >>>>>>>>> >>>>>>>>> reviewed.* >>>>>>>>>> >>>>>>>>>> *Please understand that we do not do this to make contributions to >>>>>>>>>> >>>>>>>>> Flink >>>>>>> >>>>>>> a >>>>>>>>> >>>>>>>>> hassle. In order to uphold a high standard of quality for code >>>>>>>>>> >>>>>>>>>> contributions, while at the same time managing a large number of >>>>>>>>>> contributions, we need contributors to prepare the contributions >>>>>>>>>> >>>>>>>>> well, >>>>>>> >>>>>>> and >>>>>>>>> >>>>>>>>> give reviewers enough contextual information for the review. Please >>>>>>>>> also >>>>>>> >>>>>>> understand that contributions that do not follow this guide will take >>>>>>>>>> >>>>>>>>>> longer to review and thus typically be picked up with lower >>>>>>>>>> priority >>>>>>>>>> >>>>>>>>> by >>>>>>> >>>>>>> the >>>>>>>>> >>>>>>>>> community.* >>>>>>>>>> >>>>>>>>>> ## Contribution Checklist >>>>>>>>>> >>>>>>>>>> - Make sure that the pull request corresponds to a [JIRA issue]( >>>>>>>>>> https://issues.apache.org/jira/projects/FLINK/issues). Exceptions >>>>>>>>>> >>>>>>>>> are >>>>>>> >>>>>>> made >>>>>>>>> >>>>>>>>> for typos in JavaDoc or documentation files, which need no JIRA >>>>>>>>> issue. >>>>>>> >>>>>>> - Name the pull request in the form "[FLINK-1234] [component] Title >>>>>>>>> >>>>>>>>> of >>>>>>> >>>>>>> the pull request", where *FLINK-1234* should be replaced by the >>>>>>>>> >>>>>>>>> actual >>>>>>> >>>>>>> issue number. Skip *component* if you are unsure about which is the >>>>>>>>> >>>>>>>>> best >>>>>>> >>>>>>> component. >>>>>>>>>> >>>>>>>>>> Typo fixes that have no associated JIRA issue should be named >>>>>>>>>> >>>>>>>>> following >>>>>>> >>>>>>> this pattern: `[hotfix] [docs] Fix typo in event time introduction` >>>>>>>>> >>>>>>>>> or >>>>>>> >>>>>>> `[hotfix] [javadocs] Expand JavaDoc for PuncuatedWatermarkGenerator`. >>>>>>>>>> >>>>>>>>>> - Fill out the template below to describe the changes contributed >>>>>>>>>> by >>>>>>>>>> >>>>>>>>> the >>>>>>> >>>>>>> pull request. That will give reviewers the context they need to do >>>>>>>>> >>>>>>>>> the >>>>>>> >>>>>>> review. >>>>>>>>>> >>>>>>>>>> - Make sure that the change passes the automated tests, i.e., `mvn >>>>>>>>>> >>>>>>>>> clean >>>>>>> >>>>>>> verify` >>>>>>>>>> >>>>>>>>>> - Each pull request should address only one issue, not mix up code >>>>>>>>>> >>>>>>>>> from >>>>>>> >>>>>>> multiple issues. >>>>>>>>>> >>>>>>>>>> - Each commit in the pull request has a meaningful commit message >>>>>>>>>> (including the JIRA id) >>>>>>>>>> >>>>>>>>>> - Once all items of the checklist are addressed, remove the above >>>>>>>>>> >>>>>>>>> text >>>>>>> >>>>>>> and this checklist, leaving only the filled out template below. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> **(The sections below can be removed for hotfixes of typos)** >>>>>>>>>> >>>>>>>>>> ## What is the purpose of the change >>>>>>>>>> >>>>>>>>>> *(For example: This pull request makes task deployment go through >>>>>>>>>> the >>>>>>>>>> >>>>>>>>> blob >>>>>>>>> >>>>>>>>> server, rather than through RPC. That way we avoid re-transferring >>>>>>>>> them >>>>>>> >>>>>>> on >>>>>>>>> >>>>>>>>> each deployment (during recovery).)* >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> ## Brief change log >>>>>>>>>> >>>>>>>>>> *(for example:)* >>>>>>>>>> - *The TaskInfo is stored in the blob store on job creation time >>>>>>>>>> as a >>>>>>>>>> persistent artifact* >>>>>>>>>> - *Deployments RPC transmits only the blob storage reference* >>>>>>>>>> - *TaskManagers retrieve the TaskInfo from the blob cache* >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> ## Verifying this change >>>>>>>>>> >>>>>>>>>> *(Please pick either of the following options)* >>>>>>>>>> >>>>>>>>>> This change is a trivial rework / code cleanup without any test >>>>>>>>>> coverage. >>>>>>>>>> >>>>>>>>>> *(or)* >>>>>>>>>> >>>>>>>>>> This change is already covered by existing tests, such as *(please >>>>>>>>>> >>>>>>>>> describe >>>>>>>>> >>>>>>>>> tests)*. >>>>>>>>>> >>>>>>>>>> *(or)* >>>>>>>>>> >>>>>>>>>> This change added tests and can be verified as follows: >>>>>>>>>> >>>>>>>>>> *(example:)* >>>>>>>>>> - *Added integration tests for end-to-end deployment with large >>>>>>>>>> >>>>>>>>> payloads >>>>>>> >>>>>>> (100MB)* >>>>>>>>>> >>>>>>>>>> - *Extended integration test for recovery after master >>>>>>>>>> (JobManager) >>>>>>>>>> failure* >>>>>>>>>> - *Added test that validates that TaskInfo is transferred only >>>>>>>>>> once >>>>>>>>>> across recoveries* >>>>>>>>>> - *Manually verified the change by running a 4 node cluser with 2 >>>>>>>>>> JobManagers and 4 TaskManagers, a stateful streaming program, and >>>>>>>>>> killing >>>>>>>>>> one JobManager and to TaskManagers during the execution, verifying >>>>>>>>>> >>>>>>>>> that >>>>>>> >>>>>>> recovery happens correctly.* >>>>>>>>>> >>>>>>>>>> ## Does this pull request potentially affect one of the following >>>>>>>>>> >>>>>>>>> parts: >>>>>>> >>>>>>> - Dependencies (does it add or upgrade a dependency): **(yes / no)** >>>>>>>>>> >>>>>>>>>> - The public API, i.e., is any changed class annotated with >>>>>>>>>> `@Public(Evolving)`: **(yes / no)** >>>>>>>>>> - The serializers: **(yes / no / don't know)** >>>>>>>>>> - The runtime per-record code paths (performance sensitive): >>>>>>>>>> **(yes >>>>>>>>>> >>>>>>>>> / no >>>>>>> >>>>>>> / don't know)** >>>>>>>>>> >>>>>>>>>> - Anything that affects deployment or recovery: JobManager (and >>>>>>>>>> its >>>>>>>>>> components), Checkpointing, Yarn/Mesos, ZooKeeper: **(yes / no / >>>>>>>>>> >>>>>>>>> don't >>>>>>> >>>>>>> know)**: >>>>>>>>>> >>>>>>>>>> ## Documentation >>>>>>>>>> >>>>>>>>>> - Does this pull request introduce a new feature? **(yes / no)** >>>>>>>>>> - If yes, how is the feature documented? **(not applicable / docs >>>>>>>>>> / >>>>>>>>>> JavaDocs / not documented)** >>>>>>>>>> >