It is a good practice to create a template for Github Issues and PRs <https://docs.github.com/en/communities/using-templates-to-encourage-useful-issues-and-pull-requests/creating-a-pull-request-template-for-your-repository>. It will make a PR or issue easier to read overall. We can also enforce/suggest properties like affected versions and fixed versions.
Best, Yufei On Wed, Nov 3, 2021 at 6:02 PM Wing Yew Poon <wyp...@cloudera.com.invalid> wrote: > I wasn't aware that we were standardizing on such a practice. I don't have > a strong opinion on making changes one Spark version at a time or all at > once. I think committers who do reviews regularly should decide. My only > concern with making changes one version at a time is follow-through on the > part of the contributor. We want to ensure that a change/fix applicable to > multiple versions gets into all of them. Reviewer bandwidth is incurred > either way. (For simple changes/fixes, perhaps all at once does save > reviewer bandwidth, so we may want to be flexible.) > > On Wed, Nov 3, 2021 at 5:52 PM Jack Ye <yezhao...@gmail.com> wrote: > >> Thanks for bringing this up Kyle! >> >> My personal view is the following: >> 1. For new features, it should be very clear that we always implement >> them against the latest version. At the same time, I suggest we create an >> issue to track backport, so that if anyone is interested in backport he/she >> can work on it separately. We can tag these issues based on title names >> (e.g. "Backport: xxx" as title), and these are also good issues for new >> contributors to work on because there is already reference implementation >> in a newer version. >> 2. For bug fixes, I understand sometimes it's just a one line fix and >> people will try to just fix across versions. My take is that we should try >> to advocate for fixing 1 version and open an issue for other versions >> although it does not really need to be enforced as strictly. Sometimes even >> a 1 line fix has serious implications for different versions and might >> break stuff unintentionally. it's better that people that have production >> dependency on the specific version carefully review and test changes before >> merging. >> >> About enforcement strategy, I suggest we start to create a template for >> Github Issues and PRs >> <https://docs.github.com/en/communities/using-templates-to-encourage-useful-issues-and-pull-requests/creating-a-pull-request-template-for-your-repository>, >> where we state the guidelines related to engine versions, as well as >> Iceberg's preferred code style, naming convention, title convention, etc. >> to make new contributors a bit easier to submit changes without too much >> rewrite. Currently I observe that every time there is a new contributor, we >> need to state all the guidelines through PR review, which causes quite a >> lot of time spent on rewriting the code and also reduces the motivation for >> people to continue work on the PR. >> >> Best, >> Jack Ye >> >> >> >> >> >> On Wed, Nov 3, 2021 at 4:13 PM Kyle Bendickson <k...@tabular.io> wrote: >> >>> I submitted a PR to fix a Spark bug today, applying the same changes to >>> all eligible Spark versions. >>> >>> Jack mentioned that he thought the practice going forward was to fix / >>> apply changes on the latest Spark version in one PR, and then open a second >>> PR to backport the fixes (presumably to minimize review overhead). >>> >>> Do we have a standard / preference on that? Jack mentioned he wasn't >>> certain, so I thought I'd ask here. >>> >>> Seems like a good practice but hoping to get some clarification :) >>> >>> -- >>> Best, >>> Kyle Bendickson >>> Github: @kbendick >>> >>