It is a good practice to create a template for Github Issues and PRs
<https://docs.github.com/en/communities/using-templates-to-encourage-useful-issues-and-pull-requests/creating-a-pull-request-template-for-your-repository>.
It
will make a PR or issue easier to read overall. We can also enforce/suggest
properties like affected versions and fixed versions.

Best,
Yufei


On Wed, Nov 3, 2021 at 6:02 PM Wing Yew Poon <wyp...@cloudera.com.invalid>
wrote:

> I wasn't aware that we were standardizing on such a practice. I don't have
> a strong opinion on making changes one Spark version at a time or all at
> once. I think committers who do reviews regularly should decide. My only
> concern with making changes one version at a time is follow-through on the
> part of the contributor. We want to ensure that a change/fix applicable to
> multiple versions gets into all of them. Reviewer bandwidth is incurred
> either way. (For simple changes/fixes, perhaps all at once does save
> reviewer bandwidth, so we may want to be flexible.)
>
> On Wed, Nov 3, 2021 at 5:52 PM Jack Ye <yezhao...@gmail.com> wrote:
>
>> Thanks for bringing this up Kyle!
>>
>> My personal view is the following:
>> 1. For new features, it should be very clear that we always implement
>> them against the latest version. At the same time, I suggest we create an
>> issue to track backport, so that if anyone is interested in backport he/she
>> can work on it separately. We can tag these issues based on title names
>> (e.g. "Backport: xxx" as title), and these are also good issues for new
>> contributors to work on because there is already reference implementation
>> in a newer version.
>> 2. For bug fixes, I understand sometimes it's just a one line fix and
>> people will try to just fix across versions. My take is that we should try
>> to advocate for fixing 1 version and open an issue for other versions
>> although it does not really need to be enforced as strictly. Sometimes even
>> a 1 line fix has serious implications for different versions and might
>> break stuff unintentionally. it's better that people that have production
>> dependency on the specific version carefully review and test changes before
>> merging.
>>
>> About enforcement strategy, I suggest we start to create a template for
>> Github Issues and PRs
>> <https://docs.github.com/en/communities/using-templates-to-encourage-useful-issues-and-pull-requests/creating-a-pull-request-template-for-your-repository>,
>> where we state the guidelines related to engine versions, as well as
>> Iceberg's preferred code style, naming convention, title convention, etc.
>> to make new contributors a bit easier to submit changes without too much
>> rewrite. Currently I observe that every time there is a new contributor, we
>> need to state all the guidelines through PR review, which causes quite a
>> lot of time spent on rewriting the code and also reduces the motivation for
>> people to continue work on the PR.
>>
>> Best,
>> Jack Ye
>>
>>
>>
>>
>>
>> On Wed, Nov 3, 2021 at 4:13 PM Kyle Bendickson <k...@tabular.io> wrote:
>>
>>> I submitted a PR to fix a Spark bug today, applying the same changes to
>>> all eligible Spark versions.
>>>
>>> Jack mentioned that he thought the practice going forward was to fix /
>>> apply changes on the latest Spark version in one PR, and then open a second
>>> PR to backport the fixes (presumably to minimize review overhead).
>>>
>>> Do we have a standard / preference on that? Jack mentioned he wasn't
>>> certain, so I thought I'd ask here.
>>>
>>> Seems like a good practice but hoping to get some clarification :)
>>>
>>> --
>>> Best,
>>> Kyle Bendickson
>>> Github: @kbendick
>>>
>>

Reply via email to