thanks @manupa-arm @leandron @Mousius @electriclilies for the comments!

I have a few more comments based on more detailed analysis of the relationship 
between `CONTRIBUTORS.md` and `CODEOWNERS`. My original +1 was based on cursory 
look at `CODEOWNERS` in which I thought only a smaller subset of `CONTRIBUTORS` 
(e.g. maybe those who would opt in to lots of code-review notifications) were 
listed. In fact that's not true; instead:
- there are 42 codeowners, one of them being apache/tvm-committers
- there are 43 committers
- nearly everyone in CODEOWNERS is a committer.

In general, having the problems raised by this thread is a positive thing as it 
means the community and the pace of PRs is growing as adoption increases :). I 
think many of @electriclilies proposals reflect a need to consider how we might 
scale the whole review process as the project grows--increasing the number of 
committers is a way to help with the review load, as are modifications to the 
way we notify and assign people to code-reviews.

However, I would like to request we focus this thread on the issue being voted 
on, which I'll attempt to qualify more below. We should absolutely continue 
discussing on another thread e.g. in the discuss forum or perhaps also at 
Community Meeting/TVMConf. I personally feel the pain of the current system--I 
get about 50 notifications per day and properly reviewing them often takes 
about 3h of time--and would be happy to discuss the broader code-review process 
more there. There are many days I simply don't have this time, as evidenced by 
my higher response latency recently.

To provide more context on the current thread: there are a number of things 
contributing to the current situation:
1. Prior to updating CODEOWNERS in apache/tvm#8500, mentions were the primary 
way to get a committer's attention. The change was intended to provide a more 
explicit indication to contributors of who might be a reasonable reviewer.
2. In addition to mentions, we also have the "assign review" feature of GitHub. 
Assign is typically used manually to indicate who is shepherding a PR.
3. Then there are review requests, which are similar to assignments but which 
can be added by non-committers. Prior to apache/tvm#8500, at least my personal 
experience was that review requests were infrequent.
4. After apache/tvm#8500 landed, review requests became quite frequent (many 
PRs wound up requesting reviews from a large cross-section of the CODEOWNERS 
file because the way we've chosen the ownership means that a typical change 
which spans e.g. `python`, `src`, and `tests` has a good chance of triggering 
`CODEOWNERS` lines underneath those directories. For example, a relay change 
which includes an incidental microTVM test fix might result in requesting from 
a microTVM CODEOWNER. Prior to apache/tvm#8500, an assigned reviewer (either 
explicitly through GH assignment or implicitly through mention/thread history) 
could bring in others as was needed.
5. We now have 3 categories which reviewers need to monitor: mentions, 
assignments, review requests. It seems the GH solution here is notifications; 
however, both review requests and assignments trigger cause GH to subscribe and 
thereby notify you. And, we haven't assigned meaning behind a review-request vs 
an assignment.

The root problem exacerbated by apache/tvm#8500 is not so much getting review 
from a wider set of audience--it's assigning a committer to shepherd the review 
and own the process of merging it. Now, many PRs have 10+ committers "." Even 
with two committers assigned, there is ambiguity in who's shepherding the PR. 
And, with both review requests and assignments in play, I think people can be 
getting confused between the two. Compounding the problem, we aren't consistent 
in our use of assignments--so, there is no authoritative place to look for PRs 
which a committer needs to shepherd.

There are a couple of ways to address this shepherd-assignment problem:
1. Through CODEOWNERS, in which review requests suggest an assigned shepherd 
and one of the reviewers self-assigns.
2. Through mentions or side-channel pings, also a self-assignment.
3. Through project leadership combing through incoming PRs and issuing 
assignments.

apache/tvm#8500 was an attempt to move from 2 and 3 towards 1. However, I think 
it has been a bit hampered because CODEOWNERS operates at the directory level 
but we have not really correctly organized the project to support this style of 
OWNERS. For instance, the test structure is different than the actual code 
structure.

This vote was originally intended to further address the shepherd-assignment 
problem by reducing the number of suggested reviewers auto-requested on a PR. 
Any open source project is going to have committers which are more and less 
active. My concern with this proposal was primarily that a PR may get a varying 
speed of review if it's round-robin'd to a less active contributor. Originally 
I had thought this wasn't likely to be a problem immediately because I'd 
thought that `CODEOWNERS` was a subset of the committers. It looks like this 
actually could happen--but, I think there are actually a couple of remedies:
1. First off, a familiar contributor who merely wants a fast review can always 
mention a committer they work with regularly to nudge them to look. 
2. Any committer is free to reassign the review to take shepherding 
responsibilities.
3. We could consider the [triage 
role](https://infra.apache.org/github-roles.html) to help here.

There has been suggestion that contributors use e-mail filters to help sort the 
incoming review requests. First off this is something I do and I'm supportive 
of it; however there are a couple of cases which may cause the filters to be 
hard to compose (e.g. at least in gmail which many of us use):
1. Some committers get mentioned on tons of reviews which they never find time 
to comment on.
2. With the current CODEOWNERS, I don't think it's possible to differentiate 
between a requested review due to CODEOWNERS and a review requested explicitly.
3. There are situations where e.g. you neither were mentioned on a review nor 
assigned/review-requested but would like to watch a PR.

Compounding these challenges is time--these conditions may come and go 
throughout the life of a review. GitHub supports subscriptions for this reason, 
I believe.

In summary, I've gone back and forth in supporting this change (from supportive 
to unsupportive and now back to supportive given the remedies). I'm not 
completely sure this is the right thing to do, but wiling to try as I don't 
think the current fully-automatic system is working that well. I would further 
support codifying the purpose of assignments/review-requests and considering 
adopting a project-wide methodology of triaging new review requests (e.g. 
perhaps utilizing a triage role and indicating who is responsible for this). 
And, we should definitely continue discussing the process of scaling 
code-review in other threads, as this is merely one aspect to discuss.

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm/issues/9057#issuecomment-931579113

Reply via email to