Hi Kaxil,

Regarding Copilot billing, the lack of clarity on server-side costs is
concerning. Given the recent surge in PR volume, there is a risk of
uncontrollable expenses if hard caps aren't in place, similar to those on
ASF CI.

I agree that automated and assisted reviews are not mutually exclusive. I
am interested to hear how others perceive this balance and what their
experiences have been.

Best,
Jarek Potiuk

On Tue, Apr 28, 2026 at 2:20 PM Kaxil Naik <[email protected]> wrote:

> Ignore the last para about links -- it is a copy/paste of what I had sent
> to the new AI initiative ASF group/list where some of the related
> discussions were happening.
>
> On Tue, 28 Apr 2026 at 13:18, Kaxil Naik <[email protected]> wrote:
>
> > The Copilot reviews, as I recently found out, were paid for by
> > Astronomer's GitHub Enterprise (for me and other folks at Astronomer, from
> > our quota).
> >
> > And with them moving to a usage-based model (which was bound to happen), it
> > will get expensive.
> >
> > That said, I think it is still valuable for mass-reviewing on the server
> > side, since this happens on CI and is a complete opt-in by the reviewer,
> > and there are no doubts.
> >
> > As I showed in the last dev call, I have my hand-crafted review skill that
> > I use for detailed reviews from my laptop, and each review is vetted by me
> > before posting. So I am fully responsible for all the good and bad (false
> > positives or hallucinations) things it catches, since I approve it. I have
> > been using this skill for a good quarter or half a year (time flies).
> >
> > And that is why I do not feel it is either/or -- meaning it was never
> > Copilot review vs. local review for me, since I used both depending on the
> > purpose. For reviewing 200 PRs as last time, I used Copilot, since a review
> > is more helpful than no review and the PR getting marked stale. I have also
> > seen folks self-assign Copilot review on their PRs -- for those who have
> > access to it. I have done the same to have multiple layers (even though my
> > local review skill already does multi-modal reviews).
> >
> > And my philosophy around review and a lot of workflow skills has been:
> > what I look for in a PR isn't necessarily what someone else would look for.
> > Or what is important to me might be a nit for someone else. So while as a
> > project we should have standards which go in AGENTS.md/Claude.md and/or a
> > high-level review skill that is checked in to the project, the "what I look
> > for" will remain on my machine and is geared towards my preferences,
> > testing, etc. Time and again I'd use reviews from Ash and others to tune it
> > -- like I would do even without AI. As an example, Ash would have caught
> > something in my PR or someone else's PR that I wouldn't have realized, and
> > just as I, as a human, would learn for next time, so do my skills. But that
> > is sort of my new workflow adapting.
> >
> > re: fully automated Copilot reviews may discourage contributors who expect
> > human interaction
> >
> > At least to me, from a purely PR-author reception point of view, there is
> > no difference between that and a review from a human which sounds
> > completely AI/robotic anyway, since a human can just run a skill and post
> > the response as well. It is just coming from a different account, and the
> > latter feels even worse as it is coming from a human account. But at the
> > end of the day, that is still a personal preference. A review (from
> > Copilot, from a human that sounds like AI, or a pure human review) that
> > catches a bug is still better than no review, with the PR going stale and
> > being closed.
> >
> > Long story short: I do not think they are mutually exclusive -- or never
> > were mutually exclusive :)
> >
> > ------
> >
> > - https://github.com/apache/airflow/pull/63775#discussion_r3025383633 --
> > Copilot caught a Databricks provider importing airflow.utils.timezone
> > directly (which relies on Airflow's runtime deprecation redirect and
> > silences typing) and suggested switching to
> > airflow.providers.common.compat.sdk. That is our documented Airflow 2 /
> > Airflow 3 cross-version pattern for providers. Copilot only knows this
> > because we wrote it down.
> > - https://github.com/apache/airflow/pull/62343#discussion_r3025380683 --
> > same cross-version import pattern, different provider. Author accepted the
> > suggestion.
> > - https://github.com/apache/airflow/pull/64568#discussion_r3025333917 --
> > Copilot flagged a fix for failure-callback context["exception"] handling
> > and asked for a regression test specifically against the
> > InProcessTestSupervisor / dag.test() path. That is the correct execution
> > path for that change, not a generic "please add a test" remark.
> > - https://github.com/apache/airflow/pull/64576#pullrequestreview-4047787083 --
> > Copilot reviewed a fix I authored for xcom_pull() ignoring default when
> > map_indexes was not set, and raised a header-precedence question I had not
> > thought about.
> > - https://github.com/apache/airflow/pull/61878#pullrequestreview-3851732779 --
> > Dennis surfaced this one on the dev list as a real review he found useful
> > on a provider PR. He's at a different company, so worth flagging as an
> > independent signal.
> >
> > On Tue, 28 Apr 2026 at 11:02, Jarek Potiuk <[email protected]> wrote:
> >
> >> Hi Kaxil and team,
> >>
> >> I’d like to discuss our strategy for "copilot reviews" versus SKILL-based
> >> assisted triage and maintainer review experiences.
> >>
> >> My recent experiments with skill-based auto-triage show a downward trend
> >> in stale pull requests by filtering out "drive-by" contributions, allowing
> >> maintainers to focus on high-impact work (more stats and trends soon).
> >> Unlike 3rd-party tools, this approach is agent-agnostic, easily fine-tuned
> >> via English prompts, and potentially reusable across other ASF projects.
> >>
> >> Regarding transparency, I’ve proposed specific attribution formats [1] to
> >> distinguish between purely automated comments and those reviewed by
> >> humans. This aligns with ongoing [email protected] conversations [2]
> >> about using "Assisted By" instead of "Generated-by" to maintain
> >> accountability and contributor motivation.
> >>
> >> I have concerns that fully automated Copilot reviews may discourage
> >> contributors who expect human interaction. My experimental
> >> /maintainer-review skill [3] allows maintainers to drive the process by
> >> reviewing AI-generated comments before posting. This reduces "token
> >> ping-pong," integrates with the CODEOWNERS discussion [4], and ensures we
> >> only spend tokens on relevant, triaged PRs. It also keeps maintainers
> >> ultimately responsible for the comments they agree to send.
> >>
> >> With Copilot moving to usage-based billing on June 1 [5], using personal
> >> or open-source credits via the skill-based approach introduces some change
> >> (but I am still unclear what it means to the billing - who pays for the
> >> tokens).
> >>
> >> Conversely, for skill-based processes, it's entirely in the hands (and
> >> pocket) of those who run the skill locally. We are also working with ASF
> >> and Alpha-Omega to secure long-term resources for maintainers for that.
> >>
> >> I would love to hear your thoughts about it - especially Kaxil's
> >> experiences regarding the Copilot review experiment - my observations
> >> might be incomplete and biased :).
> >>
> >> J.
> >>
> >> [1] https://github.com/apache/airflow/pull/65965 - comment attribution
> >> [2] https://lists.apache.org/thread/qbdkky8ls6zybyy9o3pvqnpf68r089qp -
> >> legal discussion thread on attributions
> >> [3] https://github.com/apache/airflow/pull/65981 - /maintainer-review
> >> skill
> >> [4] https://lists.apache.org/thread/5ssp4ksyohdzclxqvj7ngz0hz5wy9j68 -
> >> CODEOWNERS discussion
> >> [5] https://github.blog/news-insights/company-news/github-copilot-is-moving-to-usage-based-billing/
> >> - change in billing for Copilot
> >>
> >> Best,
> >> Jarek Potiuk
> >>
> >
>
