Re: Time for drm-ci-next?

Maxime Ripard Tue, 18 Mar 2025 07:39:07 -0700

On Mon, Mar 10, 2025 at 07:44:58AM -0700, Rob Clark wrote:
> On Mon, Mar 10, 2025 at 4:07 AM Maxime Ripard <mrip...@redhat.com> wrote:
> >
> > On Fri, Mar 07, 2025 at 09:26:54AM -0800, Rob Clark wrote:
> > > On Fri, Mar 7, 2025 at 9:00 AM Maxime Ripard <mrip...@redhat.com> wrote:
> > > > On Fri, Mar 07, 2025 at 08:42:46AM -0800, Rob Clark wrote:
> > > > > On Tue, Sep 24, 2024 at 5:27 AM Vignesh Raman
> > > > > > On 12/09/24 11:18, Dmitry Baryshkov wrote:
> > > > > > > On Mon, Sep 09, 2024 at 07:34:04AM GMT, Rob Clark wrote:
> > > > > > >> On Mon, Sep 9, 2024 at 2:54 AM Dmitry Baryshkov
> > > > > > >> <dmitry.barysh...@linaro.org> wrote:
> > > > > > >>>
> > > > > > >>> On Mon, 9 Sept 2024 at 10:50, Maxime Ripard 
> > > > > > >>> <mrip...@redhat.com> wrote:
> > > > > > >>>>
> > > > > > >>>> Hi,
> > > > > > >>>>
> > > > > > >>>> On Tue, Jul 09, 2024 at 01:27:51AM GMT, Dmitry Baryshkov wrote:
> > > > > > >>>>> On Mon, 8 Jul 2024 at 21:38, Rob Clark <robdcl...@gmail.com> 
> > > > > > >>>>> wrote:
> > > > > > >>>>>>
> > > > > > >>>>>> On Mon, Jul 8, 2024 at 1:52 AM Daniel Vetter 
> > > > > > >>>>>> <daniel.vet...@ffwll.ch> wrote:
> > > > > > >>>>>>>
> > > > > > >>>>>>> On Fri, Jul 05, 2024 at 12:31:36PM -0700, Rob Clark wrote:
> > > > > > >>>>>>>> On Fri, Jul 5, 2024 at 3:36 AM Daniel Vetter 
> > > > > > >>>>>>>> <daniel.vet...@ffwll.ch> wrote:
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>>> On Thu, Jul 04, 2024 at 08:40:26AM -0700, Rob Clark wrote:
> > > > > > >>>>>>>>>> On Thu, Jul 4, 2024 at 7:08 AM Daniel Vetter 
> > > > > > >>>>>>>>>> <daniel.vet...@ffwll.ch> wrote:
> > > > > > >>>>>>>>>>>
> > > > > > >>>>>>>>>>> On Tue, Jul 02, 2024 at 05:32:39AM -0700, Rob Clark 
> > > > > > >>>>>>>>>>> wrote:
> > > > > > >>>>>>>>>>>> On Fri, Jun 28, 2024 at 10:44 AM Daniel Vetter 
> > > > > > >>>>>>>>>>>> <dan...@ffwll.ch> wrote:
> > > > > > >>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>> On Thu, Jun 27, 2024 at 11:51:37AM -0700, Rob Clark 
> > > > > > >>>>>>>>>>>>> wrote:
> > > > > > >>>>>>>>>>>>>> On Wed, Jun 26, 2024 at 10:47 AM Daniel Vetter 
> > > > > > >>>>>>>>>>>>>> <dan...@ffwll.ch> wrote:
> > > > > > >>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>> On Wed, Jun 26, 2024 at 11:38:30AM +0300, Dmitry 
> > > > > > >>>>>>>>>>>>>>> Baryshkov wrote:
> > > > > > >>>>>>>>>>>>>>>> On Wed, Jun 26, 2024 at 09:32:44AM GMT, Daniel 
> > > > > > >>>>>>>>>>>>>>>> Vetter wrote:
> > > > > > >>>>>>>>>>>>>>>>> On Mon, Jun 24, 2024 at 10:25:25AM -0300, Helen 
> > > > > > >>>>>>>>>>>>>>>>> Koike wrote:
> > > > > > >>>>>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>>>>> On 24/06/2024 02:34, Vignesh Raman wrote:
> > > > > > >>>>>>>>>>>>>>>>>>> Hi,
> > > > > > >>>>>>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>>>>>> On 15/03/24 22:50, Rob Clark wrote:
> > > > > > >>>>>>>>>>>>>>>>>>>> Basically, I often find myself needing to 
> > > > > > >>>>>>>>>>>>>>>>>>>> merge CI patches on top of
> > > > > > >>>>>>>>>>>>>>>>>>>> msm-next in order to run CI, and then after a 
> > > > > > >>>>>>>>>>>>>>>>>>>> clean CI run, reset HEAD
> > > > > > >>>>>>>>>>>>>>>>>>>> back before the merge and force-push.  Which 
> > > > > > >>>>>>>>>>>>>>>>>>>> isn't really how things
> > > > > > >>>>>>>>>>>>>>>>>>>> should work.
> > > > > > >>>>>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>>>>> This sounds more like you want an integration 
> > > > > > >>>>>>>>>>>>>>>>> tree like drm-tip. Get msm
> > > > > > >>>>>>>>>>>>>>>>> branches integrated there, done. Backmerges just 
> > > > > > >>>>>>>>>>>>>>>>> for integration testing
> > > > > > >>>>>>>>>>>>>>>>> are not a good idea indeed.
> > > > > > >>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>> But AFAIU this doesn't help for pre-merge testing, 
> > > > > > >>>>>>>>>>>>>> ie. prior to a
> > > > > > >>>>>>>>>>>>>> patch landing in msm-next
> > > > > > >>>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>>> My idea was to have a drm-ci-next managed similar to 
> > > > > > >>>>>>>>>>>>>> drm-misc-next, if
> > > > > > >>>>>>>>>>>>>> we have needed drm/ci patches we could push them to 
> > > > > > >>>>>>>>>>>>>> drm-ci-next, and
> > > > > > >>>>>>>>>>>>>> then merge that into the driver tree (along with a 
> > > > > > >>>>>>>>>>>>>> PR from drm-ci-next
> > > > > > >>>>>>>>>>>>>> to Dave).
> > > > > > >>>>>>>>>>>>>
> > > > > > >>>>>>>>>>>>> I guess I'm confused about what kind of pre-merge 
> > > > > > >>>>>>>>>>>>> testing we're talking
> > > > > > >>>>>>>>>>>>> about then ... Or maybe this just doesn't work too 
> > > > > > >>>>>>>>>>>>> well with the linux
> > > > > > >>>>>>>>>>>>> kernel. The model is that you have some pile of 
> > > > > > >>>>>>>>>>>>> trees, they're split up,
> > > > > > >>>>>>>>>>>>> and testing of all the trees together is done in 
> > > > > > >>>>>>>>>>>>> integration trees like
> > > > > > >>>>>>>>>>>>> linux-next or drm-tip.
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>>> pre-merge: for msm we've been collecting up patches 
> > > > > > >>>>>>>>>>>> from list into a
> > > > > > >>>>>>>>>>>> fast-forward MR which triggers CI before merging to 
> > > > > > >>>>>>>>>>>> msm-next/msm-fixes
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>>> Ideally drm-misc and other trees would do similar, 
> > > > > > >>>>>>>>>>>> we'd catch more
> > > > > > >>>>>>>>>>>> regressions that way.  For example, in msm-next the 
> > > > > > >>>>>>>>>>>> nodebugfs build is
> > > > > > >>>>>>>>>>>> currently broken, because we merged drm-misc-next at a 
> > > > > > >>>>>>>>>>>> time when
> > > > > > >>>>>>>>>>>> komeda was broken:
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>>> https://gitlab.freedesktop.org/drm/msm/-/jobs/60575681#L9520
> > > > > > >>>>>>>>>>>>
> > > > > > >>>>>>>>>>>> If drm-misc was using pre-merge CI this would have 
> > > > > > >>>>>>>>>>>> been caught and the
> > > > > > >>>>>>>>>>>> offending patch dropped.
> > > > > > >>>>>>>>>>>
> > > > > > >>>>>>>>>>> That sounds more like we should push on the drm-misc 
> > > > > > >>>>>>>>>>> pre-merge CI boulder
> > > > > > >>>>>>>>>>> to move it uphill, than add even more trees to make it 
> > > > > > >>>>>>>>>>> even harder to get
> > > > > > >>>>>>>>>>> there long term ...
> > > > > > >>>>>>>>>>>
> > > > > > >>>>>>>>>>> Short term it helps locally to have finer trees, but 
> > > > > > >>>>>>>>>>> only short term and
> > > > > > >>>>>>>>>>> only very locally.
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>> The path to have fewer trees (ideally only one for all 
> > > > > > >>>>>>>>>> of drm) is to
> > > > > > >>>>>>>>>> use gitlab MRs to land everything :-)
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>> drm-ci-next is only a stop-gap.. but one that we need.  
> > > > > > >>>>>>>>>> The
> > > > > > >>>>>>>>>> ${branchname}-external-fixes trick covers _most_ cases 
> > > > > > >>>>>>>>>> where we need
> > > > > > >>>>>>>>>> unrelated patches (ie. to fix random ToT breakage 
> > > > > > >>>>>>>>>> outside of drm or in
> > > > > > >>>>>>>>>> core drm), but it doesn't help when the needed changes 
> > > > > > >>>>>>>>>> are yml
> > > > > > >>>>>>>>>> (because gitlab processes all the yml before merging the
> > > > > > >>>>>>>>>> -external-fixes branch).  This is where we need 
> > > > > > >>>>>>>>>> drm-ci-next, otherwise
> > > > > > >>>>>>>>>> we are having to create a separate MR which cherry-picks 
> > > > > > >>>>>>>>>> drm/ci
> > > > > > >>>>>>>>>> patches for doing the CI.  This is a rather broken 
> > > > > > >>>>>>>>>> process.
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>>> So what I don't get is ... if we CI drm-misc, how does 
> > > > > > >>>>>>>>> that not help
> > > > > > >>>>>>>>> improve the situation here? Step one would be post-merge 
> > > > > > >>>>>>>>> (i.e. just enable
> > > > > > >>>>>>>>> CI in the repo), then get MRs going.
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> I guess post-merge is better than nothing.. but pre-merge 
> > > > > > >>>>>>>> is better.
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> post-merge can work if you have a "sheriff" system where 
> > > > > > >>>>>>>> someone
> > > > > > >>>>>>>> (perhaps on a rotation) is actively monitoring results and 
> > > > > > >>>>>>>> "revert and
> > > > > > >>>>>>>> ask questions later" when something breaks.  Pre-merge 
> > > > > > >>>>>>>> ensures the
> > > > > > >>>>>>>> interested party is involved in the process ;-)
> > > > > > >>>>>>>
> > > > > > >>>>>>> So ... make that happen? And it doesn't have to be for all 
> > > > > > >>>>>>> of drm-misc,
> > > > > > >>>>>>> mesa after all switched over to MR also on a bit a 
> > > > > > >>>>>>> driver/area basis. So
> > > > > > >>>>>>> agreeing among all drm-ci folks to use gitlab MR in 
> > > > > > >>>>>>> drm-misc for pre-merge
> > > > > > >>>>>>> testing shouldn't be that hard to make happen. And unlike a 
> > > > > > >>>>>>> separate
> > > > > > >>>>>>> branch it's not some kind of detour with a good chance to 
> > > > > > >>>>>>> get stuck in a
> > > > > > >>>>>>> local optimum.
> > > > > > >>>>>>
> > > > > > >>>>>> Tree vs branch doesn't really have much in the way of 
> > > > > > >>>>>> distinction,
> > > > > > >>>>>> modulo gitlab permissions.  In that it doesn't do much good 
> > > > > > >>>>>> if drm/ci
> > > > > > >>>>>> patches are landing on a different branch.
> > > > > > >>>>>>
> > > > > > >>>>>> I guess what you are suggesting is that we have a single 
> > > > > > >>>>>> tree/branch
> > > > > > >>>>>> that drm/ci + drm/msm + (plus whoever else wants to get in 
> > > > > > >>>>>> on the
> > > > > > >>>>>> drm/ci, so probably at least vkms) lands patches into via 
> > > > > > >>>>>> gitlab MRs?
> > > > > > >>>>>
> > > > > > >>>>> This again brings a separate CI-enabled tree. I think it has 
> > > > > > >>>>> been
> > > > > > >>>>> suggested to start with enabling DRM CI for drm-misc 
> > > > > > >>>>> branches. Then we
> > > > > > >>>>> can possibly start landing MRs with CI testing (probably 
> > > > > > >>>>> starting with
> > > > > > >>>>> vkms).
> > > > > > >>>>
> > > > > > >>>> It's something we've discussed with Sima for a while, but 
> > > > > > >>>> enabling CI on
> > > > > > >>>> drm-misc at some point will make sense so we could just as 
> > > > > > >>>> well start
> > > > > > >>>> now.
> > > > > > >>>>
> > > > > > >>>> The biggest unknown at the moment to start doing so is how we 
> > > > > > >>>> could keep
> > > > > > >>>> drm-tip and the rerere repo with MR.
> > > > > > >>>
> > > > > > >>> Let's do a slow start and begin with post-merge testing. At 
> > > > > > >>> least this
> > > > > > >>> gives us an idea of how stable it is (and what does it take to 
> > > > > > >>> keep it
> > > > > > >>> green). Maybe we can perform post-merge testing for both 
> > > > > > >>> drm-misc and
> > > > > > >>> drm-tip.
> > > > > > >>
> > > > > > >> The one thing is that currently the runtime for igt is quite long
> > > > > > >> (~1hr) because you can't really parallelize kms tests.  So maybe
> > > > > > >> nightly scheduled runs would be a better idea.
> > > > > > >
> > > > > > > SGTM. So, the question would be, who can set it up?
> > > > > > >
> > > > > >
> > > > > > We will test the nightly pipelines in a forked repo and then will
> > > > > > set it up for drm-misc and drm-tip.
> > > > > >
> > > > >
> > > > > Revisiting this old thread...
> > > > >
> > > > > It's becoming increasingly clear that landing drm/ci changes via 
> > > > > drm-misc
> > > > > (where gitlab CI is not used) isn't working out.
> > > >
> > > > Why?
> > >
> > > Some of it has been breakage in drm/ci, which would have been noticed
> > > if drm/ci went thru CI.
> > >
> > > And some of it is that expectation changes in other drivers when we
> > > backmerge drm-next/drm-misc-next aren't cared for.  So we end up not
> > > enforcing a green-pipeline, and just ignoring failed jobs on other
> > > drivers.  I guess we could just make expectation updates for other
> > > drivers as part of drm/msm MRs, but that feels a bit strange.
> >
> > Yeah, ok, I can see how that can happen and how it can be frustrating :)
> >
> > > Related question, should we just smash expectation updates into the
> > > merge commit?  Or keep it a separate commit?  In the latter case, I
> > > think the expectation update commit should not need a r-b on list,
> > > since it is just updating expectations to the reality of changes
> > > merged elsewhere without a CI run.
> >
> > My gut feeling is that we should review that change still, because
> > people will mess the expectations up just to get that green light. But
> > both suck, really, so I'm not really sure.
> 
> Either way I'd want to notify maintainers of the respective drivers.
> If we need to add a new expected fail, due to a backmerge, the change
> that caused the failure has already been merged.  The expectations
> update is just to reflect reality, so not really much to review.


Still, the failure might be legitimate (ie, a regression was indeed
merged) or illegitimate (ie, the driver is correct and the test isn't,
or the test never worked).

The latter makes sense for an expectations list. However, in the former
case, we should definitely not update the expectations list.

> And I should mention, at this point I'm not proposing opening up MRs
> to _everyone_, just to driver maintainers of whichever drivers want
> to participate.

ACK

> > > > Also, tbf, drm-misc doesn't use gitlab CI because nobody solved the
> > > > drm-tip issue. And because the mail I sent 6 months ago about didn't get
> > > > any feedback.
> > >
> > > If gitlab MRs were used to land changes in drm-misc, this would solve
> > > the problem.  I'd be happy to start landing drm/msm changes through
> > > that path.
> > >
> > > But I don't think there is any good way to mix/match gitlab MRs with
> > > other processes (dim, or raw git) in the same tree.  We need to be
> > > able to enforce a green CI run to land changes.
> >
> > So, I agree, and it's been the intent to make drm-misc this since even
> > before we moved to Gitlab.
> >
> > But there's different things to consider:
> >
> > - We frequently enough merge patches affecting other subsystems that
> >   will need to be notified, and will need to review our work. Just
> >   saying that a random maintainer that gets 2 MR each year from us
> >   should watch our Gitlab frequently doesn't work. So, how do we solve
> >   that? Do we take it to the ML, and once we get their AB, we merge it
> >   through a MR?
> >
> > - The kernel kind of mandates of maintainers / committers to put their
> >   Signed-off-by, how do we automate that in Gitlab?
> >
> > - Similarly, Linus *really* likes signed PRs, how do we make one
> >   switching to Gitlab?
> >
> > - Do we require people to still use dim, or do we just drop dim?
> >
> > - Do we have enough runners to begin with?
> >
> > - And the final one that really stinks: how do we make sure we send the
> >   right branches to linux-next?
> >
> > So far, nobody really decided or solved those things. And we would have
> > to solve this (except for dim) for the tree you're suggesting, because
> > most of it really is about dealing with the impedance mismatch with the
> > rest of the kernel.
> 
> What we've been doing with drm/msm (since v5.18) is pretty transparent
> to patch submitters and PR consumers:
> 
> 1) Patches sent to list and reviewed on list
> 
> 2) Driver maintainers (Dmitry, Abhinav, and myself) take turns
>    collecting up reviewed patches from list into an MR
> 
> 3) The gitlab tree is set to only allow ff merges, which sidesteps
>    the signoff/etc issues.  It would be nice if gitlab (or margebot?
>    Or something?) could apply the s-o-b's, etc.  We'd need that if
>    we wanted to open things up for submitters to send MRs directly.
> 
> 4) Gitlab CI runs on the MR..  so in addition to ad-hoc local testing,
>    we have on-device igt runs, a variety of build tests, etc.  (I'd
>    like to get to having deqp runs with a known good mesa, but we
>    aren't there yet.)
> 
>    This part is still a bit messy because we frequently have to pull
>    in drm/ci fixes, etc.  Which is what I'd like to fix by having a
>    drm-gitlab tree which is used for drm/ci and drm/msm patches (and
>    open for other drivers to join in).
> 
>    We do have an ${target_branch}-external fixes mechanism which we
>    can use for required fixes that would land via some other tree,
>    in case (for ex) one of the CI boards doesn't boot properly without
>    some fix in another subsystem.  But this has the limitation that
>    it can't deal with yaml changes, since the branch is merged in the
>    build job, after gitlab has already slurped in all the .yml
> 
> 5) PRs sent to Dave and Sima are tagged and sent as normal.  It isn't
>    automated (yet), but the PR notes are collected up from however
>    many MRs were merged.  (Typically one or two for display side
>    changes from Dmitry/Abhinav and maybe one or two for GEM/GPU/etc
>    from myself.)
> 
> You can sum all that up as just inserting an MR step into the process.
> Ie. rather than push directly to msm-next, I push to msm-next-robclark.
> Which then gets merged thru gitlab after a CI run.
> 
> It might be possible to implement a similar thing with dim, ie. the
> drm-misc maintainers push to drm-misc-next-staging which gets merged
> into drm-misc-next after an MR pipeline?

Yeah, it looks like a great intermediate step for drm-misc indeed.

> > We talked about it some time ago with Dave, and we concluded that the
> > best course of action was to create a few actions for DRM to run when we
> > send PR (like a build test and kunit run), and then expand and figure
> > things out as we use it more and more.
> 
> Build test and kunit can accomplish _some_ things.. but kunit in
> particular can only test the hw you have.

That's not entirely true. We have plenty of mocked tests already. It
won't test (all) the drivers, but it can already test a significant
bunch of the framework. And you can cross-compile for arch specific
tests.

> But gitlab CI can leverage the CI farm we already have in place for
> mesa, and run on-device CI across a range of DUTs. I don't see how you
> can replicate that any other way.
> 
> (Not to mention the side benefits of having a record of test runs and
> artifacts.)

I wasn't saying it was enough or the end goal, I was saying it was the
first step we agreed upon :)

Maxime

signature.asc
Description: PGP signature

Re: Time for drm-ci-next?

Reply via email to