pci: Disable PCI_ERR_UNCOR_MASK register for machine type < 8.0

Juan Quintela Thu, 18 May 2023 08:20:50 -0700

Peter Xu <pet...@redhat.com> wrote:
> On Thu, May 18, 2023 at 01:33:43PM +0200, Juan Quintela wrote:
>> See patch for documentation:
>> 
>> https://lists.gnu.org/archive/html/qemu-devel/2023-05/msg03288.html
>> 
>> Basically, the best we can do is:
>> - get the patch posted.  Fixes everything except:
>>   (3) qemu-8.0 -M pc-7.2 -> qemu-8.0.1 -M pc-7.2 works
>> 
>> And for that, we can document somewhere that we need to launch
>> qemu-8.0.1 as:
>> 
>> $ qemu-8.0.1 -M pc-7.2 -device blah,x-pci-err-unc-mask=on
>
> One thing we can also do to avoid it in the future is simply having someone
> do this check around each softfreeze (and we'll also need maintainers be
> careful on merging anything that's risky though after softfreeze) rather
> than after release (what I did this time, which is late), trying to cover
> as many devices as possible.  I don't know whether there's a way to always
> cover all devices.
>
> I'll volunteer myself for that as long as I'll remember.  Juan, please also
> have a check or remind me if I didn't. :)
>
> I am not sure whether I mentioned it somewhere before, but maybe it'll work
> if we can also have some way we check migrating each of the vmsd from
> old-qemu to new-qemu (and also new->old) covering all devices.  It doesn't
> need to be a real migration, just generate the per-device stream and try
> loading on the other binary.
>
> It might be an overkill to be part of CI to check each commit, but if
> there's some way to check it then at least we can run it also after
> softfreeze.  I also don't know whether it'll be easy to achieve it at all,
> but I'll think more about it too and update if I found something useful.


Hi Peter

First, thanks for volunteering.

And next, I think this is done better as part of avocado.  Several
reasons:
- We need two different qemu binaries.
- We want to run it perhaps daily.
- We want to report any problems.

I will start with something really simple.  Like taking the
migration-test test cases that we have, and just running them in both
directions, i.e. new -> old and old -> new.

That will give us a baseline:
- x86_64
- i386
- aarch64
- ppc
- s390

I think nothing else cares about versioned machine types right now.
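
To make the baseline concrete, here is a rough sketch of the matrix it
implies, in Python since that is what avocado tests are written in.  The
"old"/"new" build labels and the migration_matrix() helper are made up
for illustration; this is not an existing avocado or QEMU interface, and
it only enumerates the combinations rather than launching anything.

```python
from itertools import permutations

# Architectures from the baseline list above.
ARCHES = ["x86_64", "i386", "aarch64", "ppc", "s390"]

# Hypothetical labels for the two qemu builds under test.
BUILDS = ["old", "new"]


def migration_matrix(arches=ARCHES, builds=BUILDS):
    """Return (arch, source_build, dest_build) for each cross-version direction."""
    return [
        (arch, src, dst)
        for arch in arches
        # permutations() yields every ordered pair of distinct builds,
        # so both old -> new and new -> old are covered.
        for src, dst in permutations(builds, 2)
    ]


if __name__ == "__main__":
    for arch, src, dst in migration_matrix():
        print(f"{arch}: {src} -> {dst}")
```

Each tuple would map to one run of the existing migration-test cases,
with the source and destination binaries swapped between the two runs
for a given architecture.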

Once the mechanism is working and the reporting is sent somewhere, we
can go from there and add machines with the devices that we care about.

But just the example that I showed would have detected the problem that
we are talking about.

After that I would make sure that we are checking all virtio devices,
with/without vhost.

And once we have done that, the device authors that care about their
devices will add tests to the infrastructure.

Later, Juan.
