On Wed, Jul 14, 2021 at 4:40 PM Jason Wang <jasow...@redhat.com> wrote: > > > 在 2021/7/14 下午2:04, Bin Meng 写道: > > On Wed, Jul 14, 2021 at 12:53 PM Jason Wang <jasow...@redhat.com> wrote: > >> > >> 在 2021/7/14 上午11:42, Bin Meng 写道: > >>> On Wed, Jul 14, 2021 at 11:10 AM Jason Wang <jasow...@redhat.com> wrote: > >>>> 在 2021/7/13 下午5:11, Bin Meng 写道: > >>>>> On Tue, Jul 13, 2021 at 5:02 PM Jason Wang <jasow...@redhat.com> wrote: > >>>>>> 在 2021/7/13 下午4:36, Bin Meng 写道: > >>>>>>> On Tue, Jul 13, 2021 at 3:03 PM Jason Wang <jasow...@redhat.com> > >>>>>>> wrote: > >>>>>>>> 在 2021/7/13 上午7:06, Bin Meng 写道: > >>>>>>>>> On Mon, Jul 5, 2021 at 1:57 PM Bin Meng <bmeng...@gmail.com> wrote: > >>>>>>>>>> On Mon, Jul 5, 2021 at 12:21 PM Jason Wang <jasow...@redhat.com> > >>>>>>>>>> wrote: > >>>>>>>>>>> 在 2021/7/2 下午5:24, Bin Meng 写道: > >>>>>>>>>>>> From: Christina Wang <christina.w...@windriver.com> > >>>>>>>>>>>> > >>>>>>>>>>>> The initial value of VLAN Ether Type (VET) register is 0x8100, > >>>>>>>>>>>> as per > >>>>>>>>>>>> the manual and real hardware. > >>>>>>>>>>>> > >>>>>>>>>>>> While Linux e1000 driver always writes VET register to 0x8100, > >>>>>>>>>>>> it is > >>>>>>>>>>>> not always the case for everyone. Drivers relying on the reset > >>>>>>>>>>>> value > >>>>>>>>>>>> of VET won't be able to transmit and receive VLAN frames in QEMU. > >>>>>>>>>>>> > >>>>>>>>>>>> Reported-by: Markus Carlstedt <markus.carlst...@windriver.com> > >>>>>>>>>>>> Signed-off-by: Christina Wang <christina.w...@windriver.com> > >>>>>>>>>>>> Signed-off-by: Bin Meng <bin.m...@windriver.com> > >>>>>>>>>>>> --- > >>>>>>>>>>>> > >>>>>>>>>>>> (no changes since v1) > >>>>>>>>>>>> > >>>>>>>>>>>> hw/net/e1000.c | 2 ++ > >>>>>>>>>>>> 1 file changed, 2 insertions(+) > >>>>>>>>>>>> > >>>>>>>>>>>> diff --git a/hw/net/e1000.c b/hw/net/e1000.c > >>>>>>>>>>>> index 4f75b44cfc..20cbba6411 100644 > >>>>>>>>>>>> --- a/hw/net/e1000.c > >>>>>>>>>>>> +++ b/hw/net/e1000.c > >>>>>>>>>>>> @@ -29,6 +29,7 @@ > >>>>>>>>>>>> #include "hw/pci/pci.h" > >>>>>>>>>>>> #include "hw/qdev-properties.h" > >>>>>>>>>>>> #include "migration/vmstate.h" > >>>>>>>>>>>> +#include "net/eth.h" > >>>>>>>>>>>> #include "net/net.h" > >>>>>>>>>>>> #include "net/checksum.h" > >>>>>>>>>>>> #include "sysemu/sysemu.h" > >>>>>>>>>>>> @@ -254,6 +255,7 @@ static const uint32_t mac_reg_init[] = { > >>>>>>>>>>>> [MANC] = E1000_MANC_EN_MNG2HOST | > >>>>>>>>>>>> E1000_MANC_RCV_TCO_EN | > >>>>>>>>>>>> E1000_MANC_ARP_EN | E1000_MANC_0298_EN | > >>>>>>>>>>>> E1000_MANC_RMCP_EN, > >>>>>>>>>>>> + [VET] = ETH_P_VLAN, > >>>>>>>>>>> I wonder if we need a compat flag for this, since we change the > >>>>>>>>>>> behavior. > >>>>>>>>>>> > >>>>>>>>>>> (See e1000_properties[]) > >>>>>>>>>>> > >>>>>>>>>> No we don't need to since it does not break migration. > >>>>>>>>> Ping? > >>>>>>>> I admit migration "works" but it doesn't mean it's not broken. It > >>>>>>>> changes the guest visible default value of VET register, so it may > >>>>>>>> break > >>>>>>>> things silently for the guest. > >>>>>>>> > >>>>>>>> For old machine types, we should stick the value to the one without > >>>>>>>> this > >>>>>>>> fix. > >>>>>>> Could you please propose a solution on how to handle such a scenario > >>>>>>> in a generic way in QEMU? (+Peter) > >>>>>> Well, I think I've suggested you to have a look at how things is done > >>>>>> in > >>>>>> for handling such compatibility in e1000_properties. > >>>>>> > >>>>>> > >>>>>>> The POR reset value is wrong in QEMU and has carried forward the wrong > >>>>>>> value for years, and correcting it to its right value needs to do > >>>>>>> what? > >>>>>> We should stick to the wrong behavior for old machine types. > >>>>>> > >>>>>> That's all. > >>>>> So that means the following SD patch is also wrong (+Philippe) which > >>>>> changes the default value of capability register. > >>>>> http://patchwork.ozlabs.org/project/qemu-devel/patch/20210623185921.24113-1-joanneko...@gmail.com/ > >>>> It should compat capareg for the old value for old machine types. > >>> Yeah, it's already a property for the SD controller model but someone > >>> views it as a bug because the model implements 64-bit but not > >>> reporting it in the capability register. > >>> > >>>>> Can we get some agreement among maintainers? > >>>> It's not about the agreement but about to have a stable ABI. I don't > >>>> know the case for sd but e1000 is used in various and we work hard to > >>>> unbreak the migration compatibility among downstream versions. Git log > >>>> on e1000.c will tell you more. > >>> Agreement or stable ABI, whatever we call, but we should be in some > >>> consistency. > >>> > >>> IMHO maintainers should reach an agreement to some extent on how > >>> compatibility should be achieved. I just found silly to add a property > >>> to fix a real bug in the model, and we preserve the bug all over > >>> releases. > >> > >> That's the price for the stable ABI. See one of my recent fix - > >> d83f46d189 virtio-pci: compat page aligned ATS. It keeps the "buggy" > >> behavior to unbreak the migration. > >> > > But this series does not break the migration, as we discussed in the > > previous thread. > > > It actually did, > > (qemu) qemu-kvm: get_pci_config_device: Bad config data: i=0x104 read: > 0 device: 20 cmask: ff wmask: 0 w1cmask:0 > > Since the register is RO.
Did you mean the VET register is read-only? I don't understand this as the manual says it's RW and Linux driver programs it to 0x8100. > > > > > >>> I can find plenty of examples in the current QEMU tree that were > >>> accepted that changed the bugous register behavior, but it was not > >>> asked to add new properties to keep the bugos behavior. > >>> > >>> e.g.: commit ce8e43760e8e ("hw/net: fsl_etsec: Reverse the RCTRL.RSF > >>> logic") > >> > >> I guess it's simply because fsl_etsec is not used in any > >> distributions/production environments or the maintainer may just not > >> notice things like this. > >> > >> But for e1000(e), we should stick to a stable ABI for consistency. > >> Otherwise it would be very tricky to fix them after we saw real issues. > >> We had learnt a lot during the past decade. > >> > > Okay, do we have such a kind of widely used device model list? And we > > should document such a process that we should keep compatibility on > > these devices as well. > > > I can only say for networking devices: > > - e1000(e) > - rtl8139 > - virtio-net > Please, at least have these documented as a development process. > > > > Regarding this VET register, do you know what guest relies on the POR > > value which is zero? > > > I don't know. > > > > Zero is not a valid ethernet VLAN type. I don't > > think changing this will break any guests. > > > You might be right here, but it would be late if we find it breaks any one. > > If it's not a lot of work, I tend to bother with compat stuffs for this. > Regards, Bin