Re: [Qemu-devel] [RFC PATCH 0/5] Introduce Intel 82574 GbE Controller Emulation (e1000e)

Jason Wang Mon, 02 Nov 2015 21:46:22 -0800


On 11/02/2015 03:49 PM, Dmitry Fleytman wrote:
>
>> On 2 Nov 2015, at 05:35 AM, Jason Wang <jasow...@redhat.com
>> <mailto:jasow...@redhat.com>> wrote:
>>
>>
>>
>> On 10/31/2015 01:52 PM, Dmitry Fleytman wrote:
>>> Hello Jason,
>>>
>>> Thanks for reviewing. See my answers inline.
>>>
>>>
>>>> On 30 Oct 2015, at 07:28 AM, Jason Wang <jasow...@redhat.com
>>>> <mailto:jasow...@redhat.com>
>>>> <mailto:jasow...@redhat.com>> wrote:
>>>>
>>>>
>>>>
>>>> On 10/28/2015 01:44 PM, Jason Wang wrote:
>>>>>
>>>>> On 10/26/2015 01:00 AM, Leonid Bloch wrote:
>>>>>> Hello qemu-devel,
>>>>>>
>>>>>> This patch series is an RFC for the new networking device emulation
>>>>>> we're developing for QEMU.
>>>>>>
>>>>>> This new device emulates the Intel 82574 GbE Controller and works
>>>>>> with unmodified Intel e1000e drivers from the Linux/Windows kernels.
>>>>>>
>>>>>> The status of the current series is "Functional Device Ready, work
>>>>>> on Extended Features in Progress".
>>>>>>
>>>>>> More precisely, these patches represent a functional device, which
>>>>>> is recognized by the standard Intel drivers, and is able to transfer
>>>>>> TX/RX packets with CSO/TSO offloads, according to the spec.
>>>>>>
>>>>>> Extended features not supported yet (work in progress):
>>>>>> 1. TX/RX Interrupt moderation mechanisms
>>>>>> 2. RSS
>>>>>> 3. Full-featured multi-queue (use of multiqueued network backend)
>>>>>>
>>>>>> Also, there will be some code refactoring and performance
>>>>>> optimization efforts.
>>>>>>
>>>>>> This series was tested on Linux (Fedora 22) and Windows (2012R2)
>>>>>> guests, using Iperf, with TX/RX and TCP/UDP streams, and various
>>>>>> packet sizes.
>>>>>>
>>>>>> More thorough testing, including data streams with different MTU
>>>>>> sizes, and Microsoft Certification (HLK) tests, are pending missing
>>>>>> features' development.
>>>>>>
>>>>>> See commit messages (esp. "net: Introduce e1000e device emulation")
>>>>>> for more information about the development approaches and the
>>>>>> architecture options chosen for this device.
>>>>>>
>>>>>> This series is based upon v2.3.0 tag of the upstream QEMU repository,
>>>>>> and it will be rebased to latest before the final submission.
>>>>>>
>>>>>> Please share your thoughts - any feedback is highly welcomed :)
>>>>>>
>>>>>> Best Regards,
>>>>>> Dmitry Fleytman.
>>>>> Thanks for the series. Will go through this in next few days.
>>>>
>>>> Have a quick glance at the series, got the following questions:
>>>>
>>>> - Though e1000e differs from e1000 in many places, I still see lots of
>>>> code duplications. We need consider to reuse e1000.c (or at least part
>>>> of). I believe we don't want to fix a bug twice in two places in the
>>>> future and I expect hundreds of lines could be saved through this way.
>>>
>>> That’s a good question :)
>>>
>>> This is how we started, we had a common “core” code base meant to
>>> implement all common logic (this split is still present in the patches
>>> - there are e1000e_core.c and e1000e.c files).
>>> Unfortunately at some point it turned out that there are more
>>> differences that commons. We noticed that the code becomes filled with
>>> many minor differences handling.
>>> This also made the code base more complicated and harder to follow.
>>>
>>> So at some point of time it was decided to split the code base and
>>> revert all changes done to the e1000 device (except a few
>>> fixes/improvements Leonid submitted a few days ago).
>>>
>>> Although there was common code between devices, total SLOC of e1000
>>> and e1000e devices became smaller after the split.
>>>
>>> Amount of code that may be shared between devices will be even smaller
>>> after we complete the implementation which still misses a few features
>>> (see cover letter) that will change many things.
>>>
>>> Still after the device implementation is done, we plan to review code
>>> similarities again to see if there are possibilities for code sharing.
>>
>> I see, but if we can try to re-use or unify the codes from beginning, it
>> would be a little bit easier. Looks like the differences were mainly:
>>
>> 1) MSI-X support
>> 2) offloading support through virtio-net header
>> 3) trace points
>> 4) other new functions through e1000e specific registers
>>
>> So we could first unify the code through implementing the support of 2
>> and 3 for e1000. For MSI-X and other e1000e specific new functions, it
>> could be done through:
>>
>> 1) model specific callbacks, e.g realize, transmission and reception
>> 2) A new register flags e.g PHY_RW_E1000E which means the register is
>> for e1000e only. Or even model specific wirteops and readops
>> 3) For other subtle differences, it could be done in the code by
>> checking the model explicitly.
>>
>> What's your opinion? (A good example of code sharing is freebsd's e1000
>> driver which covers both 8254x and 8257x).
>
>
> Hi Jason,
>
> This is exactly how we started.
>
> Issues that made us split the code base were following:
>
> 1. The majority of registers are different. Even same registers in
> many cases have different bits and require different processing logic.
> This introduced too much if’s into the code;


Then we can probably have different writeops and readops. This way, we
can at least save the codes of common registers.

> 2. The data path is totally different not just because of virtio
> headers but also because these devices use different descriptor
> layouts and require different logic in order to parse and fill those.
> There are legacy descriptors that look almost the same but of course
> we must support all descriptor types described by spec;

Yes, but looks like the only extend rx/tx descriptors is 8257x specific.
And 8257x fully supports both legacy and context descriptor of 8254x.
This give us another chance to reuse the code.

> 3. Interrupt handling logic is different because of MSI support;

Right, but this is not hard to address, probably a new helper.

> 4. Mutli-queue and RSS make the code even less similar to the old device;

Yes, this could be in 8275x specific file.

> 5. Changes required to isolate shared code base required changes in
> migration tables and fishy tricks to preserve compatibility with
> current implementation;

Since 8257x is a totally new device, it can has its own vmstate if it's
simpler to be implemented and we don't even need to care migration
compatibility.

> 6. e1000 code suffered from massive changes which are very hard to
> verify because spec is big and there are no drivers that use all
> device features.

Then we can try to change e1000 as mini as possible. E.g just factor out
the common logic to helpers to reuse it in e1000e.

>
> Overall, code for handling differences in device behaviours was bigger
> and more complicated then the device logic itself. The difference is
> not subtle when it comes to the full featured device implementation.
> As for FreeBSD’s driver, I’m not deeply familiar with its code but I
> suspect it works with device in legacy mode which is pretty similar to
> an old device indeed. Since we must support all modes our situation is
> different.

Yes, it does not use extended descriptor format.

>
> Again, I’m totally into shared code and would like to have some common
> code base anyway. Currently it looks like the best way to achieve this
> is to finish with all device features and then see what parts of logic
> may be shared between the old and the new devices. It’s better to have
> slight code duplication and smaller shared code base than to have
> bloated and tricky shared code that will introduce its own problems to
> both devices.

The code duplication is not slight in this rfc :). So the code has the
possibility to be unified. But I'm ok to evaluate this after all
features were developed.

Thanks

>
> Best Regards,
> Dmitry
>
>>
>>>
>>>> - For e1000e it self, since it was a new device, so no need to care
>>>> about compatibility stuffs (e.g auto negotiation and mit). We can just
>>>> enable them forever.
>>>
>>> Yes, we have this in plans.
>>>
>>>> - And for the generic packet abstraction layer, what's the
>>>> advantages of
>>>> this? If it has lot, maybe we can use it in other nic model (e.g
>>>> virtio-net)?
>>>
>>> These abstractions were initially developed by me as a part of vmxnet3
>>> device to be generic and re-usable. Their main advantage is support
>>> for virtio headers for virtio-enabled backends and emulation of
>>> network offloads in software for backends that do not support virtio.
>>>
>>> Of course they may be re-used by virtio, however I’m not sure if it
>>> will be really useful because virtio has feature negotiation
>>> facilities and do not require SW emulation for network task offloads.
>>>
>>> For other devices they are useful because each and every device that
>>> requires SW offloads implementation need to do exactly the same things
>>> and it doesn’t make sense to have a few implementations for this.
>>>
>>> Best Regards,
>>> Dmitry
>>
>> Ok, thanks for the explanation.
>>
>>>
>>>>
>>>> Thanks
>>>>
>>>>>
>>>>> Since 2.5 is in soft freeze, this looks a 2.6 material.
>

Re: [Qemu-devel] [RFC PATCH 0/5] Introduce Intel 82574 GbE Controller Emulation (e1000e)

Reply via email to