> On 3 Nov 2015, at 07:44 AM, Jason Wang <jasow...@redhat.com> wrote: > > > > On 11/02/2015 03:49 PM, Dmitry Fleytman wrote: >> >>> On 2 Nov 2015, at 05:35 AM, Jason Wang <jasow...@redhat.com >>> <mailto:jasow...@redhat.com> >>> <mailto:jasow...@redhat.com <mailto:jasow...@redhat.com>>> wrote: >>> >>> >>> >>> On 10/31/2015 01:52 PM, Dmitry Fleytman wrote: >>>> Hello Jason, >>>> >>>> Thanks for reviewing. See my answers inline. >>>> >>>> >>>>> On 30 Oct 2015, at 07:28 AM, Jason Wang <jasow...@redhat.com >>>>> <mailto:jasow...@redhat.com> >>>>> <mailto:jasow...@redhat.com <mailto:jasow...@redhat.com>> >>>>> <mailto:jasow...@redhat.com>> wrote: >>>>> >>>>> >>>>> >>>>> On 10/28/2015 01:44 PM, Jason Wang wrote: >>>>>> >>>>>> On 10/26/2015 01:00 AM, Leonid Bloch wrote: >>>>>>> Hello qemu-devel, >>>>>>> >>>>>>> This patch series is an RFC for the new networking device emulation >>>>>>> we're developing for QEMU. >>>>>>> >>>>>>> This new device emulates the Intel 82574 GbE Controller and works >>>>>>> with unmodified Intel e1000e drivers from the Linux/Windows kernels. >>>>>>> >>>>>>> The status of the current series is "Functional Device Ready, work >>>>>>> on Extended Features in Progress". >>>>>>> >>>>>>> More precisely, these patches represent a functional device, which >>>>>>> is recognized by the standard Intel drivers, and is able to transfer >>>>>>> TX/RX packets with CSO/TSO offloads, according to the spec. >>>>>>> >>>>>>> Extended features not supported yet (work in progress): >>>>>>> 1. TX/RX Interrupt moderation mechanisms >>>>>>> 2. RSS >>>>>>> 3. Full-featured multi-queue (use of multiqueued network backend) >>>>>>> >>>>>>> Also, there will be some code refactoring and performance >>>>>>> optimization efforts. >>>>>>> >>>>>>> This series was tested on Linux (Fedora 22) and Windows (2012R2) >>>>>>> guests, using Iperf, with TX/RX and TCP/UDP streams, and various >>>>>>> packet sizes. >>>>>>> >>>>>>> More thorough testing, including data streams with different MTU >>>>>>> sizes, and Microsoft Certification (HLK) tests, are pending missing >>>>>>> features' development. >>>>>>> >>>>>>> See commit messages (esp. "net: Introduce e1000e device emulation") >>>>>>> for more information about the development approaches and the >>>>>>> architecture options chosen for this device. >>>>>>> >>>>>>> This series is based upon v2.3.0 tag of the upstream QEMU repository, >>>>>>> and it will be rebased to latest before the final submission. >>>>>>> >>>>>>> Please share your thoughts - any feedback is highly welcomed :) >>>>>>> >>>>>>> Best Regards, >>>>>>> Dmitry Fleytman. >>>>>> Thanks for the series. Will go through this in next few days. >>>>> >>>>> Have a quick glance at the series, got the following questions: >>>>> >>>>> - Though e1000e differs from e1000 in many places, I still see lots of >>>>> code duplications. We need consider to reuse e1000.c (or at least part >>>>> of). I believe we don't want to fix a bug twice in two places in the >>>>> future and I expect hundreds of lines could be saved through this way. >>>> >>>> That’s a good question :) >>>> >>>> This is how we started, we had a common “core” code base meant to >>>> implement all common logic (this split is still present in the patches >>>> - there are e1000e_core.c and e1000e.c files). >>>> Unfortunately at some point it turned out that there are more >>>> differences that commons. We noticed that the code becomes filled with >>>> many minor differences handling. >>>> This also made the code base more complicated and harder to follow. >>>> >>>> So at some point of time it was decided to split the code base and >>>> revert all changes done to the e1000 device (except a few >>>> fixes/improvements Leonid submitted a few days ago). >>>> >>>> Although there was common code between devices, total SLOC of e1000 >>>> and e1000e devices became smaller after the split. >>>> >>>> Amount of code that may be shared between devices will be even smaller >>>> after we complete the implementation which still misses a few features >>>> (see cover letter) that will change many things. >>>> >>>> Still after the device implementation is done, we plan to review code >>>> similarities again to see if there are possibilities for code sharing. >>> >>> I see, but if we can try to re-use or unify the codes from beginning, it >>> would be a little bit easier. Looks like the differences were mainly: >>> >>> 1) MSI-X support >>> 2) offloading support through virtio-net header >>> 3) trace points >>> 4) other new functions through e1000e specific registers >>> >>> So we could first unify the code through implementing the support of 2 >>> and 3 for e1000. For MSI-X and other e1000e specific new functions, it >>> could be done through: >>> >>> 1) model specific callbacks, e.g realize, transmission and reception >>> 2) A new register flags e.g PHY_RW_E1000E which means the register is >>> for e1000e only. Or even model specific wirteops and readops >>> 3) For other subtle differences, it could be done in the code by >>> checking the model explicitly. >>> >>> What's your opinion? (A good example of code sharing is freebsd's e1000 >>> driver which covers both 8254x and 8257x). >> >> >> Hi Jason, >> >> This is exactly how we started. >> >> Issues that made us split the code base were following: >> >> 1. The majority of registers are different. Even same registers in >> many cases have different bits and require different processing logic. >> This introduced too much if’s into the code; > > Then we can probably have different writeops and readops. This way, we > can at least save the codes of common registers. > >> 2. The data path is totally different not just because of virtio >> headers but also because these devices use different descriptor >> layouts and require different logic in order to parse and fill those. >> There are legacy descriptors that look almost the same but of course >> we must support all descriptor types described by spec; > > Yes, but looks like the only extend rx/tx descriptors is 8257x specific. > And 8257x fully supports both legacy and context descriptor of 8254x. > This give us another chance to reuse the code. > >> 3. Interrupt handling logic is different because of MSI support; > > Right, but this is not hard to address, probably a new helper. > >> 4. Mutli-queue and RSS make the code even less similar to the old device; > > Yes, this could be in 8275x specific file. > >> 5. Changes required to isolate shared code base required changes in >> migration tables and fishy tricks to preserve compatibility with >> current implementation; > > Since 8257x is a totally new device, it can has its own vmstate if it's > simpler to be implemented and we don't even need to care migration > compatibility. > >> 6. e1000 code suffered from massive changes which are very hard to >> verify because spec is big and there are no drivers that use all >> device features. > > Then we can try to change e1000 as mini as possible. E.g just factor out > the common logic to helpers to reuse it in e1000e. > >> >> Overall, code for handling differences in device behaviours was bigger >> and more complicated then the device logic itself. The difference is >> not subtle when it comes to the full featured device implementation. >> As for FreeBSD’s driver, I’m not deeply familiar with its code but I >> suspect it works with device in legacy mode which is pretty similar to >> an old device indeed. Since we must support all modes our situation is >> different. > > Yes, it does not use extended descriptor format. > >> >> Again, I’m totally into shared code and would like to have some common >> code base anyway. Currently it looks like the best way to achieve this >> is to finish with all device features and then see what parts of logic >> may be shared between the old and the new devices. It’s better to have >> slight code duplication and smaller shared code base than to have >> bloated and tricky shared code that will introduce its own problems to >> both devices. > > The code duplication is not slight in this rfc :). So the code has the > possibility to be unified. But I'm ok to evaluate this after all > features were developed.
Hello Jason, I’m strongly into making code as much shared as possible :) Let’s see how it goes with the final code… Thanks for your comments, Dmitry. > > Thanks > >> >> Best Regards, >> Dmitry >> >>> >>>> >>>>> - For e1000e it self, since it was a new device, so no need to care >>>>> about compatibility stuffs (e.g auto negotiation and mit). We can just >>>>> enable them forever. >>>> >>>> Yes, we have this in plans. >>>> >>>>> - And for the generic packet abstraction layer, what's the >>>>> advantages of >>>>> this? If it has lot, maybe we can use it in other nic model (e.g >>>>> virtio-net)? >>>> >>>> These abstractions were initially developed by me as a part of vmxnet3 >>>> device to be generic and re-usable. Their main advantage is support >>>> for virtio headers for virtio-enabled backends and emulation of >>>> network offloads in software for backends that do not support virtio. >>>> >>>> Of course they may be re-used by virtio, however I’m not sure if it >>>> will be really useful because virtio has feature negotiation >>>> facilities and do not require SW emulation for network task offloads. >>>> >>>> For other devices they are useful because each and every device that >>>> requires SW offloads implementation need to do exactly the same things >>>> and it doesn’t make sense to have a few implementations for this. >>>> >>>> Best Regards, >>>> Dmitry >>> >>> Ok, thanks for the explanation. >>> >>>> >>>>> >>>>> Thanks >>>>> >>>>>> >>>>>> Since 2.5 is in soft freeze, this looks a 2.6 material.