On 11/02/2015 03:49 PM, Dmitry Fleytman wrote: > >> On 2 Nov 2015, at 05:35 AM, Jason Wang <jasow...@redhat.com >> <mailto:jasow...@redhat.com>> wrote: >> >> >> >> On 10/31/2015 01:52 PM, Dmitry Fleytman wrote: >>> Hello Jason, >>> >>> Thanks for reviewing. See my answers inline. >>> >>> >>>> On 30 Oct 2015, at 07:28 AM, Jason Wang <jasow...@redhat.com >>>> <mailto:jasow...@redhat.com> >>>> <mailto:jasow...@redhat.com>> wrote: >>>> >>>> >>>> >>>> On 10/28/2015 01:44 PM, Jason Wang wrote: >>>>> >>>>> On 10/26/2015 01:00 AM, Leonid Bloch wrote: >>>>>> Hello qemu-devel, >>>>>> >>>>>> This patch series is an RFC for the new networking device emulation >>>>>> we're developing for QEMU. >>>>>> >>>>>> This new device emulates the Intel 82574 GbE Controller and works >>>>>> with unmodified Intel e1000e drivers from the Linux/Windows kernels. >>>>>> >>>>>> The status of the current series is "Functional Device Ready, work >>>>>> on Extended Features in Progress". >>>>>> >>>>>> More precisely, these patches represent a functional device, which >>>>>> is recognized by the standard Intel drivers, and is able to transfer >>>>>> TX/RX packets with CSO/TSO offloads, according to the spec. >>>>>> >>>>>> Extended features not supported yet (work in progress): >>>>>> 1. TX/RX Interrupt moderation mechanisms >>>>>> 2. RSS >>>>>> 3. Full-featured multi-queue (use of multiqueued network backend) >>>>>> >>>>>> Also, there will be some code refactoring and performance >>>>>> optimization efforts. >>>>>> >>>>>> This series was tested on Linux (Fedora 22) and Windows (2012R2) >>>>>> guests, using Iperf, with TX/RX and TCP/UDP streams, and various >>>>>> packet sizes. >>>>>> >>>>>> More thorough testing, including data streams with different MTU >>>>>> sizes, and Microsoft Certification (HLK) tests, are pending missing >>>>>> features' development. >>>>>> >>>>>> See commit messages (esp. "net: Introduce e1000e device emulation") >>>>>> for more information about the development approaches and the >>>>>> architecture options chosen for this device. >>>>>> >>>>>> This series is based upon v2.3.0 tag of the upstream QEMU repository, >>>>>> and it will be rebased to latest before the final submission. >>>>>> >>>>>> Please share your thoughts - any feedback is highly welcomed :) >>>>>> >>>>>> Best Regards, >>>>>> Dmitry Fleytman. >>>>> Thanks for the series. Will go through this in next few days. >>>> >>>> Have a quick glance at the series, got the following questions: >>>> >>>> - Though e1000e differs from e1000 in many places, I still see lots of >>>> code duplications. We need consider to reuse e1000.c (or at least part >>>> of). I believe we don't want to fix a bug twice in two places in the >>>> future and I expect hundreds of lines could be saved through this way. >>> >>> That’s a good question :) >>> >>> This is how we started, we had a common “core” code base meant to >>> implement all common logic (this split is still present in the patches >>> - there are e1000e_core.c and e1000e.c files). >>> Unfortunately at some point it turned out that there are more >>> differences that commons. We noticed that the code becomes filled with >>> many minor differences handling. >>> This also made the code base more complicated and harder to follow. >>> >>> So at some point of time it was decided to split the code base and >>> revert all changes done to the e1000 device (except a few >>> fixes/improvements Leonid submitted a few days ago). >>> >>> Although there was common code between devices, total SLOC of e1000 >>> and e1000e devices became smaller after the split. >>> >>> Amount of code that may be shared between devices will be even smaller >>> after we complete the implementation which still misses a few features >>> (see cover letter) that will change many things. >>> >>> Still after the device implementation is done, we plan to review code >>> similarities again to see if there are possibilities for code sharing. >> >> I see, but if we can try to re-use or unify the codes from beginning, it >> would be a little bit easier. Looks like the differences were mainly: >> >> 1) MSI-X support >> 2) offloading support through virtio-net header >> 3) trace points >> 4) other new functions through e1000e specific registers >> >> So we could first unify the code through implementing the support of 2 >> and 3 for e1000. For MSI-X and other e1000e specific new functions, it >> could be done through: >> >> 1) model specific callbacks, e.g realize, transmission and reception >> 2) A new register flags e.g PHY_RW_E1000E which means the register is >> for e1000e only. Or even model specific wirteops and readops >> 3) For other subtle differences, it could be done in the code by >> checking the model explicitly. >> >> What's your opinion? (A good example of code sharing is freebsd's e1000 >> driver which covers both 8254x and 8257x). > > > Hi Jason, > > This is exactly how we started. > > Issues that made us split the code base were following: > > 1. The majority of registers are different. Even same registers in > many cases have different bits and require different processing logic. > This introduced too much if’s into the code;
Then we can probably have different writeops and readops. This way, we can at least save the codes of common registers. > 2. The data path is totally different not just because of virtio > headers but also because these devices use different descriptor > layouts and require different logic in order to parse and fill those. > There are legacy descriptors that look almost the same but of course > we must support all descriptor types described by spec; Yes, but looks like the only extend rx/tx descriptors is 8257x specific. And 8257x fully supports both legacy and context descriptor of 8254x. This give us another chance to reuse the code. > 3. Interrupt handling logic is different because of MSI support; Right, but this is not hard to address, probably a new helper. > 4. Mutli-queue and RSS make the code even less similar to the old device; Yes, this could be in 8275x specific file. > 5. Changes required to isolate shared code base required changes in > migration tables and fishy tricks to preserve compatibility with > current implementation; Since 8257x is a totally new device, it can has its own vmstate if it's simpler to be implemented and we don't even need to care migration compatibility. > 6. e1000 code suffered from massive changes which are very hard to > verify because spec is big and there are no drivers that use all > device features. Then we can try to change e1000 as mini as possible. E.g just factor out the common logic to helpers to reuse it in e1000e. > > Overall, code for handling differences in device behaviours was bigger > and more complicated then the device logic itself. The difference is > not subtle when it comes to the full featured device implementation. > As for FreeBSD’s driver, I’m not deeply familiar with its code but I > suspect it works with device in legacy mode which is pretty similar to > an old device indeed. Since we must support all modes our situation is > different. Yes, it does not use extended descriptor format. > > Again, I’m totally into shared code and would like to have some common > code base anyway. Currently it looks like the best way to achieve this > is to finish with all device features and then see what parts of logic > may be shared between the old and the new devices. It’s better to have > slight code duplication and smaller shared code base than to have > bloated and tricky shared code that will introduce its own problems to > both devices. The code duplication is not slight in this rfc :). So the code has the possibility to be unified. But I'm ok to evaluate this after all features were developed. Thanks > > Best Regards, > Dmitry > >> >>> >>>> - For e1000e it self, since it was a new device, so no need to care >>>> about compatibility stuffs (e.g auto negotiation and mit). We can just >>>> enable them forever. >>> >>> Yes, we have this in plans. >>> >>>> - And for the generic packet abstraction layer, what's the >>>> advantages of >>>> this? If it has lot, maybe we can use it in other nic model (e.g >>>> virtio-net)? >>> >>> These abstractions were initially developed by me as a part of vmxnet3 >>> device to be generic and re-usable. Their main advantage is support >>> for virtio headers for virtio-enabled backends and emulation of >>> network offloads in software for backends that do not support virtio. >>> >>> Of course they may be re-used by virtio, however I’m not sure if it >>> will be really useful because virtio has feature negotiation >>> facilities and do not require SW emulation for network task offloads. >>> >>> For other devices they are useful because each and every device that >>> requires SW offloads implementation need to do exactly the same things >>> and it doesn’t make sense to have a few implementations for this. >>> >>> Best Regards, >>> Dmitry >> >> Ok, thanks for the explanation. >> >>> >>>> >>>> Thanks >>>> >>>>> >>>>> Since 2.5 is in soft freeze, this looks a 2.6 material. >