Sorry for delayed response. From: Laszlo Ersek <ler...@redhat.com> Subject: Re: [grub PATCH] efinet: disable MNP background polling Date: Thu, 1 Oct 2015 13:50:31 +0200
> CC'ing Mark Salter, and edk2-devel, also updating the subject slightly > for better context. > > On 10/01/15 11:26, HATAYAMA Daisuke wrote: >> Currently, as of the commit f348aee7b33dd85e7da62b497a96a7319a0bf9dd, >> SNP is exclusively reopened to avoid slowdown or complete failure to >> load files due to races between MNP background polling and grub's >> receiving and transmitting packets. >> >> This exclusive SNP reopen affects some EFI applications/drivers that >> use SNP and causes PXE boot on such systems to fail. Hardware filter >> issue fixed by the commit 49426e9fd2e562c73a4f1206f32eff9e424a1a73 is >> one example. > > I think the above two commit references are in inverse order. That is, > commit 49426e9f is the one responsible for the (needed) exclusive open, > and commit f348aee7 fixes up the former by reconfiguring the receive > filters. > > In other words, the first two paragraphs here seem accurate, just please > reorder the commit hashes. > >> The difficulty here is that effects of the exclusive SNP reopen >> differs from system to system depending their UEFI firmware >> implementations. One system works well with the commit >> f348aee7b33dd85e7da62b497a96a7319a0bf9dd only, another system still >> needs the commit 49426e9fd2e562c73a4f1206f32eff9e424a1a73, and there >> could be other systems where PXE boot still fails. > > Here too I think the commit hashes should be switched around. > Sorry, I'll fix this in next path if I mention them. >> >> Actually, PXE boot fails on Fujitsu PRIMEQUEST 2000 series now. >> >> Thus, it seems to me that the exclusive SNP reopen makes grub >> maintenance difficult by affecting a variety of systems differently. > > (Admittedly, this is going to be a bit of speculation:) I think the > difference in behavior is not due to SNP implementations, but due to > differences in higher level protocol implementations -- i.e., the > behavior depends on what those drivers do to the underlying SNP that are > *closed*. > > According to the spec of OpenProtocol(), in case of a successful > exclusive open, the other BY_DRIVER reference is kicked off with > DisconnectController(), which in turn calls the other driver's > EFI_DRIVER_BINDING_PROTOCOL.Stop() function. > > So, the question is what that *other* (higher level) driver does in its > Stop() function, when it unbinds the handle with the underlying SNP > interface. Does it mess with SNP's receive filters? And so on. > >> >> Instead, the idea of this patch is to disable MNP background polling >> temporarily while grub uses SNP. Then, the race issue must be removed. > > This is not the right approach in my opinion. > > The original problem is that GRUB's direct access to SNP races with > *several* MNP instances to the same SNP. The MNP instances are correctly > arbitrated between each other (see more on this later), but the SNP > accesses from GRUB are not. > > SNP is meant as an exclusive-access interface to the NIC, so those > parallel clients won't work. > > One way to solve that is to kick out those other SNP accessors, by way > of exclusive open. This is correct from a UEFI driver model perspective, > but -- unless GRUB does full reconfiguration on the SNP level -- brittle > in practice, as you've experienced; apparently different implementations > of higher level protocols leave the SNP in different states when they go > away. (It's perfectly possible that some of those driver binding Stop() > functions have bugs.) > > However, the other way (because there is another way, yes) is different > from interfering with existing MNP instances (note: plural). MNP (and a > bunch of other networking related protocols) don't work like your usual > UEFI device drivers; they are child protocols (= protocol interfaces on > child handles) that are created *as needed* with the help of Service > Binding protocol instances. > > Please read the following sections of the UEFI spec (v2.5) carefully: > > - 2.5.8 EFI Services Binding > - 10.6 EFI Service Binding Protocol > - 24.1 EFI Managed Network Protocol > EFI_MANAGED_NETWORK_SERVICE_BINDING_PROTOCOL > >> This chapter defines the EFI Managed Network Protocol. It is split >> into the following two main sections: >> >> * Managed Network Service Binding Protocol (MNSBP) >> * Managed Network Protocol (MNP) >> >> The MNP provides raw (unformatted) asynchronous network packet I/O >> services. These services make it possible for multiple-event-driven >> drivers and applications to access and use the system network >> interfaces at the same time. > > The idea is that exactly one MNP *Service Binding* protocol sits on top > of SNP. Then this protocol can be used by several clients to create > client-private MNP instances -- i.e. to bind the networking *service*, > not the hardware controller handle. Those MNP instances can be then used > in parallel by different clients, and the MNPSB driver will > automatically arbitrate the exclusive SNP access between them. > > In other words, > - think of the SNP as an exclusive resource, > - think of MNPSB as a *proxy* (or multiplex) that *owns* SNP, > - and if you need async packet access, go to the MNPSB, and ask it to > give you your dedicated MNP instance, > - send and receive using your private MNP instance, > - MNPSB will nicely serialize / arbitrate that against other (= > sibling) MNP instances. > > And now it should be clear why this patch is not correct -- looking up > an existent MNP instance (or any other protocol instance that was > created with a Service Binding protocol's CreateChild() function) is > *never* correct. Such a protocol instance always belongs to *another* > client (unless you happen to find the MNP instance your own self created > earlier), and you have no business messing with that. > > Therefore, if the current forced-exclusive SNP access doesn't work > reliably in GRUB, then I can see only the following method: > > - identify NIC handles the same way you do now (I guess by enumerating > SNP interfaces) > - for each handle found, look for MNPSB *first*. > - If there is no MNPSB, then stick with the current strategy -- open > the SNP with exclusive access > - However, if there *is* an MNPSB, call the CreateChild() function to > extract a new MNP instance > - Use this MNP instance to send and receive packets. This MNP instance > will peacefully coexist with other (sibling) MNP clients. > > Here's an example. Boot OVMF, with a virtio-net-pci device enabled in > QEMU, and make sure that the NICs ROM BAR does not contain the iPXE > driver (-device virtio-net-pci,rombar=0). This will allow OVMF's builtin > VirtioNetDxe to bind the device (which I want to use now for > illustrating the question). Then, enter the UEFI shell, and issue the > following command: > >> Shell> dh -d -v -p SimpleNetwork >> A0: UnknownDevice >> LoadFile >> PXEBaseCode >> TCPv4ServiceBinding >> MTFTPv4ServiceBinding >> DHCPv4ServiceBinding >> UDPv4ServiceBinding >> IPv4Config2 >> IPv4ServiceBinding >> ARPServiceBinding >> UnknownDevice >> ManagedNetworkServiceBinding >> VlanConfig >> DevicePath PciRoot(0x0)/Pci(0x3,0x0)/MAC(52540036A382,0x1) >> SimpleNetwork >> >> Controller Name : Virtio Network Device >> Device Path : PciRoot(0x0)/Pci(0x3,0x0)/MAC(52540036A382,0x1) >> Controller Type : BUS >> Configuration : NO >> Diagnostics : NO >> Managed by : >> Drv[6F] : MNP Network Service Driver >> Drv[70] : VLAN Configuration Driver >> Drv[73] : IP4 Network Service Driver >> Parent Controllers : >> Parent[8C] : Virtio Network Device >> Child Controllers : >> Child[A2] : MNP (MAC=52-54-00-36-A3-82, ProtocolType=0x806, >> VlanId=0) >> Child[A3] : MNP (MAC=52-54-00-36-A3-82, ProtocolType=0x800, >> VlanId=0) >> Child[A1] : >> PciRoot(0x0)/Pci(0x3,0x0)/MAC(52540036A382,0x1)/VenHw(D79DF6B0-EF44-43BD-9797-43E93BCF5FA8) >> Child[A4] : >> PciRoot(0x0)/Pci(0x3,0x0)/MAC(52540036A382,0x1)/VenHw(9FB1A1F3-3B71-4324-B39A-745CBB015FFF) > > There is only one handle with an SNP instance on it in the system > (because I configured only one virtio-net NIC for the VM). The > underlying hardware controller handle (marked as A0 above) has a bunch > of other protocol interfaces installed on it. Note the many > ___ServiceBinding protocols! > > ( > For example, the UEFI spec says about UDPv4ServiceBinding: > > The EFI UDPv4 Service Binding Protocol is used to locate > communication devices that are supported by an EFI UDPv4 Protocol > driver and to create and destroy instances of the EFI UDPv4 Protocol > child protocol driver that can use the underlying communications > device. > > Multiple consumers could call UDPv4ServiceBinding.CreateChild(), and > then use their private UDPv4 protocol interfaces to transmit and receive > in parallel. > ) > > Also, note that there is *no* MNP instance directly installed on the > handle. > > Anyway, among other things, this handle is open by "MNP Network Service > Driver" (which directly provides MNPSB), marked as [6F]. And, > apparently, there have been two requests to MNPSB for children: see [A2] > and [A3] above. > > We can investigate those as well: > >> Shell> dh -d -v A2 >> A2: 7EF49B58 >> ManagedNetwork >> Controller Name : MNP (MAC=52-54-00-36-A3-82, ProtocolType=0x806, >> VlanId=0) >> Device Path : <None> >> Controller Type : BUS >> Configuration : NO >> Diagnostics : NO >> Managed by : >> Drv[71] : ARP Network Service Driver >> Parent Controllers : >> Parent[A0] : Virtio Network Device >> Child Controllers : >> Child[AB] : PXE Controller > > The first MNP child has been extracted / requested, from MNPSB, by the > ARP (SB) driver. > >> Shell> dh -d -v A3 >> A3: 7EF56AD8 >> ManagedNetwork >> Controller Name : MNP (MAC=52-54-00-36-A3-82, ProtocolType=0x800, >> VlanId=0) >> Device Path : <None> >> Controller Type : BUS >> Configuration : NO >> Diagnostics : NO >> Managed by : >> Drv[73] : IP4 Network Service Driver >> Parent Controllers : >> Parent[A0] : Virtio Network Device >> Child Controllers : >> Child[A5] : IPv4 (SrcIP=0.0.0.0) >> Child[A6] : IPv4 (SrcIP=0.0.0.0) >> Child[A8] : IPv4 (Not started) >> Child[AA] : IPv4 (SrcIP=0.0.0.0) >> Child[AD] : PXE Controller >> Child[AE] : IPv4 (Not started) >> Child[B1] : IPv4 (Not started) >> Child[B3] : IPv4 (Not started) >> Child[B7] : IPv4 (Not started) > > The other MNP child has been extracted / requested, from MNPSB, by the > IPv4 (SB) driver. > > Now, assuming GRUB wants Ethernet packet level access, this is exactly > what GRUB should do too: locate MNPSB, and ask it for another MNP > instance. That MNP instance will be dedicated to GRUB, and the MNP > Network Service Driver will multiplex it with other users (ARP Network > Service Driver, IP4 Network Service Driver). > > Alternatively, GRUB could look for higher level service binding > protocols as well, on EFI systems (see the list near A0 above), but I > doubt that would fit well into GRUB's net framework (which is supposed > to run on "legacy" BIOS systems too). > > To summarize: > - identify the level / abstraction of the network protocol that GRUB > needs, > - assuming it is "ethernet packet", look for MNPSB first, and if it's > there, call it to get a private-use MNP instance, in order to transmit > and receive, > - if MNPSB is not there, open SNP in exclusive mode, same as now. > > Or else, > - stick with the current exclusive SNP reopen, but make sure that all > aspects are reconfigured from the ground up. > > ... I'm not a GRUB contributor, and ideas are cheap :), so I'll leave it > to you how much importance you give to the above. But, since you CC'd me > on the patch, I thought I'd offer an opinion. > > Thanks > Laszlo > Thanks for detailed explanation. I'm beginner for UEFI field so this is very helpful to understand the details around this issue. Now I think I need to do this work and I want sample codes that uses NMP and MNPSB protocols if exists. Could you tell me such ones if you know? In EDK2 source code? -- Thanks. HATAYAMA, Daisuke _______________________________________________ Grub-devel mailing list Grub-devel@gnu.org https://lists.gnu.org/mailman/listinfo/grub-devel