On Tue, 2021-01-19 at 18:29 -0800, Jakub Kicinski wrote:
> On Wed, 20 Jan 2021 02:13:36 +0000 Venkataramanan, Anirudh wrote:
> > > > As per the current logic, if the driver does not get the number
> > > > of MSI-X vectors it needs, it will immediately drop to "Do I
> > > > have at least two (ICE_MIN_LAN_VECS) MSI-X vectors?". If yes,
> > > > the driver will enable a single Tx/Rx traffic queue pair, bound
> > > > to one of the two MSI-X vectors.
> > > >
> > > > This is a bit of an all-or-nothing type approach. There's a
> > > > mid-ground that can allow more queues to be enabled (ex. driver
> > > > asked for 300 vectors, but got 68 vectors, so enabled 64 data
> > > > queues) and this patch implements the mid-ground logic.
> > > >
> > > > This mid-ground logic can also be implemented based on the
> > > > return value of pci_enable_msix_range() but IMHO the
> > > > implementation in this patch using pci_enable_msix_exact() is
> > > > better because it's always only enabling/reserving as many
> > > > MSI-X vectors as required, not more, not less.
> > >
> > > What do you mean by "required" in the last sentence?
> >
> > .. as "required" in that particular iteration of the loop.
> >
> > > The driver requests num_online_cpus()-worth of IRQs, so it must
> > > work with any number of IRQs. Why is num_cpus() / 1,2,4,8
> > > "required"?
> >
> > Let me back up a bit here.
> >
> > Ultimately, the issue we are trying to solve here is "what happens
> > when the driver doesn't get as many MSI-X vectors as it needs, and
> > how it's interpreted by the end user".
> >
> > Let's say there are these two systems, each with 256 cores, but the
> > response to pci_enable_msix_range() is different:
> >
> > System 1: 256 cores, pci_enable_msix_range returns 75 vectors
> > System 2: 256 cores, pci_enable_msix_range returns 220 vectors
> >
> > In this case, the number of queues the user would see enabled on
> > each of these systems would be very different (73 on system 1 and
> > 218 on system 2). This variability makes it difficult to define
> > what the expected behavior should be, because it's not exactly
> > obvious to the user how many free MSI-X vectors a given system has.
> > Instead, if the driver reduced its demand for vectors in a
> > well-defined manner (num_cpus() / 1,2,4,8), the user-visible
> > difference between the two systems wouldn't be so drastic.
> >
> > If this is plain wrong or if there's a preferred approach, I'd be
> > happy to discuss further.
>
> Let's stick to the standard Linux way of handling IRQ exhaustion, and
> rely on pci_enable_msix_range() to pick the number. If the current
> behavior of pci_enable_msix_range() is not what users want, we can
> change it. Each driver creating its own heuristic is the worst of all
> choices, as most brownfield deployments will have a mix of NICs.
Okay, we will rework the fallback improvement logic. Just so you are aware, we will also be posting a couple of bug-fix patches addressing issues in the current fallback logic.
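For reference, the num_cpus() / 1,2,4,8 fallback discussed above could be sketched roughly as below. This is a hypothetical userspace simulation, not the actual ice patch: sim_msix_exact() merely stands in for pci_enable_msix_exact()'s all-or-nothing reservation semantics, and the floor of 2 mirrors an ICE_MIN_LAN_VECS-style minimum.

```c
#include <assert.h>

/*
 * Hypothetical simulation of the halving fallback: request exactly
 * num_cpus vectors, then num_cpus/2, /4, /8, ... until a request
 * succeeds, so each step reserves no more than it asks for.
 */

static int avail_vectors; /* vectors the (simulated) platform can grant */

/* Stand-in for pci_enable_msix_exact(): succeeds only if nvec fits. */
static int sim_msix_exact(int nvec)
{
	return nvec <= avail_vectors ? 0 : -1; /* -ENOSPC in the kernel */
}

/* Returns the vector count settled on, or -1 if even 2 is unavailable. */
static int halving_fallback(int num_cpus)
{
	int want;

	for (want = num_cpus; want >= 2; want /= 2)
		if (sim_msix_exact(want) == 0)
			return want;
	return -1; /* below the ICE_MIN_LAN_VECS-like floor */
}
```

On the two example systems above (256 cores with 75 vs. 220 free vectors), this loop settles on 64 and 128 vectors respectively, a narrower and more predictable spread than 73 vs. 218 — which is the trade-off Jakub's reply argues should instead be handled generically by pci_enable_msix_range().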