On Thu, May 01, 2025 at 03:56:48PM +0000, Michael Kelley wrote:
> From: Shradha Gupta <shradhagu...@linux.microsoft.com> Sent: Thursday, May 1, 2025 7:24 AM
> >
> > On Thu, May 01, 2025 at 05:27:49AM +0000, Michael Kelley wrote:
> > > From: Shradha Gupta <shradhagu...@linux.microsoft.com> Sent: Friday, April 25, 2025 3:55 AM
> > > >
> > > > Currently, the MANA driver allocates MSI-X vectors statically based on
> > > > MANA_MAX_NUM_QUEUES and num_online_cpus() values, and in some cases ends
> > > > up allocating more vectors than it needs. This is because, at that point,
> > > > we do not yet have a HW channel and do not know how many IRQs should be
> > > > allocated.
> > > >
> > > > To avoid this, we allocate 1 MSI-X vector during the creation of the HWC
> > > > and, after getting the value supported by the hardware, dynamically add
> > > > the remaining MSI-X vectors.
> > >
> > > I have a top-level thought about the data structures used to manage a
> > > dynamic number of MSI-X vectors. The current code allocates a fixed-size
> > > array of struct gdma_irq_context, with one entry in the array for each
> > > MSI-X vector. To find the entry for a particular msi_index, the code can
> > > just index into the array, which is nice and simple.
> > >
> > > The new code uses a linked list of struct gdma_irq_context entries, with
> > > one entry in the list for each MSI-X vector. In the dynamic case, you can
> > > start with one entry in the list, and then add to the list however many
> > > additional entries the hardware will support.
> > >
> > > But this additional linked list adds significant complexity to the code,
> > > because it must be linearly searched to find the entry for a particular
> > > msi_index, and there's the messiness of putting entries onto the list
> > > and taking them off. A spin lock is required. Etc., etc.
> > >
> > > Here's an intermediate approach that would be simpler. Allocate a
> > > fixed-size array of pointers to struct gdma_irq_context. The fixed size
> > > is the maximum number of possible MSI-X vectors for the device, which I
> > > think is MANA_MAX_NUM_QUEUES, or 64 (correct me if I'm wrong about that).
> > > Allocate a new struct gdma_irq_context when needed, but store the address
> > > in the array rather than adding it onto a list. Code can then directly
> > > index into the array to access the entry.
> > >
> > > Some entries in the array will be unused (and "wasted") if the device
> > > uses fewer MSI-X vectors, but each unused entry is only 8 bytes. The
> > > maximum space unused is fewer than 512 bytes (assuming 64 entries in
> > > the array), which is negligible in the grand scheme of things. With the
> > > simpler code, and without the additional list entry embedded in each
> > > struct gdma_irq_context, you'll get some of that space back anyway.
> > >
> > > Maybe there's a reason for the list that I missed in my initial
> > > review of the code. But if not, it sure seems like the code could
> > > be simpler, and having some unused 8-byte entries in the array
> > > is worth the tradeoff for the simplicity.
> > >
> > > Michael
> >
> > Hey Michael,
> >
> > Thanks for your inputs. We did think of this approach, and in fact that
> > is how this patch was implemented (with a fixed-size array) in v1 of our
> > internal reviews.
> >
> > However, it came up in those reviews that we want to move away from
> > 64 (MANA_MAX_NUM_QUEUES) as a hard limit for some new requirements,
> > at least for the dynamic IRQ allocation path.
> > And now, the new limit for all hardening purposes would be
> > num_online_cpus().
> >
> > Using this limit with the fixed-size array approach creates problems,
> > especially on machines with a high number of vCPUs: it would lead to
> > quite a bit of memory/resource wastage.
> >
> > Hence, we decided to go ahead with this design.
> >
> > Regards,
> > Shradha.
>
> One other thought: Did you look at using an xarray? See
> https://www.kernel.org/doc/html/latest/core-api/xarray.html.
> It has most or all of the properties you need to deal with
> a variable number of entries, while handling all the locking
> automatically. Entries can be accessed with just a simple
> index value.
>
> I don't have first-hand experience writing code using xarrays,
> so I can't be sure that it would simplify things for MANA IRQ
> allocation, but it seems to be a very appropriate abstraction
> for this use case.
>
> Michael
>

Thanks Michael,

This does look promising for our use case.
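At first glance, the lookup path would stay a simple access by msi_index,
with the xarray handling both the locking and the dynamic sizing for us.
Something roughly along these lines (just an untested sketch from my side;
the helper names and the struct contents are placeholders, and in the real
driver the xarray would live in the gdma_context rather than be a global):

#include <linux/xarray.h>
#include <linux/slab.h>

/* Placeholder fields, only to keep the sketch self-contained */
struct gdma_irq_context {
	void (*handler)(void *arg);
	void *arg;
	char name[32];
};

/* Indexed by msi_index; would replace the linked list + spinlock */
static DEFINE_XARRAY(irq_contexts);

static int mana_gd_add_irq_context(unsigned int msi_index)
{
	struct gdma_irq_context *gic;
	int err;

	gic = kzalloc(sizeof(*gic), GFP_KERNEL);
	if (!gic)
		return -ENOMEM;

	/* xa_store() does its own internal locking */
	err = xa_err(xa_store(&irq_contexts, msi_index, gic, GFP_KERNEL));
	if (err)
		kfree(gic);

	return err;
}

static struct gdma_irq_context *mana_gd_find_irq_context(unsigned int msi_index)
{
	/* Direct lookup by index, no linear list walk */
	return xa_load(&irq_contexts, msi_index);
}

static void mana_gd_remove_irq_context(unsigned int msi_index)
{
	kfree(xa_erase(&irq_contexts, msi_index));
}

If that holds up, it would avoid both the fixed MANA_MAX_NUM_QUEUES-sized
array and the list walk under the spinlock in the current version of the
patch.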
I will try it with this patch, update the thread, and then send out the
next version as required.

Regards,
Shradha.