Hi Simon,
> Currently the rx/tx queue is allocated from the buffer pool on socket of:
> - port's socket if --port-numa-config specified
> - or ring-numa-config setting per port
>
> All the above will "bind" the queue to a single socket per port
> configuration. But it can actually achieve better performance if one
> port's queues can be spread across multiple NUMA nodes, with each rx/tx
> queue allocated on the socket of the lcpu that serves it.
>
> With this patch, testpmd can utilize the PCI-e bus bandwidth on other
> NUMA nodes. With 64-byte packets, when running on PowerPC with a
> Mellanox CX-4 card, single port (100G), with 8 cores, fwd mode:
> - Without this patch: 52.5Mpps throughput
> - With this patch: 66Mpps throughput
>
> Signed-off-by: Simon Guo <wei.guo.si...@gmail.com>
>
> diff --git a/app/test-pmd/parameters.c b/app/test-pmd/parameters.c
> index fbe6284..d02059d 100644
> --- a/app/test-pmd/parameters.c
> +++ b/app/test-pmd/parameters.c
> @@ -130,6 +130,11 @@
>  "(flag: 1 for RX; 2 for TX; 3 for RX and TX).\n");
>  printf(" --socket-num=N: set socket from which all memory is allocated "
>  "in NUMA mode.\n");
> + printf(" --ring-bind-lcpu: "
> + "specify TX/RX rings will be allocated on local socket of lcpu."
> + "It will overrridden ring-numa-config or port-numa-config if success."
> + "If local ring buffer is unavailable, it will use --ring-numa-config or port-numa-config instead."
> + "It allows one port binds to multiple NUMA nodes.\n");

I think it's a good patch for giving the app an example of how to choose
an appropriate core. I just have one concern about the priority: maybe
ring-numa-config and port-numa-config should take precedence, because if
the app assigned a specific socket on purpose, it is not good to
overwrite that silently. See the sketch at the end of this mail.

>  printf(" --mbuf-size=N: set the data size of mbuf to N bytes.\n");
>  printf(" --total-num-mbufs=N: set the number of mbufs to be allocated "
>  "in mbuf pools.\n");
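To make the priority suggestion concrete, here is a minimal sketch.
The ring_numa[] array, the ring_bind_lcpu flag, MAX_PORTS and the
NUMA_NO_CONFIG marker are only stand-ins for whatever the patch
actually uses; rte_lcore_to_socket_id() is the existing lcore API.

#include <stdbool.h>
#include <stdint.h>
#include <rte_lcore.h>		/* rte_lcore_to_socket_id() */

#define MAX_PORTS      32	/* stand-in for RTE_MAX_ETHPORTS */
#define NUMA_NO_CONFIG 0xFF	/* "user did not configure a socket" */

/* Per-port socket from --ring-numa-config/--port-numa-config. */
static uint8_t ring_numa[MAX_PORTS];
/* The new --ring-bind-lcpu flag. */
static bool ring_bind_lcpu;

static unsigned int
ring_socket_id(unsigned int port_id, unsigned int lcore_id)
{
	/* A socket the user configured explicitly must win... */
	if (ring_numa[port_id] != NUMA_NO_CONFIG)
		return ring_numa[port_id];

	/* ...and binding to the lcpu's socket is only the fallback. */
	if (ring_bind_lcpu)
		return rte_lcore_to_socket_id(lcore_id);

	return 0; /* default socket */
}

That way --ring-bind-lcpu improves the default behaviour without
silently discarding an explicit per-port or per-ring setting.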