On Mon, Mar 28, 2016 at 06:45:26PM -0700, Mohammad El-Shabani wrote:
> Hi,
> Looking into why it hurts performance, I see that ixgbe_dev_rx_queue_count
> is implemented as a scan of elements of rx descriptors, which is very
> expensive. I am wondering why it's implemented the way it is. Could it not
> just read the head location from the driver?
>
> Thanks!
> Mohammad El-Shabani

It's likely that reading the head location from the hardware will be even
slower than scanning the descriptor ring in memory. PCI accesses are much
slower than memory accesses, especially since on platforms with DDIO many of
those memory accesses will actually be cache reads.

That being said, I haven't actually written a test to prove this out, so feel
free to try the head pointer read method instead and see if it improves
things. The results may vary depending on how far ahead the scan needs to go,
but certainly for the empty ring case the descriptor scan will be far faster
than a head read.

Regards,
/Bruce
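
P.S. To make the comparison concrete, here is a rough sketch of the two
counting methods. It is not the ixgbe code: every name in it (the sketch_*
structs and functions, SKETCH_DD_BIT) is made up for illustration, the scan
steps one descriptor at a time for simplicity, and the "head register" is
just a plain variable standing in for what would be a read across PCI on
real hardware.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_DD_BIT 0x1u  /* hypothetical "descriptor done" status flag */

/* Hypothetical, simplified RX descriptor: only the status word matters here. */
struct sketch_rx_desc {
    volatile uint32_t status;
};

/* Hypothetical RX ring state, not the real ixgbe queue structure. */
struct sketch_rx_ring {
    struct sketch_rx_desc *desc;  /* descriptor ring in host memory */
    uint16_t nb_desc;             /* number of descriptors in the ring */
    uint16_t tail;                /* next descriptor software will process */
    volatile uint32_t *head_reg;  /* stand-in for the NIC head register */
};

/*
 * Method 1: walk descriptor memory from the software tail and count
 * descriptors whose done bit is set. On an empty ring this costs a single
 * memory (or cache) read.
 */
static uint32_t sketch_count_by_scan(const struct sketch_rx_ring *r)
{
    uint32_t count = 0;
    uint32_t idx = r->tail;

    assert(r->nb_desc > 0);
    while (count < r->nb_desc && (r->desc[idx].status & SKETCH_DD_BIT)) {
        count++;
        idx = (idx + 1) % r->nb_desc;  /* wrap around the ring */
    }
    return count;
}

/*
 * Method 2: read the head index once and compute the distance from the
 * software tail. On real hardware this single read would go to the device,
 * whatever the ring occupancy. Note that head == tail is reported as empty,
 * so this sketch cannot tell a completely full ring from an empty one.
 */
static uint32_t sketch_count_by_head(const struct sketch_rx_ring *r)
{
    uint32_t head = *r->head_reg;

    assert(r->nb_desc > 0);
    return (head + r->nb_desc - r->tail) % r->nb_desc;
}
```

Timing both functions against a real device, for empty and partly filled
rings, would be the test I mentioned above.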