On 10/25/17 4:22 PM, Ferruh Yigit wrote:
On 10/25/2017 1:16 PM, Bruce Richardson wrote:
On Wed, Oct 25, 2017 at 11:11:08AM -0700, Ferruh Yigit wrote:
On 10/23/2017 10:42 AM, Roger B. Melton wrote:
On 10/20/17 3:04 PM, Ferruh Yigit wrote:
On 10/12/2017 10:24 AM, Roger B Melton wrote:
When copying VLAN tags from the RX descriptor to the vlan_tci field
in the mbuf header, igb_rxtx.c:eth_igb_recv_pkts() and
eth_igb_recv_scattered_pkts() both assume that the VLAN tag is always
little endian. While i350, i354 and i350vf VLAN non-loopback
packets are stored little endian, VLAN tags in loopback packets for
those devices are big endian.

For i350, i354 and i350vf VLAN loopback packets, swap the tag when
copying from the RX descriptor to the mbuf header. This ensures
that the mbuf vlan_tci is always in CPU byte order.

Signed-off-by: Roger B Melton <rmel...@cisco.com>
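
For context, the IGB_RXQ_FLAG_LB_BSWAP_VLAN flag tested in the hunk below
has to be set per queue somewhere outside the quoted hunk. A minimal sketch
of how that setup could look; the helper name here is hypothetical, and the
mac type names are assumptions based on DPDK's e1000 base code
(e1000_vfadapt_i350 being the i350 VF type):

/* Sketch only, not the quoted patch: mark Rx queues whose device stores
 * loopback VLAN tags big endian, so the hot path needs one flag test.
 * IGB_RXQ_FLAG_LB_BSWAP_VLAN is taken from the hunk below; the mac type
 * names are assumptions from the e1000 base code. */
static void
igb_rxq_set_lb_bswap_flag(struct igb_rx_queue *rxq, enum e1000_mac_type type)
{
	if (type == e1000_i350 || type == e1000_i354 ||
	    type == e1000_vfadapt_i350)
		rxq->flags |= IGB_RXQ_FLAG_LB_BSWAP_VLAN;
}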
<...>

@@ -946,9 +954,16 @@ eth_igb_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
 		rxm->hash.rss = rxd.wb.lower.hi_dword.rss;
 		hlen_type_rss = rte_le_to_cpu_32(rxd.wb.lower.lo_dword.data);
-		/* Only valid if PKT_RX_VLAN_PKT set in pkt_flags */
-		rxm->vlan_tci = rte_le_to_cpu_16(rxd.wb.upper.vlan);
-
+		/*
+		 * The vlan_tci field is only valid when PKT_RX_VLAN_PKT is
+		 * set in the pkt_flags field and must be in CPU byte order.
+		 */
+		if ((staterr & rte_cpu_to_le_32(E1000_RXDEXT_STATERR_LB)) &&
+			(rxq->flags & IGB_RXQ_FLAG_LB_BSWAP_VLAN)) {
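
(The quoted hunk is cut off here, mid-branch. A hedged sketch of how it
presumably completes, using the rte_be_to_cpu_16()/rte_le_to_cpu_16()
helpers from rte_byteorder.h, where the else path preserves the pre-patch
behavior:)

			/* Loopback packet on i350/i354/i350vf: the descriptor
			 * carries the tag big endian, so convert from BE. */
			rxm->vlan_tci = rte_be_to_cpu_16(rxd.wb.upper.vlan);
		} else {
			/* Non-loopback packet: tag is little endian, as before. */
			rxm->vlan_tci = rte_le_to_cpu_16(rxd.wb.upper.vlan);
		}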
This is adding more condition checks into Rx path.
What is the performance cost of this addition?
I have not measured the performance cost, but I can collect data. What
specifically are you looking for?

To be clear, the current implementation is incorrect: it does not
normalize the VLAN tag to CPU byte order before copying it into the
mbuf, and applications have no way to determine whether the tag in the
mbuf is big or little endian.
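
To illustrate (snippet not from the thread; the helper name is hypothetical
and assumes the 17.11-era PKT_RX_VLAN_PKT flag):

#include <stdint.h>
#include <rte_mbuf.h>

/* Illustrative only: extract the VLAN ID from a received mbuf. This is
 * correct for every packet only if the PMD normalizes vlan_tci to CPU
 * byte order; otherwise loopback packets yield a byte-swapped ID. */
static inline uint16_t
pkt_vlan_id(const struct rte_mbuf *m)
{
	if (!(m->ol_flags & PKT_RX_VLAN_PKT))
		return 0;		/* no VLAN tag was reported */
	return m->vlan_tci & 0x0fff;	/* VLAN ID: low 12 bits of the TCI */
}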

Do you have any suggestions for an alternative approach that avoids
the Rx path checks?
No suggestion indeed. And correctness matters.

But this adds a cost, and I wonder how much it is; based on that
result it may be possible to investigate alternative solutions or
trade-offs.

Konstantin, Bruce, Wenzhuo,

What do you think, do you have any comment?

For a 1G driver, is performance really that big an issue?
I don't know. So is this an Ack from you for the patch?

I can tell you that from the perspective of my application the performance impact for 1G is not a concern.

FWIW, I went through a few iterations with Wenzhuo to minimize the performance impact before we settled on this implementation, and Wenzhuo did ack it, btw.

I'm hoping we can get this into 17.11.

Thanks,
-Roger


Unless you
have a *lot* of 1G ports, I would expect most platforms not to notice an
extra couple of cycles when dealing with 1G line rates.

/Bruce
