Hi Adrian,

The send mapping table is an array with a fixed number of elements, say
VRSS_TAB_SIZE. Each entry contains the tx queue number on which a TX packet
should be sent, so vCPU = Send_table[hash_value % VRSS_TAB_SIZE %
number_of_tx_queues] is how the tx queue is chosen. Send_table is updated by
the host every few minutes (on a busy system) or hours (on a lightly loaded
system).
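
In C the lookup would look roughly like this (sketch only; send_table,
hash, num_tx_queues and txq are placeholder names, not actual driver code):

    uint32_t hash = m->m_pkthdr.flowid;     /* or a software hash */
    uint16_t txq;

    /*
     * '%' is left-associative, so the index into the table is
     * (hash % VRSS_TAB_SIZE) % num_tx_queues; the entry the host stored
     * there is the tx queue (and hence vCPU) to use.
     */
    txq = send_table[hash % VRSS_TAB_SIZE % num_tx_queues];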

Since the vNIC doesn't give the guest VM the hash value for an rx packet, I
am thinking I could put the rx queue number in the m_pkthdr.flowid of the
mbuf on the receiving path, so the queue number gets carried over to the
mbuf on the sending path. That way we choose the same queue to send the
packet on, and we don't need to calculate the hash value in software.
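
Something like this (sketch only; rxq_index, num_tx_queues and txq_idx are
placeholder names; curcpu is the usual FreeBSD per-CPU macro):

    /* Receive path: tag the mbuf with the queue it arrived on. */
    m->m_pkthdr.flowid = rxq_index;
    m->m_flags |= M_FLOWID;

    /* Transmit path: if the stack preserved the flowid, send on the
     * matching tx queue; otherwise fall back to the current CPU. */
    if (m->m_flags & M_FLOWID)
            txq_idx = m->m_pkthdr.flowid % num_tx_queues;
    else
            txq_idx = curcpu % num_tx_queues;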

The other way is to calculate the hash value on the send path and choose the
tx queue based on the send table, letting the host decide which queue the
packet is sent on (since the send table is given by the host).

I may implement both and see which one has better performance.

Thanks,
Wei



-----Original Message-----
From: adrian.ch...@gmail.com [mailto:adrian.ch...@gmail.com] On Behalf Of 
Adrian Chadd
Sent: Tuesday, August 12, 2014 2:27 AM
To: Wei Hu
Cc: d...@delphij.net; freebsd-net@freebsd.org
Subject: Re: vRSS support on FreeBSD

On 11 August 2014 02:48, Wei Hu <w...@microsoft.com> wrote:
> CC freebsd-net@ for wider discussion.
>
> Hi Adrian,
>
> Many thanks for the explanation. I checked if_igb.c and found that the
> flowid field is set on the RX side in igb_rxeof():
>
> igb_rxeof()
> {
>         ...
> #ifdef  RSS
>                         /* XXX set flowtype once this works right */
>                         rxr->fmp->m_pkthdr.flowid =
>                             le32toh(cur->wb.lower.hi_dword.rss);
>                         rxr->fmp->m_flags |= M_FLOWID;
>                         ...
> }
>
> I have two questions regarding this.
>
> 1. Is the RSS hash value stored in cur->wb.lower.hi_dword.rss set by the NIC 
> hardware?

Yup.

> 2. So the hash value and m_flags are stored in the mbuf for the received
> packet on the rx side (igb_rxeof()), but we check the hash value and
> m_flags in the mbuf for the outgoing packet on the tx side (in
> igb_mq_start()). Does the kernel re-use the same mbuf for tx? If so, how
> does it know that for the same network stream it should use the same mbuf
> it got from rx when sending? If not, how does the kernel preserve the same
> hash value across the rx mbuf and tx mbuf for the same network stream?
> This seems quite magical to me.

The mbuf flowid/flowtype ends up in the inpcb->inp_flowid /
inpcb->inp_flowtype as part of the TCP receive path.

Then whenever the TCP code outputs an mbuf, it copies the inpcb flow details 
out to outbound mbufs.
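
In rough terms it's something like this (simplified sketch, not the exact
code in the tree):

    /* TCP input: remember the RSS hash of the flow in the inpcb. */
    if (m->m_flags & M_FLOWID)
            inp->inp_flowid = m->m_pkthdr.flowid;

    /* TCP output: stamp each outgoing mbuf with the stored hash so the
     * driver's mq transmit routine can pick the matching queue. */
    m->m_pkthdr.flowid = tp->t_inpcb->inp_flowid;
    m->m_flags |= M_FLOWID;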

>
> For the Hyper-V case, the host controls which vCPU it wants to interrupt,
> and the rule can change dynamically based on the load. For a non-busy VM,
> the host will send most packets to the same vCPU for power-saving
> purposes. For a busy VM, the host will distribute the packets evenly
> across all vCPUs. This means the host could change the RSS bucket mapping
> dynamically. Hyper-V does this by sending a mapping table to the VM
> whenever it needs an update. This also means we cannot use FreeBSD's own
> bucket mapping, which I believe is fixed. Also, Hyper-V uses its own hash
> key. So do you think it is still possible to use the existing RSS
> infrastructure built into FreeBSD for this purpose?

Eventually. Doing rebalancing in RSS is on the TODO list, after I get the rest 
of the basic packet handling / routing done.

How does vRSS notify the VM that the mapping table has changed? What does
the format of it look like?


-a