On 12/26/2017 10:22 PM, David Miller wrote:
> From: Joao Martins <joao.m.mart...@oracle.com>
> Date: Thu, 21 Dec 2017 17:24:28 +0000
> 
>> Commit eb1723a29b9a ("xen-netback: refactor guest rx") refactored Rx
>> handling and as a result decreased the max grant copy ops from 4352 to
>> 64. Before this commit it would drain the rx_queue (while there were
>> enough slots in the ring to put packets), then copy to all pages and
>> write the responses on the ring. With the refactor we do almost the
>> same, except that the last two steps are done every COPY_BATCH_SIZE
>> (64) copies.
>>
>> For big packets, a value of 64 means copying 3 packets in the best case
>> (17 copies) and only 1 packet in the worst case (34 copies, i.e. when
>> the head plus every frag crosses a 4k grant boundary), which can be the
>> case when packets come from a local backend process.
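
(Spelling those numbers out: a batch of 64 copy ops holds
floor(64 / 17) = 3 best-case packets but only floor(64 / 34) = 1
worst-case packet. The batched flow described above is roughly the
following; all names here are illustrative, not the driver's own:)

	/* Sketch only: stage grant copy ops per packet, and do the
	 * copies + write the ring responses every copy_batch_size ops.
	 */
	while (ring_slots_available(queue) && !rx_queue_empty(queue)) {
		stage_copy_ops_for_next_packet(queue);
		if (queue->rx_copy.num == copy_batch_size)
			flush_copies_and_write_responses(queue);
	}
	flush_copies_and_write_responses(queue);	/* leftovers */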
>>
>> Instead of hardcoding it to 64 grant copies, let's allow the user to
>> select the value (while keeping the current one as the default) by
>> introducing the `copy_batch_size` module parameter. This allows users
>> to select larger batches (i.e. for better throughput with big packets),
>> as was possible prior to the above-mentioned commit.
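
(Since the diff itself is not quoted here: the parameter would be
declared along these lines. This is only a sketch that mirrors the
description above, not code copied from the actual patch:)

	#define COPY_BATCH_SIZE 64	/* current default */

	static unsigned int copy_batch_size = COPY_BATCH_SIZE;
	module_param(copy_batch_size, uint, 0644);
	MODULE_PARM_DESC(copy_batch_size,
			 "Maximum number of grant copies on RX");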
>>
>> Signed-off-by: Joao Martins <joao.m.mart...@oracle.com>
>> ---
>> Changes since v1:
>>  * move rx_copy.{idx,op} reallocation to separate helper
>>  Addressed Paul's comments:
>>  * rename xenvif_copy_state#size field to batch_size
>>  * argument `size` should be unsigned int
>>  * vfree is safe with NULL
>>  * realloc rx_copy.{idx,op} after copy op flush
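
(For reference, a sketch of what the reallocation helper mentioned
above could look like; the exact struct layout and error handling are
assumptions here, not the actual patch:)

	static int xenvif_rx_copy_realloc(struct xenvif_queue *queue,
					  unsigned int batch_size)
	{
		/* vfree() is safe to call with NULL, so the previous
		 * arrays can be freed unconditionally.
		 */
		vfree(queue->rx_copy.op);
		vfree(queue->rx_copy.idx);

		queue->rx_copy.op = vmalloc(batch_size *
					    sizeof(*queue->rx_copy.op));
		queue->rx_copy.idx = vmalloc(batch_size *
					     sizeof(*queue->rx_copy.idx));
		if (!queue->rx_copy.op || !queue->rx_copy.idx) {
			vfree(queue->rx_copy.op);
			vfree(queue->rx_copy.idx);
			queue->rx_copy.op = NULL;
			queue->rx_copy.idx = NULL;
			return -ENOMEM;
		}

		queue->rx_copy.batch_size = batch_size;
		return 0;
	}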
> 
> I truly dislike things of this nature.
> 
> When you give the user a numerical value to set, they have to pick
> something.  This in turn requires deep, weird knowledge of how the
> driver implements RX packet processing.
> 
> That's absolutely unacceptable.  Can you imagine being an admin and
> trying to figure out what random number to plug into this thing?
> 
> "maximum number of grant copies on RX"
> 
> I've been the networking maintainer for more than 2 decades and I
> have no idea whatsoever what kind of value I might want to set
> there.
> 
> Nobody should have to know this other than people working on the
> driver.
> 
Sorry, I didn't consider that.

> Instead, the issue is that the driver can optimize for throughput
> or something else (latency, RX packing, I don't know exactly what
> it is, but you're keeping the default value so it has some merit,
> right?).  Therefore, what you need to export is a boolean which
> is self-describing.
> 
I kept the default because of the recent work to tidy up the netback
rx structures. While the smaller batch gives much less overhead when
running bigger numbers of VMs/interfaces, it limits throughput per
interface (hence the parameter).

> "rx_optimize_throughput"
> 
> That's it.  And you, the smart person who knows what this grant
> copy mumbo jumbo means, can pick a specific value to use for
> high throughput.
>
OK.
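
Something along these lines, perhaps (a rough sketch only: the boolean
selects between the current default and a larger batch internally, and
COPY_BATCH_SIZE_LARGE is a placeholder name holding the pre-refactor
value, not a tuned number):

	static bool rx_optimize_throughput;
	module_param(rx_optimize_throughput, bool, 0644);
	MODULE_PARM_DESC(rx_optimize_throughput,
			 "Optimize RX processing for throughput");

	/* Placeholder: the max used before commit eb1723a29b9a. */
	#define COPY_BATCH_SIZE_LARGE	4352

	static unsigned int xenvif_rx_batch_size(void)
	{
		return rx_optimize_throughput ?
			COPY_BATCH_SIZE_LARGE : COPY_BATCH_SIZE;
	}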

Joao
