On Thu, Jun 22, 2006 at 03:31:22PM -0400, jamal wrote:
> 
> Your gut feeling is for #1 and my worry is for #2 ;->
> I actually think your change is clearly valuable for scenarios where
> the bus is slower and transmits therefore take longer - my feeling is
> that it may not be beneficial for fast buses like PCI-E or high-speed
> PCI-X, where the chance of a TX lock collision is lower.

Sure.  However, I still don't see the point of transmitting in parallel
even there.  The reason is that there is no work being done here by the
CPU between dequeueing the packet and obtaining the TX lock.  As such
the cost of doing it in parallel is going to be dominated by the cache
bouncing.

Obviously it is a little different for lockless drivers where we do
dev_queue_xmit_nit (and now GSO) without any locks.  However, you don't
want parallelism there because it breaks packet ordering.

> The other reason I mentioned earlier as justification for leaving the
> granularity where it was is good qos clocking, i.e. allowing incoming
> packets to be used to clock the tx path - otherwise you will depend
> on HZ for your egress rate accuracy. I am not sure if this latter
> point made sense - I could elaborate.

I don't understand where HZ comes in.  If you find that qdisc_run is
already running, then the packet you've just queued will most likely
be processed by that qdisc_run immediately unless the device is full.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
