Hello, Alexey.

On Thu, Jul 27, 2006 at 08:33:35PM +0400, Alexey Kuznetsov ([EMAIL PROTECTED]) wrote:
> First, it was stated that suggested implementation performs better and even
> much better. I am asking why do we see such improvement?
> I am absolutely not satisfied with statement "It is better. Period."
> From all that I see, this particular implementation does not implement
> optimizations suggested by VJ, it implements only the things,
> which are not supposed to affect performance or to affect it negatively.

Just for clarification: I showed that even using the _existing_ stack
(using sk_backlog_rcv), performance in process context can exceed that of
two-level processing. And after creating my own TCP implementation
(which does not include two-level related overhead, among other things),
the performance difference was even higher. I can agree that in the second
case part of the gain may come from the new TCP implementation rather than
100% from process context, but in the first case the existing socket code
was used.

> > userspace), no dequeue lock is required.
> 
> And that was a part of the second question.
> 
> I do not see, how single threaded TCP is possible. In receiver path
> it has to ack with quite strict time bounds, to delack etc., in sender path
> it has to slow start, I am even not saying about "slow path" things:
> retransmit, probing window, lingering without process context etc.
> It looks like, VJ implies the protocol must be changed. We can't, we mustn't.
> 
> After we deidealize this idealization and recognize that some "slow path"
> should exist and some part of this "slow path" has to be executed
> with higher priority than the "fast" one, where do we arrive?
> Is not it exactly what we have right now? Clean fast path, separate slow path.
> Not good enough? Where? Let's find and fix this.

Slow path does exist; retransmits and friends are there too in the new stack.
And my initial netchannel implementation used the _existing_ socket code
from process context. Again, there is no need to create two levels
between fast and slow or softirq and process, and it was proven and
shown that it can perform faster.

Why don't you want to see that the existing model is just path enlargement:
there might also be delays between hard and soft irqs, so acks will
be delayed and so on... But the stack works without problems even if some
kernel thread takes 100% cpu (with preemption) and there are very big
delays in ack generation, while userspace is not able to get that
data. With netchannels it is essentially the same (heh, I have said that
already a lot of times).

> Alexey

-- 
        Evgeniy Polyakov