Hi,

On Fri, Nov 13, 2009 at 10:57:06AM +0800, Da Zheng wrote:
> Experiment 1: sending 100MB of data from the host machine to the Hurd
> with TCP. I only calculate the data transfer rate (i.e., excluding the
> size of packet headers).
>
>                  in-kernel    user-level
> average rate:    2.3MB/s      2.4MB/s
> peak rate:       3.9MB/s      2.6MB/s
>
> Experiment 2: sending 1,000,000 UDP packets of minimum size (60 bytes
> including packet headers) within 30 seconds
>
>                            in-kernel    user-level
> nb of received packets:    46,959       68,174

These results (both kernel and user space driver) are worse than I
expected... I didn't think Mach would be *that* bad :-( I wonder whether
it's also that bad without a VM?...

But at least it shows that putting the actual driver in user space as
well (in addition to the TCP/IP stack) has no negative impact with the
current implementation -- on the contrary, the results are more stable,
which actually suggests better characteristics. Of course it's hard to
tell how the situation would look if we could optimize the overall
performance. However, it seems that Mach IPC (delivering all the packets
from the driver to the TCP/IP stack) is the limiting factor in both
cases, so the picture probably wouldn't change much.

BTW, another interesting test might be sending packets at some constant
rate below the maximum, and comparing drop rates...

> As we see, both have quite bad performance in Experiment 2, but the
> result is still really unexpected. I think it can be explained as
> follows: the in-kernel driver puts received packets in a queue so that
> the software interrupt handler can pick them up and deliver them to
> user space. But if the queue is full, received packets are simply
> dropped. When a large number of packets rush to the Ethernet card in a
> very short time, most of the CPU time is used by the interrupt handler
> to fetch packets from the card, but the handler fails to put them in
> the queue for further delivery. As a result, most of the CPU time is
> wasted.
>
> My user-level driver doesn't put received packets in a queue; instead
> it calls mach_msg() to deliver messages to pfinet directly. It's true
> that the user-level interrupt handler fails to get all packets from
> the card, and most packets are discarded by the card directly. But as
> long as a packet is received by the driver, it is very likely to be
> delivered to pfinet. Thus, not much CPU time is wasted.

Yeah, this explanation sounds quite plausible :-)

-antrik-
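
A rough sketch of the constant-rate test suggested above, on the host
side with plain BSD sockets; the address, port, payload size and rate
are just example values, not anything from the measurements:

    /* Send minimum-size UDP packets at a fixed, paced rate so the drop
       rate of the two drivers can be compared at rates below the
       measured maximum.  All constants here are illustrative.  */
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>
    #include <sys/socket.h>

    int
    main (void)
    {
      const char *target_ip = "192.168.1.2";   /* example Hurd box address */
      const int target_port = 5000;            /* example port */
      const int packets_per_second = 10000;    /* rate below the maximum */
      const int total_packets = 100000;
      char payload[18];                        /* 18 B payload -> 60 B frame
                                                  including Ethernet/IP/UDP headers */

      int sock = socket (AF_INET, SOCK_DGRAM, 0);
      if (sock < 0)
        {
          perror ("socket");
          return 1;
        }

      struct sockaddr_in addr;
      memset (&addr, 0, sizeof addr);
      addr.sin_family = AF_INET;
      addr.sin_port = htons (target_port);
      inet_pton (AF_INET, target_ip, &addr.sin_addr);

      memset (payload, 0, sizeof payload);

      /* Pace the sends so the average rate stays constant instead of
         bursting as fast as the host can transmit.  */
      useconds_t gap = 1000000 / packets_per_second;

      for (int i = 0; i < total_packets; i++)
        {
          if (sendto (sock, payload, sizeof payload, 0,
                      (struct sockaddr *) &addr, sizeof addr) < 0)
            perror ("sendto");
          usleep (gap);
        }

      close (sock);
      return 0;
    }

Comparing the number of packets that actually reach pfinet against
total_packets at several such rates would give a drop-rate curve for
each driver.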
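
And a toy model of the queueing behaviour described for the in-kernel
path (not the actual GNU Mach code; QUEUE_SIZE and the packet layout are
made up): the interrupt handler pays the cost of fetching every packet
from the card, but once the fixed-size delivery queue is full that work
is simply discarded, whereas the user-level driver hands each received
packet straight to pfinet.

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdio.h>

    #define QUEUE_SIZE 64                  /* illustrative capacity */

    struct packet { size_t len; char data[1514]; };

    static struct packet queue[QUEUE_SIZE];
    static size_t tail, count;

    /* Called for every packet the card hands to the interrupt handler;
       returns false when the queue is full and the packet is dropped.  */
    static bool
    enqueue_packet (const struct packet *p)
    {
      if (count == QUEUE_SIZE)
        return false;                      /* work already done is wasted */
      queue[tail] = *p;
      tail = (tail + 1) % QUEUE_SIZE;
      count++;
      return true;
    }

    int
    main (void)
    {
      struct packet p = { .len = 60 };
      int dropped = 0;

      /* A burst far larger than the queue while the delivery thread gets
         no chance to drain it: everything past the first QUEUE_SIZE
         packets is lost after the CPU already spent time receiving it.  */
      for (int i = 0; i < 1000; i++)
        if (!enqueue_packet (&p))
          dropped++;

      printf ("dropped %d of 1000 packets\n", dropped);
      return 0;
    }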