On Fri, Jul 15, 2005 at 05:51:41PM -0400, Jamal Hadi Salim ([EMAIL PROTECTED]) wrote:
> Evgeniy,
>
> Some good ideas in there.
> You should talk/sync to Max Krasnyansky (CCed). I think there's a lot of
> stuff you are doing that he is trying to do as well with the new tuntap
> that he is working on. I think put together - some cool ideas can be
> implemented.

The latest version is available at
http://tservice.net.ru/~s0mbre/archive/af_tlb

It has several enhancements, some fixes and many cleanups,
including a fixed and upgraded skb freeing mechanism.

I think this zero-copy mechanism can be used in tun/tap devices too,
although the main idea was to implement a sniffer module that grabs all
traffic as fast as possible. The current code requeues the skb from
prot_hook.func() into a per-socket queue, so it could be used for
a tun/tap device queue too.

> cheers,
> jamal
>
> On Thu, 2005-14-07 at 14:21 +0400, Evgeniy Polyakov wrote:
> > Hello, network developers.
> >
> > I'm pleased to announce the first pre-alpha version
> > of the zero-copy sniffer "device".
> > It acts as a packet socket, i.e. it gets all packets
> > using prot_hook.func(), but never copies them.
> >
> > The basic idea behind zero-copy is remapping of the
> > physical pages where skb->data lives into the
> > userspace process.
> >
> > According to my tests, which can be found commented
> > in the code (packet_mmap()),
> > remapping one page is 5 to 20
> > times faster than copying the same amount of data
> > (i.e. PAGE_SIZE).
> >
> > Since the current VM code requires the PTE to be unmapped
> > when remapping, but only exports unmap_mapping_range()
> > and __flush_tlb(), I used them, although they are quite
> > heavy monsters.
> > It also requires mm->mmap_sem to be held,
> > so I placed the main remapping code into a workqueue.
> >
> > skbs are queued in prot_hook.func() and then the workqueue
> > is scheduled, where the skb is unlinked and remapped.
> > It is not freed there, as it should be, since userspace
> > would never find the real data then; instead
> > some smart algorithm should be investigated to defer skb freeing,
> > or simple deferring using a timer and a redefined skb destructor.
> > It should also remap several skbs at once, so that rescheduling
> > would not happen very frequently.
> >
> > The first mapped page is an information page, where the offset
> > of skb->data within its page is placed, so userspace can detect
> > where the actual data lives on the next page.
> >
> > Such a scheme is very suitable for applications that
> > do not require the whole data flow, but only select some data
> > from the flow, based on packet content.
> > I'm quite sure it will be slower than copying for small packets,
> > so these two ideas must be combined to achieve
> > the maximum sniffer performance.
> >
> > The current code is basically a proof of concept, so
> > it has tons of dirty quirks, and I'm not a VM hacker,
> > so I would gladly listen to your thoughts about the code and
> > the idea itself.
> >
> > Attached files:
> > af_tlb.[ch] - kernel side sniffer implementation.
> > tlb_test.c - userspace "sniffer".
> > Makefile - build kernel side with "all" target and userspace
> > with "test" target.
> >
> > Thank you.
> >
> -
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to [EMAIL PROTECTED]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

-- 
	Evgeniy Polyakov
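
For illustration only, here is a minimal userspace sketch of the
information-page scheme described in the quoted announcement: the first
mapped page carries the offset of skb->data, and the packet bytes live on
the following page. It is not taken from af_tlb.[ch] or tlb_test.c. The
struct layout, the data_len field, the function names and the
copy_threshold parameter are all assumptions made up for this example.
The second helper sketches the "combine copying and remapping" idea,
with a threshold that would have to be measured rather than guessed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical layout of the information page. The announcement only
 * says that the offset of skb->data is stored there; data_len is an
 * assumption added so the reader knows where the packet ends.
 */
struct tlb_info_page {
	uint32_t data_offset;	/* offset of skb->data inside the next page */
	uint32_t data_len;	/* assumed: number of valid packet bytes */
};

/*
 * Given the start of the mapping (information page followed by one data
 * page), return a pointer to the packet bytes and store their length in
 * *len. Returns NULL if the recorded offset or length would run past
 * the end of the data page.
 */
static const uint8_t *tlb_packet_data(const void *map, size_t page_size,
				      size_t *len)
{
	const struct tlb_info_page *info = map;

	assert(map != NULL && len != NULL);

	if (info->data_offset >= page_size)
		return NULL;
	if (info->data_len > page_size - info->data_offset)
		return NULL;

	*len = info->data_len;
	return (const uint8_t *)map + page_size + info->data_offset;
}

enum tlb_delivery { TLB_COPY, TLB_REMAP };

/*
 * Pick a delivery method per packet: copy small packets, remap large
 * ones. copy_threshold is a hypothetical tunable, not a measured value.
 */
static enum tlb_delivery tlb_choose_delivery(size_t pkt_len,
					     size_t copy_threshold)
{
	return pkt_len < copy_threshold ? TLB_COPY : TLB_REMAP;
}
```

A caller would pass the address returned by mmap() on the socket and the
system page size. The bounds checks are there because a reader of a
shared page should not trust an offset blindly, even in a proof of
concept.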