Evgeniy,

Some good ideas in there. 
You should sync with Max Krasnyansky (CCed). I think there's a lot of
stuff you are doing that he is also trying to do with the new tuntap
that he is working on. Put together, I think some cool ideas can be
implemented.

cheers,
jamal

On Thu, 2005-14-07 at 14:21 +0400, Evgeniy Polyakov wrote:
> Hello, network developers.
> 
> I'm pleased to announce the first pre-alpha version
> of the zero-copy sniffer "device".
> It acts as a packet socket, i.e. it gets all packets
> using prot_hook.func(), but never copies them.
> 
> The basic idea behind zero-copy is remapping the
> physical pages where skb->data lives into the
> userspace process.
> 
> According to my tests, which can be found commented
> in the code (packet_mmap()), 
> remapping one page is from 5 up to 20
> times faster than copying the same amount of data
> (i.e. PAGE_SIZE).
> 
> Since the current VM code requires the PTE to be unmapped
> when remapping, but only exports unmap_mapping_range()
> and __flush_tlb(), I used them, although they are quite
> heavy monsters.
> It also requires mm->mmap_sem to be held,
> so I placed the main remapping code into a workqueue.
> 
> skbs are queued in prot_hook.func() and then the workqueue
> is scheduled, where the skb is unlinked and remapped.
> It is not freed there, as it should be, since userspace
> would then never find the real data; instead,
> some smart algorithm should be investigated to defer skb freeing,
> or simple deferring using a timer and a redefined skb destructor.
> It should also remap several skbs at once, so rescheduling
> would not happen very frequently.
> The first mapped page is an information page, where the offset in page
> of skb->data is placed, so userspace can detect
> where the actual data lives on the next page.
> 
> Such a scheme is very suitable for applications that
> do not require the whole data flow, but only select some data
> from the flow, based on packet content.
> I'm quite sure it will be slower than copying for small packets,
> so these two ideas must be combined to achieve
> the maximum sniffer performance.
> 
> The current code is basically a proof of concept, so
> it has tons of dirty quirks, and I'm not a VM hacker,
> so I would gladly listen to your thoughts about the code and the idea itself.
> 
> Attached files:
> af_tlb.[ch] - kernel side sniffer implementation.
> tlb_test.c - userspace "sniffer".
> Makefile - build kernel side with "all" target and userspace
> with "test" target.
> 
> Thank you.
> 

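For readers following the quoted description of the information page, here
is a minimal userspace sketch of how a consumer might locate the packet
data. It is not taken from af_tlb.[ch] or tlb_test.c. The struct zc_info
layout, the zc_* names, and the fixed 4096-byte page size are all
assumptions made for illustration; the mail only says that the first
mapped page holds the in-page offset of skb->data and that the data lives
on the next page. The two pages are simulated in ordinary memory instead
of coming from mmap().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: fixed page size; real code would query sysconf(_SC_PAGESIZE). */
#define ZC_PAGE_SIZE 4096u

/* Hypothetical info-page layout. The mail only states that the offset of
 * skb->data inside its page is stored here; the field name and width are
 * invented for this sketch. */
struct zc_info {
    uint32_t data_offset;
};

/* Given a region whose first page is the info page and whose second page
 * holds the remapped skb data, return a pointer to the packet bytes.
 * Returns NULL if the stored offset does not fit inside one page. */
static const unsigned char *zc_packet_data(const unsigned char *region)
{
    struct zc_info info;

    /* memcpy avoids any alignment assumption about the mapping. */
    memcpy(&info, region, sizeof(info));
    if (info.data_offset >= ZC_PAGE_SIZE)
        return NULL;
    return region + ZC_PAGE_SIZE + info.data_offset;
}

/* Simulate the two mapped pages in a static buffer and read the packet
 * back through zc_packet_data(). Returns 1 on success, 0 on failure. */
static int zc_demo(void)
{
    static unsigned char region[2 * ZC_PAGE_SIZE];
    const unsigned char payload[] = "packet";
    struct zc_info info = { .data_offset = 66 };
    const unsigned char *p;

    memset(region, 0, sizeof(region));
    memcpy(region, &info, sizeof(info));
    memcpy(region + ZC_PAGE_SIZE + info.data_offset, payload, sizeof(payload));

    p = zc_packet_data(region);
    return p != NULL && memcmp(p, payload, sizeof(payload)) == 0;
}

/* Same simulation, but with an offset that points past the data page. */
static int zc_demo_bad_offset(void)
{
    static unsigned char region[2 * ZC_PAGE_SIZE];
    struct zc_info info = { .data_offset = ZC_PAGE_SIZE };

    memcpy(region, &info, sizeof(info));
    return zc_packet_data(region) == NULL;
}
```

In a real consumer the region pointer would come from mmap() on the
sniffer socket, and the bounds check matters because the offset is
supplied by the other side of the mapping.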