Hello, finally got some time to test...
Table w. 214k routes with full rDoS on two interfaces on 2 x AMD64 processors,
speed 2814.43 MHz. Profiled with CPU_CLK_UNHALTED and rtstat
w/o the latest fib_trie patches. Tput ~233 kpps
samples  %        symbol name
109925 14.4513 fn_trie_lookup
Stephen Hemminger writes:
> Dumping by prefix is possible, but unless 32x slower. Dumping in
> address order is just as logical. Like I said, I'm investigating what
> quagga handles.
How about taking a snapshot in address order (as you did) into some
allocated memory, and returning from that
Stephen Hemminger writes:
> Time to handle a full BGP load (163K of routes).
>
> Before:     Load  Dump  Flush
>
> kmem_cache  3.8   13.0  7.2
> iter        3.9   12.3  6.9
> unordered   3.1   11.9
or ~3 hours with very high load and
interface up/down every 5th second. Without the patch the IRQs get
disabled within a couple of seconds
A resolute way of handling the semaphores. :)
Signed-off-by: Robert Olsson <[EMAIL PROTECTED]>
Cheers
David Miller writes:
> > eth0 e1000_irq_enable sem = 1<- ifconfig eth0 down
> > eth0 e1000_irq_disable sem = 2
> >
> > **e1000_open <- ifconfig eth0 up
> > eth0 e1000_irq_disable sem = 3 Dead. irq's can't be enabled
> > e1000_irq_enable miss
> > eth0 e1000_irq
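The trace above reads like a nesting counter that has gone out of balance. A minimal sketch of that failure mode, assuming that each disable increments a counter and that enable only unmasks interrupts once the counter is back at zero. The class and method names are hypothetical and this is a toy model, not the driver's code:

```python
class IrqGuard:
    """Toy model of a nesting counter that guards interrupt enable/disable."""

    def __init__(self):
        self.sem = 1          # assumption: device starts with interrupts masked
        self.enabled = False

    def irq_disable(self):
        # Every disable bumps the counter, whether or not it was already masked.
        self.sem += 1
        self.enabled = False

    def irq_enable(self):
        # Interrupts are only unmasked when the counter drops back to zero.
        self.sem -= 1
        if self.sem == 0:
            self.enabled = True
        return self.enabled
```

With two disables in a row and only one enable, the counter stays above zero and interrupts remain masked, which is the "miss" the trace shows.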
David Miller writes:
> > On Wednesday 16 January 2008, David Miller wrote:
> > > Ok, here is the patch I'll propose to fix this. The goal is to make
> > > it as simple as possible without regressing the thing we were trying
> > > to fix.
> >
> > Looks good to me. Tested with -rc8.
>
> T
Stephen Hemminger writes:
> This is how I did it:
Yes looks like an elegant solution. Did you even test it?
Maybe we see some effects in just dumping a full table?
Anyway lookup should be tested in some way. We can do a lot
of analyzing before getting to the right entry, local_table
backtracking
Eric Dumazet writes:
>
> So you think that a leaf cannot have 2 infos, one 'embeded' and one in the
> list ?
Hello,
The model I thought of is to have either:
1) One leaf_info embedded in leaf. A fast-path leaf. FP-leaf
Or
2) The intact old leaf_info list with an arbitrary number of leaf
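The either/or model above can be sketched roughly as follows. This is only an illustration under assumptions: the class, field and method names are hypothetical, entries are plain (prefix length, info) tuples, and duplicate prefix lengths are not handled.

```python
class Leaf:
    """Toy leaf: either one embedded entry (fast path) or a list of entries."""

    def __init__(self):
        self.embedded = None   # fast path: a single (plen, info) tuple
        self.infos = []        # slow path: entries sorted by plen, longest first

    def is_fast_path(self):
        return self.embedded is not None

    def insert(self, plen, info):
        if self.embedded is None and not self.infos:
            # First prefix on this leaf: keep it embedded.
            self.embedded = (plen, info)
            return
        if self.embedded is not None:
            # Second prefix arrives: leave the fast path, move to the list.
            self.infos.append(self.embedded)
            self.embedded = None
        self.infos.append((plen, info))
        self.infos.sort(key=lambda entry: entry[0], reverse=True)

    def delete(self, plen):
        if self.embedded is not None:
            if self.embedded[0] == plen:
                self.embedded = None
            return
        self.infos = [entry for entry in self.infos if entry[0] != plen]
        if len(self.infos) == 1:
            # Back to a single prefix: return to the fast path.
            self.embedded = self.infos.pop()
```

Adding two prefixes and then deleting the first leaves the remaining prefix embedded again, so the leaf never holds both an embedded entry and a list at the same time.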
Stephen Hemminger writes:
> Okay, I would rather see the leaf_info explicit inside the leaf, also
> your scheme probably breaks if I add two prefixes and then delete the first.
> Let me have a go at it.
I took Eric's patch a bit further...
Support for delete and dump is needed before any
Eric Dumazet writes:
> > That's 231173/241649 = 96% with the current Internet routing.
> >
> > How about if we would have a fastpath and store one entry directly in the
> > leaf struct, to avoid loading the leaf_info list in most cases?
> >
> > One could believe that both lookup and dump cou
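The 96% figure quoted above is the ratio of leaves to routes. A short worked calculation, assuming both numbers describe the same table, also gives a lower bound on how many leaves can hold only one prefix (variable names are just for illustration):

```python
leaves = 231173   # "Leaves:" from the trie statistics
routes = 241649   # output of `ip route | wc -l`

# Ratio quoted in the mail, roughly 0.957.
leaf_to_route_ratio = leaves / routes

# Every leaf holds at least one prefix, so at most (routes - leaves)
# leaves can hold a second or further prefix.
max_multi_prefix_leaves = routes - leaves

# Lower bound on the share of leaves with exactly one prefix.
min_single_prefix_share = (leaves - max_multi_prefix_leaves) / leaves
```

Under that assumption at least about 95% of the leaves carry a single prefix, which is the case a fast path would serve.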
David Miller writes:
> > The revision element must have been part of an earlier design,
> > because currently it is set but never used.
> >
> > Signed-off-by: Stephen Hemminger <[EMAIL PROTECTED]>
>
> I suspect Robert wanted to play around with some generation
> ID optimizations but never
Thanks for hacking on and improving the trie... another idea that could
also be tested. If we look into the routing table we see that most leaves
only have one prefix
Main:
Aver depth: 2.57
Max depth: 7
Leaves: 231173
ip route | wc -l
241649
Thats 231173/24
t) node size.
You see the comment in the code is correct, so the unsigned short is a leftover
from old testing which could have hit us hard as the routing table slowly grows.
Cheers
--ro
Signed-off-by: Robert Olsson <[EMAIL PROTECTED]>
>
David Miller writes:
> > Is the netif_running() check even required?
>
> No, it is not.
>
> When a device is brought down, one of the first things
> that happens is that we wait for all pending NAPI polls
> to complete, then block any new polls from starting.
Hello!
Yes but the reaso
quota.
I've found a bunch of drivers having this bug.
Cheers.
--ro
Signed-off-by: Robert Olsson <[EMAIL PROTECTED]>
diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c
index cf39473..f4137ad 100644
--- a/driver
No it doesn't. besides napi_disable and napi_synchronize are identical.
I was trying to disarm interrupts this way too.
The patch I sent yesterday is the only cure so far but I don't know if
it's 100% bulletproof either.
I was stress-testing the patch but ran into new problems...(schedul
Stephen Hemminger writes:
> It is considered a driver bug in 2.6.24 to call netif_rx_complete (clear
> NAPI_STATE_SCHED) and do a full quota. That bug already had to be fixed
> in other drivers, look like e1000 has same problem.
From what I see the problem is not related to ->poll. But i
Kok, Auke writes:
>
> Robert, please give that patch a try (it fixes a crash that I had here as
> well) and let us know if it works for you.
No it doesn't cure the problem I've reported
Cheers.
--ro
BTW. You can try to verify the probl
... one variant is below:
Signed-off-by: Robert Olsson <[EMAIL PROTECTED]>
diff --git a/net/core/dev.c b/net/core/dev.c
index 043e2f8..1031233 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2207,7 +2207,7 @@ static void net_rx_action(struct softirq_action *h)
* still "
Hi,
Steve Wise writes:
> I think pktgen should be cloning the skbs using skb_clone(). Then it
> will work for all devices, eh?
pktgen assumes for "fastpath" sending exclusive ownership of
the skb. And does a skb_get to avoid final skb destruction so
the same skb can be sent over and over
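The reference-count trick described above can be modelled in a few lines. This is a toy sketch, not kernel code: `Skb` is a hypothetical stand-in object, and `skb_get` / `kfree_skb` here only imitate the counting behaviour of the kernel helpers of the same names.

```python
class Skb:
    """Toy packet buffer with a user count."""

    def __init__(self):
        self.users = 1      # the sender's own reference
        self.freed = False


def skb_get(skb):
    # Take an extra reference so the next free does not destroy the buffer.
    skb.users += 1
    return skb


def kfree_skb(skb):
    # Drop one reference; the buffer only goes away with the last one.
    skb.users -= 1
    if skb.users == 0:
        skb.freed = True


def send_repeatedly(skb, count, xmit):
    # Each transmit gets its own reference, which the driver drops when done.
    for _ in range(count):
        xmit(skb_get(skb))
```

This only works while the sender really has exclusive ownership: anything that modifies the buffer during transmit changes every later copy as well.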
this change for now.
Cheers
--ro
Signed-off-by: Robert Olsson <[EMAIL PROTECTED]>
diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c
index 7b0bcdb..5cb883a 100644
--- a/drivers/net/e1000/e1000_main.c
+++ b/drivers/net/
jamal writes:
> On Tue, 2007-28-08 at 21:43 -0700, Mandeep Singh Baines wrote:
> I think its a good thing pktgen caught this; i am unsure however if it
> is doing the right thing. Hoping Robert would respond.
> One thing pktgen could do is restrict the amount of outstanding buffers
> by usin
Christoph Hellwig writes:
> > Hello, It's not a job for pktgen.
>
> Please also kill the do_softirq export while you're at it.
Right seems like pktgen luckily was the only user.
Cheers
--ro
Signed-off-by: Robert Olsson <
Hello, It's not a job for pktgen.
Cheers
--ro
Signed-off-by: Robert Olsson <[EMAIL PROTECTED]>
diff --git a/net/core/pktgen.c b/net/core/pktgen.c
index 18601af..975e887 100644
--- a/net/core/pktgen.c
+++ b/net/core/pktgen.c
@@ -1
Hello,
Below some pktgen support to send into different TX queues.
This can of course be fed into input queues on other machines
Cheers
--ro
Signed-off-by: Robert Olsson <[EMAIL PROTECTED]>
diff --git a/net/core/pktgen.c b/net/core/pktgen.c
Hello,
Initially pkt_dev can be NULL; this causes netif_subqueue_stopped to
oops. The patch below should cure it. But maybe the pktgen TX logic
should be reworked to better support the new multiqueue support.
Cheers
--ro
Signed-off-by: Robert Olsson
Stephen Hemminger writes:
> I don't have a machine with anywhere near enough routes to test this,
> so would someone with many routes give it a go and make sure nothing
> got busted in the process.
Hello!
It's not only the number of routes that's important...
Anyway I've done what can
David Miller writes:
> From: Stephen Hemminger <[EMAIL PROTECTED]>
> Date: Thu, 26 Jul 2007 09:46:48 +0100
>
> > Try this out:
> > * replace macro's with inlines
> > * get rid of places doing multiple evaluations of NODE_PARENT
>
> No objections from me.
>
> Robert?
Fine i
jamal writes:
> What is your kernel config in regards to HRES timers? Robert mentioned
> to me that the clock source maybe causing issues with pktgen (maybe even
> qos). Robert, insights?
pktgen heavily uses gettimeofday. I was using tsc as clock source with
our opterons in the lab. In lat
jamal writes:
> I think the one described by Leonid has not just 8 tx/rx rings but also
> a separate register set, MSI binding etc iirc. The only shared resources
> as far as i understood Leonid are the bus and the ethernet wire.
AFAIK most new NIC will look like this...
I still lack a
jamal writes:
> The key arguement i make (from day one actually) is to leave the
> majority of the work to the driver.
> My view of wireless WMM etc is it is a different media behavior
> (compared to wired ethernet) which means a different view of strategy
> for when it opens the valve to al
ECTED]>
Signed-off-by: Robert Olsson <[EMAIL PROTECTED]>
Cheers
--ro
>
> diff --git a/net/core/pktgen.c b/net/core/pktgen.c
> index bc4fb3b..bcec8e4 100644
> --- a/net/core/pktgen.c
> +++ b/net/core/pktgen.c
> @@ -152,6
ll guess the ipsec part is to be considered work-in-progress
and you're doing both the work and the progress.
Signed-off-by: Robert Olsson <[EMAIL PROTECTED]>
Cheers
--ro
> diff --git a/net/core/pktgen.c b/net/core/p
efault all flows in pktgen are randomly selected.
> This patch introduces ability to have all defined flows to
> be sent sequentially. Robert defined randomness to be the
> default behavior.
>
> Signed-off-by: Jamal Hadi Salim <[EMAIL PROTECTED
my area.....
Acked-by: Robert Olsson <[EMAIL PROTECTED]>
Cheers
--ro
>
> diff --git a/include/net/xfrm.h b/include/net/xfrm.h
> index 311f25a..79d2c37 100644
> --- a/include/net/xfrm.h
> +++ b/include/net/xfrm.h
>
Paul E. McKenney writes:
> Those of us who dive into networking only occasionally would much
> appreciate this. ;-)
No problem here...
Cheers
--ro
Acked-by: Robert Olsson <[EMAIL PROTECTED]>
Signed-off-by: Paul E. McKenney <[
Paul E. McKenney writes:
> > We have two users of trie_leaf_remove, fn_trie_flush and fn_trie_delete
> > both are holding RTNL. So there shouldn't be need for this preempt stuff.
> > This is assumed to be a leftover from an older RCU-take.
>
> True enough! One request -- would it be reasona
nchronize_rcu of course,
> but with tnode_free using call_rcu it seems to be completely
> unnecessary. So I guess we can simply remove it.
Signed-off-by: Patrick McHardy <[EMAIL PROTECTED]>
Signed-off-by: Robert Olsson <[EMAIL PROTECTED]>
Cheers.
nodes: 48235
1: 23812 2: 10023 3: 8089 4: 3972 5: 2332 6: 6 19: 1
Pointers: 815276
Null ptrs: 561512
Total size: 8330 kB
Signed-off-by: Robert Olsson <[EMAIL PROTECTED]>
diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
index 7ef5948..1560c54 100644
--- a/ne
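The trie statistics above can be cross-checked with a little arithmetic. The sketch assumes that each "N: count" pair means "count internal nodes with N index bits", so that such a node has 2^N child pointers. Variable names are only for illustration.

```python
# "bits: count" pairs copied from the statistics above.
tnodes_by_bits = {1: 23812, 2: 10023, 3: 8089, 4: 3972, 5: 2332, 6: 6, 19: 1}
null_pointers = 561512

# Number of internal nodes: sum of the per-size counts.
total_nodes = sum(tnodes_by_bits.values())

# A node with `bits` index bits has 2**bits child slots.
total_pointers = sum(count << bits for bits, count in tnodes_by_bits.items())

# Share of child slots that actually point at something.
used_pointers = total_pointers - null_pointers
fill_ratio = used_pointers / total_pointers
```

Both totals match the printed values (48235 nodes, 815276 pointers), and the single 19-bit root alone accounts for 524288 of those pointers.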
Hello.
The patch below adds break condition for the resize operations. If
we don't achieve the desired fill factor a warning is printed. Trie
should still be operational but new thresholds should be considered.
Cheers
--ro
Signed-off-by: Robert O
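The idea of a break condition for resize can be sketched as a bounded loop. This is not the patch itself: the function name, the `fill_of` callback, the threshold and the round limit are all hypothetical, and the node is reduced to its number of index bits.

```python
import warnings


def resize(bits, fill_of, inflate_threshold, max_rounds=10):
    """Inflate a node until its fill factor drops below the threshold.

    Gives up after max_rounds inflations and warns instead of looping on.
    Returns (bits, reached_target).
    """
    rounds = 0
    while fill_of(bits) >= inflate_threshold:
        if rounds == max_rounds:
            warnings.warn("resize gave up: consider new thresholds")
            return bits, False
        bits += 1          # inflate: double the number of child slots
        rounds += 1
    return bits, True
```

The structure stays usable in the give-up case; the warning only signals that the thresholds may need another look.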
David Miller writes:
> Even a nearly perfect hash has small lumps in distribution, and we
> should not penalize entries which fall into these lumps.
>
> Let us call T the threshold at which we would grow the routing hash
> table. As we approach T we start to GC. Let's assume hash table
>
Eric Dumazet writes:
> With 2^20 entries, your actual limit of 2^19 entries in root node will
> probably show us quite different numbers for order-1,2,3,4... tnodes
Yep, the trie will get deeper and lookup more costly, as will insert and delete.
The 2^19 was because we were getting memory allocation problems
Richard Kojedzinszky writes:
> Sorry for sending the tgz with .svn included. And i did not send
> instructions.
> To do a test with fib_trie, issue
> $ make clean all ROUTE_ALG=TRIE & ./try a
> with fib_radix:
> $ make clean all ROUTE_ALG=RADIX & ./try a
> with fib_lef:
> $ make clean al
Eric Dumazet writes:
> Indeed. It would be nice to see how it performs with say 2^20 elements...
> Because with your data, I wonder if the extra complexity of the trash is
> worth it (since most lookups are going to only hit the hash and give the
> answer without intermediate nodes)
I
Eric Dumazet writes:
> Well, maybe... but after looking robert's trash, I discovered its model is
> essentially a big (2^18 slots) root node (our hash table), and very few
> order:1,2,3 nodes.
It's getting "hashlike" yes. I guess all effective algorithms today are doing
some sort of "inde
David Miller writes:
Interesting.
> Actually, more accurately, the conflict exists in how this GC
> logic is implemented. The core issue is that hash table size
> guides the GC processing, and hash table growth therefore
> modifies those GC goals. So with the patch below we'll just
> k
Richard Kojedzinszky writes:
> traffic, and also update the routing table (from BGP), the route cache
> seemed to be the bottleneck, as upon every fib update the whole route
> cache is flushed, and sometimes it took as many cpu cycles to let some
> packets being dropped. Meanwhile i knew t
Michael K. Edwards writes:
> This, incidentally, seems very similar to the process that Robert
> Olsson and Stefan Nilsson have gone through with their trie/hash
> project. Although I haven't tried it out yet and don't have any basis
> for an independent opinion,
Yes it seems to handle dev name change. So configuration scripts should
use ifindex now :)
Signed-off-by: Robert Olsson <[EMAIL PROTECTED]>
Cheers.
--ro
Stephen Hemminger writes:
> Since devices can change name and other weirdness, don
OK!
Signed-off-by: Robert Olsson <[EMAIL PROTECTED]>
Cheers.
--ro
Stephen Hemminger writes:
> The existing htonl() macro is smart enough to do the same code as
> using __constant_htonl() and it looks cleaner.
>
> Signed-off-by:
Thanks!
It seems like network code has preference for net_random() but they
are the same now.
Signed-off-by: Robert Olsson <[EMAIL PROTECTED]>
Cheers.
--ro
Stephen Hemminger writes:
> Can use random32() now.
>
> Signed-off-by: St
Thanks!
Signed-off-by: Robert Olsson <[EMAIL PROTECTED]>
--ro
Stephen Hemminger writes:
> Remove private debug macro and replace with standard version
>
> Signed-off-by: Stephen Hemminger <[EMAIL PROTECTED]>
>
>
David Miller writes:
> But what about if tree lookup were free :-)
>
> This is why I consider Robert Olsson's trash work the most promising,
> if we stick sockets into his full flow identified routing cache trie
> entries, we can eliminate lookup altogether.
>
> Just like how he already
Andi Kleen writes:
> > If not, you lose.
>
> It all depends on if the higher levels on the trie are small
> enough to be kept in cache. Even with two cache misses it might
> still break even, but have better scalability.
Yes the trick is to keep the root large to allow a very flat tree and few
Hello!
K. Salah writes:
> I have a question about the quota per poll in NAPI. Any idea how the
> quota of 64 packets per poll per NIC was selected? Why 64, not any
> other number? Is there science behind this number.
The number comes from experimentation. The "science" is that it's
po
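How a per-poll quota bounds the work done in one pass can be sketched like this. It is a toy model, not driver code: the receive ring is a plain deque, and the function name and return convention are hypothetical.

```python
from collections import deque


def poll(rx_ring, budget=64):
    """Process at most `budget` packets from the ring in one pass.

    Returns (work_done, done). `done` is only True when the pass used
    less than the full budget, i.e. the ring was seen to be empty.
    """
    work_done = 0
    while work_done < budget and rx_ring:
        rx_ring.popleft()      # stand-in for handing the packet up the stack
        work_done += 1
    done = work_done < budget
    return work_done, done
```

A ring with 100 packets takes two passes: the first uses the full quota and has to be polled again, the second drains the remaining 36 and can finish. A pass that consumes exactly the quota is never treated as finished.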
just a single leaf this gets printed as belonging to the
local table in /proc/net/fib_trie. A fix is below.
Cheers.
--ro
Signed-off-by: Robert Olsson <[EMAIL PROTECTED]>
--- net-2.6.20/net/ipv4/fib_trie.c.orig 2007-01-26 13:18:13.
Hello!
Yes the case when the trie is just a single leaf went wrong with the iterator
and your patch cures it. I think we have a similar problem with /proc/net/fib_trie
Cheers
--ro
Signed-off-by: Robert Olsson <[EMAIL PROTECTED]>
Alexey Dobriyan writes:
>
> Confused now. Is my "t->control &= ~(T_TERMINATE);" fix deprecated by
> completions?
It's not needed with completion patch as this does the job a bit more
mainstream. The T_TERMINATE seems to work well I've tested on machine
with CPU:s. Only once I noticed tha
David Miller writes:
> Agreed.
>
> Robert, please fix this by using a completion so that we can
> wait for the threads to start up, something like this:
Included. It passes my test but Alexey and others test.
Cheers.
--ro
diff --git a/net/core/pktgen.c b/net
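The wait-for-startup idea can be shown in user space with `threading.Event` playing the role of a completion. This is an analogy only, not the pktgen patch; the function names are made up for the sketch.

```python
import threading


def start_worker():
    """Start a worker and return only once it is actually running."""
    started = threading.Event()
    stop = threading.Event()

    def worker():
        started.set()      # signal: the thread is up
        stop.wait()        # stand-in for the worker's main loop

    thread = threading.Thread(target=worker)
    thread.start()
    started.wait()         # block until the worker has signalled
    return thread, stop


def stop_worker(thread, stop):
    """Ask the worker to exit and wait for it."""
    stop.set()
    thread.join()
```

Because `start_worker` does not return before the thread has signalled, a teardown that follows immediately always finds a fully started worker.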
Hello!
Seems you found a race when rmmod is done before it's fully started
Try:
diff --git a/net/core/pktgen.c b/net/core/pktgen.c
index 733d86d..ac0b4b1 100644
--- a/net/core/pktgen.c
+++ b/net/core/pktgen.c
@@ -160,7 +160,7 @@
#include /* do_div */
#include
-#define VERSION "p
Ben Greear writes:
> > Changes:
> > * use a nano-second timer based on the scheduler timer (TSC) for relative
> > times, instead of get_time_of_day.
Seems I forgot to set tsc as clocksource. It makes a difference. Performance is
normal and I'm less confused.
e1000 82546GB @ 1.6 GHz Opteron.
jamal writes:
> If you are listening then start with:
>
> 1) Do a simple test with just udp traffic as above, doing simple
> accounting. This helps you to get a feel on how things work.
> 2) modify the matching rules to match your magic cookie
> 3) write a simple action invoked by your mat
Ben Greear writes:
> It requires a hook in dev.c, or at least that is the only way I can
> think to implement it.
Well the hook can be placed anywhere along the packet path, even in drivers. In
tulip I didn't even take packets off the ring in some experiments.
And there are plenty of existing hooks already i
Ben Greear writes:
> I'd be thrilled to have the receive logic go into pktgen, even if it was
> #if 0 with a comment showing how to patch dev.c to get it working. It
> would make my out-of-tree patch smaller and should help others who are
> doing research and driver development...
Ju
Ben Greear writes:
> I've completed the first pass of my changes to pktgen in 2.6.18.
> Many of these features are probably DOA based on previous conversations,
> but perhaps this will help someone
Thanks. Well sometimes there is a need to capture and drop pkts at various
points, so it
Ben Greear writes:
> I'm planning to re-merge my long-lost pktgen branch with the kernel
> tree's pktgen.
>
> I believe the main difference is that my out-of-tree pktgen does not do the
> busy-spin, but waits on a queue for the net-device to wake it's tx-queue
> when over-driving a NIC.
>
--ro
Signed-off-by: Francesco Fondelli <[EMAIL PROTECTED]>
Signed-off-by: Robert Olsson <[EMAIL PROTECTED]>
--- new/net/core/pktgen.c.vlan 2006-09-26 14:53:34.0 +0200
+++ new/net/core/pktgen.c 2006-09-26 14:54:49.0 +0200
@@ -160,7 +160
oks fine but haven't been able to test Q-in-Q. Below is a git diff.
We ask Dave to apply.
Cheers.
--ro
Signed-off-by: Francesco Fondelli <[EMAIL PROTECTED]>
Acked-by: Steven Whitehouse <[EMAIL PROTECTED]>
Signed-off-by: Robert Olsson <
And the docs.
Cheers.
--ro
Signed-off-by: Francesco Fondelli <[EMAIL PROTECTED]>
Signed-off-by: Robert Olsson <[EMAIL PROTECTED]>
diff --git a/Documentation/networking/pktgen.txt b/Documentation/networking/pktgen.txt
index 278771c..60
Francesco Fondelli writes:
> The attached patch allows pktgen to produce 802.1Q and Q-in-Q tagged frames.
> I have used it for stress test a bridge and seems ok to me.
> Unfortunately I have no access to net-2.6.x git tree so the diff is against
> 2.6.17.13.
> If you have a moment look over
Andi Kleen writes:
> The reason I'm asking is that we still have trouble with the TCP hash tables
> taking far too much memory, and your new data structure might provide a nice
> alternative.
Yes it's dynamic and selftuning so no need to reserve memory in advance and still
comparable perfor
Andi Kleen writes:
> On Monday 04 September 2006 13:43, Robert Olsson wrote:
> >
> > Hello.
> >
> > People on this list might find this paper interesting:
> > http://www.csc.kth.se/~snilsson/public/papers/trash/
>
> Looks nice. Have you look
Hello.
People on this list might find this paper interesting:
http://www.csc.kth.se/~snilsson/public/papers/trash/
Abstract is below. Feel free to redistribute.
Cheers.
--ro
TRASH - A dynamic LC-trie and hash data structure
Robert Olsson and Stefan
jamal writes:
> The default setup on the e1000 has rx flow control turned on.
> I was sending at wire rate gige from the device - which is about
> 1.48Mpps. The e1000 was in turn sending me flow control packets
> as per default/expected behavior. Unfortunately, it was sending
> a very large
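The 1.48 Mpps figure for gigabit wire rate follows from the per-frame cost on the wire. A worked calculation, assuming minimum-size 64-byte frames plus 20 bytes of per-frame overhead (8 bytes preamble and start delimiter, 12 bytes inter-frame gap); the function name is only for illustration:

```python
def max_pps(link_bps, frame_bytes, overhead_bytes=20):
    """Upper bound on packets per second for a given link speed and frame size."""
    bits_per_frame = (frame_bytes + overhead_bytes) * 8
    return link_bps / bits_per_frame


# 64-byte frames on a 1 Gbit/s link: 84 bytes, i.e. 672 bits, per frame.
gige_min_frame_pps = max_pps(10**9, 64)
```

That gives about 1488095 packets per second, matching the "about 1.48Mpps" in the quoted mail.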
Hello!
Yes seems the system is very loaded for some reason
> > sometimes a day) we get 100% usage on ksoftirqd/0 and following messages in logs:
as all softirq's are run via ksoftirqd. That's still OK but why doesn't the
watchdog get any CPU share at all? Mismatch in priorities?
Herbert X
Stephen Hemminger writes:
> I also noticed that you really don't save much by doing TX cleaning at
> hardirq, because in hardirq you need to do dev_kfree_irq and that causes
> a softirq (for the routing case where users=1). So when routing it
> doesn't make much difference, both methods ca
jamal writes:
> Latency-wise: TX completion interrupt provides the best latency.
> Processing in the poll() -aka softirq- was almost close to the hardirq
> variant. So if you can make things run in a softirq such as transmit
> one, then the numbers will likely stay the same.
I don't rememb
Hello!
Seems like leaves (end-nodes) have been freed by __tnode_free_rcu and not by
__leaf_free_rcu. This fixes the problem. Only tnode_free is now used, which
checks for the appropriate node type. free_leaf can be removed.
Signed-off-by: Robert Olsson <[EMAIL PROTECTED]>
ss Dave will apply it.
Signed-off-by: Robert Olsson <[EMAIL PROTECTED]>
Thanks.
--ro
>
> Signed-off-by: Steven Whitehouse <[EMAIL PROTECTED]>
>
> diff --git a/Documentation/networking/pktgen.txt b/Documentation/network
Steven Whitehouse writes:
> I've been looking into MPLS recently and so one of the first things that
> would be useful is a testbed to generate test traffic, and hence the
> attached patch to pktgen.
>
> If you have a moment to look over it, then please let me know if you
> would give it y
Luiz Fernando Capitulino writes:
> Well, I wouldn't say it's _really_ needed. But it really avoids having
> too many thread entries in the pktgen's /proc directory, and as a good
> result, you will not have pending threads which will never run as well.
>
> Also note that the patch is triv
Hello!
Really needed?
If so -- Wouldn't a bitmask that also controls which CPUs run the
threads be more general?
Cheers.
--ro
Luiz Fernando Capitulino writes:
>
> Currently, pktgen will create one thread for each online CPU in the
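The bitmask suggestion above could look roughly like this: bit N of the mask decides whether a thread is started on CPU N. The function name and the way the mask is passed in are hypothetical; this only shows the selection logic.

```python
def cpus_from_mask(mask, online_cpus):
    """Return the online CPUs whose bit is set in `mask`."""
    return [cpu for cpu in online_cpus if (mask >> cpu) & 1]


# Example: on a 4-CPU box, mask 0b0101 selects CPU 0 and CPU 2.
selected = cpus_from_mask(0b0101, range(4))
```

A mask with all bits set gives the one-thread-per-online-CPU behaviour, so a mask covers both cases.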
Jesse Brandeburg writes:
>
> I looked quickly at this on a couple different machines and wasn't
> able to reproduce, so don't let me block the patch. I think its a
> good patch FWIW
OK!
We ask Dave to apply it.
Cheers.
--ro
fib_triestats has been buggy and caused oopses on some platforms such as OpenWrt.
The patch below should cure those problems.
Cheers.
--ro
Signed-off-by: Robert Olsson <[EMAIL PROTECTED]>
--- linux-2.6.16-rc4/net/ipv4/fib_trie.c.061021 2006-02
Hello!
In some kernel configs /proc functions seem to be accessed before the trie
is initialized. The patch below checks for this.
Cheers.
--ro
Signed-off-by: Robert Olsson <[EMAIL PROTECTED]>
--- linux-2.6.16-rc4/net/ipv4/fib_trie.c.orig
Arthur Kepner writes:
Thanks. These races should be cured now. I've tested some runs and
it works, but I didn't see any problems before either. We'll hear from
Jesse if this cured his problems.
Cheers.
--ro
Signed-off-by: Robert
Arthur Kepner writes:
>
> Let's try this again. How does this look, Robert?
Yeep better
> if(remove) {
> +t->control |= T_REMDEV;
> +pkt_dev->removal_mark = 1;
>
Arthur Kepner writes:
> There's a race in pktgen which can lead to a double
> free of a pktgen_dev's skb. If a worker thread is in
> the midst of doing fill_packet(), and the controlling
> thread gets a "stop" message, the already freed skb
> can be freed once again in pktgen_stop_device().
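One general way to rule out this kind of double free is to take ownership of the buffer under a lock and clear the shared pointer, so that only one caller ever gets to free it. The sketch below shows that technique in user space. It is not the approach of the patch under discussion, and all names are hypothetical.

```python
import threading


class PktDev:
    """Toy device holding one buffer that must be freed exactly once."""

    def __init__(self):
        self.lock = threading.Lock()
        self.skb = object()    # stand-in for the device's packet buffer
        self.frees = 0         # how many times the buffer was freed

    def release_skb(self):
        with self.lock:
            # Take the buffer and clear the shared pointer in one step.
            skb, self.skb = self.skb, None
            if skb is not None:
                self.frees += 1    # stand-in for the actual free
```

However many threads call `release_skb`, only the first one sees a non-empty pointer, so the free happens once.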
Leonid Grossman writes:
> Right. Interrupt moderation is done on per channel basis.
> The only addition to the current NAPI mechanism I'd like to see is to
> have NAPI setting desired interrupt rate (once interrupts are ON),
> rather than use an interrupt per packet or a driver default. Argu
V1 are:
>
> 1. More fixes made by hand after Lindent run
> 2. Re-diffed against Dave's net-2.6.17 tree
Should be fine I've used the previous version of the patches for a
couple of days now. Thanks.
Signed-off-by:
nt to -1, and since the disable
> occurs inside the ifindex test, it is not protecting
> that either.
The preempts are removed and an updated version of the patch is enclosed.
Cheers.
--ro
Signed-off-by: Robert Olsson <[EMAIL PROTECTED]>
Benjamin LaHaise writes:
> Right. Btw, looking over your changes for skb reuse and how the e1000
> lays out its data structures, I think there is still some room for
> improvement: currently you still touch the skb that the e1000 used to
> allocate the receive buffers in. That cacheline
jamal writes:
> > Here we process (drop) about 13% packets more when skb's get reused.
> >
> Very cool.
> Robert, it would be interesting to see something more interesting
> (longer code path) such as udp. You can still use a packetgen to shoot
> at the box (netperf is a wimp), send it to
Benjamin LaHaise writes:
> Instead of doing a completely separate skb reuse path, what happens if
> you remove the memset() from __alloc_skb() and instead do it in a slab
> ctor? I remember seeing that in the profiles for af_unix. Dave, could
> you refresh my memory why the slab ctor end
Hello!
We discussed the reuse of skb's some time ago.
Below is some code to examine how skb's can be reused if the upper layer (RX softirq)
can consume the skb, so that we can detect and reuse the skb within the NAPI path. It
can give new possibilities for TCP optimization (davem), driver common copybreak
etc.
Luiz Fernando Capitulino writes:
> [PATCH 00/02] pktgen: Ports thread list to Kernel list implementation.
> [PATCH 00/03] pktgen: Fix kernel_thread() fail leak.
> [PATCH 00/04] pktgen: Fix Initialization fail leak.
>
> But I'm sending these patches first, just to know if I'm doing som
Aritz Bastida writes:
> I need to use pktgen for sending packets at very high speed to another
> machine, in order to test it under heavy network traffic. All my
> previous injection test were done with a dual Pentium III 800 MHz. As
> I needed a more powerful machine I got a Pentium 4 but th
David S. Miller writes:
> I think a new variant of netif_receive_skb() might be needed or
> maybe not. I don't see a need for a new ->poll() for example.
Yes poll should be fine as-is; for netif_receive_skb we have to see.
> On the other hand, nobody checks the return value from
> netif
jamal writes:
> Essentially the approach would be the same as Robert's old recycle patch
> where he doesnt recycle certain skbs - the only difference being in the
> case of forwarding, the recycle is done asynchronously at EOT whereas
> this is done synchronously upon return from host path.
jamal writes:
>
> Ok, this makes things more interesting
> What worked for a XEON doesnt work the same way for an opteron.
>
> For me, the copybreak (in its capacity as adding extra cycles that make
> the prefetch look good) made things look good. Also, #125 gave a best
> answer. Non