> On Aug 24, 2015, at 3:25 PM, Rick Macklem <rmack...@uoguelph.ca> wrote:
>
> Daniel Braniss wrote:
>>
>>> On 24 Aug 2015, at 10:22, Hans Petter Selasky <h...@selasky.org> wrote:
>>>
>>> On 08/24/15 01:02, Rick Macklem wrote:
>>>> The other thing is the degradation seems to cut the rate by about half
>>>> each time.
>>>> 300-->150-->70 I have no idea if this helps to explain it.
>>>
>>> Might be a NUMA binding issue for the processes involved.
>>>
>>> man cpuset
>>>
>>> --HPS
>>
>> I can’t see how this is relevant, given that the same host, using the
>> mellanox/mlxen, behaves much better.
> Well, the "ix" driver has a bunch of tunables for things like "number of
> queues" and although I'll admit I don't understand how these queues are
> used, I think they are related to CPUs and their caches. There is also
> something called IXGBE_FDIR, which others have recommended be disabled.
> (The code is #ifdef IXGBE_FDIR, but I don't know if it is defined for your
> kernel?) There are also tunables for interrupt rate and something called
> hw.ixgbe_tx_process_limit, which appears to limit the number of packets
> to send or something like that?
> (I suspect Hans would understand this stuff much better than I do, since I
> don't understand it at all.;-)
>

but how does this explain the fact that, at the same time, the throughput
to the NetApp is about 70MB/s while to a FreeBSD server it’s above 150MB/s?
(window size negotiation?)
switching off TSO evens out this diff.
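
For anyone wanting to repeat the TSO on/off comparison mentioned above, here
is a minimal Python sketch of how TSO might be toggled on a FreeBSD interface
through ifconfig(8), whose tso and -tso flags enable and disable TCP
segmentation offload. The interface name ix0 and the helper tso_command are
assumptions made for illustration, not something taken from this thread. The
sketch only builds and prints the commands rather than running them, since
running them needs root.

```python
import shlex


def tso_command(interface, enable):
    """Build the ifconfig(8) argv that turns TSO on or off for one interface.

    'tso' and '-tso' are FreeBSD ifconfig flags; the interface name is
    whatever the caller passes (ix0 below is only an assumed example).
    """
    flag = "tso" if enable else "-tso"
    return ["ifconfig", interface, flag]


if __name__ == "__main__":
    # Assumed interface name; print the commands instead of executing them.
    for enable in (False, True):
        print(shlex.join(tso_command("ix0", enable)))
```

To actually apply a setting, the returned list could be passed to
subprocess.run as root, after which the same NFS write test would be rerun
against each server so the two throughput numbers can be compared.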
> At a glance, the mellanox driver looks very different.
>
>> I’m getting different results with the intel/ix depending on who is the
>> nfs server
>>
> Who knows until you figure out what is actually going on. It could just be
> the timing of handling the write RPCs or when the different servers send
> acks for the TCP segments or ... that causes this for one server and not
> another.
>
> One of the principles used when investigating airplane accidents is to
> "never assume anything" and just try to collect the facts until the pieces
> of the puzzle fall in place. I think the same principle works for this kind
> of stuff.
> I once had a case where a specific read of one NFS file would fail on
> certain machines.
> I won't bore you with the details, but after weeks we got to the point
> where we had a lab of identical machines (exactly the same hardware and
> exactly the same software loaded on them) and we could reproduce this
> problem on about half the machines and not the other half. We (myself and
> the guy I worked with) finally noticed the failing machines were on network
> ports for a given switch. We moved the net cables to another switch and the
> problem went away.
> --> This particular network switch was broken in such a way that it would
>     garble one specific packet consistently, but worked fine for everything
>     else.
> My point here is that, if someone had suggested the "network switch might
> be broken" at the beginning of investigating this, I would have probably
> dismissed it, based on "the network is working just fine", but in the end,
> that was the problem.
> --> I am not suggesting you have a broken network switch, just "don't take
>     anything off the table until you know what is actually going on".
>
> And to be honest, you may never know, but it is fun to try and solve these
> puzzles.
>
one needs to find the clues …
at the moment:
when things go bad, they stay bad
        ix/nfs/tcp/tso and NetApp
when things are ok, the numbers fluctuate, which is probably due to loads
on the system, but they are far above the 70MB/s (100 to 200)

> Beyond what I already suggested, I'd look at the "ix" driver's stats and
> tunables and see if any of the tunables has an effect. (And, yes, it will
> take time to work through these.)
>
> Good luck with it, rick
>
>>
>> danny
>>

_______________________________________________
freebsd-net@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"