IPv6 privacy extensions break Kerberos

2013-09-22 Thread Martin Laabs
Hi,

I noticed that Kerberos stops working when the IPv6 privacy extensions are enabled.
This is caused by the changing outgoing IP address, which no longer matches the DNS
name (or has no DNS record at all). So every host that enables the privacy extensions
will be unable to use Kerberos and Kerberos-enabled services such as NFS.
This is very problematic behavior, and I would like to know if there is a
way to work around it.
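
A minimal sketch of one knob involved, assuming the goal is to keep generating
temporary addresses but stop preferring them as the source for outgoing
connections (equivalent to setting the net.inet6.ip6.prefer_tempaddr sysctl to 0).
This only illustrates the sysctl interface; it does not fix the underlying DNS
mismatch:

/* Prefer the stable IPv6 address over temporary ones.
 * Equivalent to "sysctl net.inet6.ip6.prefer_tempaddr=0"; needs root. */
#include <sys/types.h>
#include <sys/sysctl.h>
#include <stdio.h>

int
main(void)
{
        int zero = 0, cur;
        size_t len = sizeof(cur);

        /* Read the current value and set it to 0 in one call. */
        if (sysctlbyname("net.inet6.ip6.prefer_tempaddr",
            &cur, &len, &zero, sizeof(zero)) == -1) {
                perror("sysctlbyname");
                return (1);
        }
        printf("prefer_tempaddr: %d -> 0\n", cur);
        return (0);
}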

Thank you,
 Martin Laabs



Exposing sysctls for ixgbe

2013-09-22 Thread hiren panchasara
$ sysctl hw.igb
hw.igb.rxd: 4096
hw.igb.txd: 4096
hw.igb.enable_aim: 1
hw.igb.enable_msix: 1
hw.igb.max_interrupt_rate: 8000
hw.igb.buf_ring_size: 4096
hw.igb.header_split: 0
hw.igb.num_queues: 1
hw.igb.rx_process_limit: 100
$ sysctl hw.ix
sysctl: unknown oid 'hw.ix': No such file or directory

I thought it would be nice to have these things exposed. So I copied them
from igb:
http://people.freebsd.org/~hiren/ixgbe_sysctls.txt
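
For reference, the usual pattern for exposing such a loader tunable as a
read-only sysctl under a new hw.ix node looks roughly like the sketch below;
the name and default value are illustrative, not copied from the actual diff
(assumes <sys/param.h>, <sys/kernel.h> and <sys/sysctl.h> are included):

/* Sketch of exposing an ixgbe tunable under hw.ix (illustrative values). */
static SYSCTL_NODE(_hw, OID_AUTO, ix, CTLFLAG_RD, 0,
    "IXGBE driver parameters");

static int ixgbe_rxd = 2048;            /* hypothetical default */
TUNABLE_INT("hw.ix.rxd", &ixgbe_rxd);   /* picked up from loader.conf */
SYSCTL_INT(_hw_ix, OID_AUTO, rxd, CTLFLAG_RDTUN,
    &ixgbe_rxd, 0, "Number of receive descriptors per ring");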

The change to if_igb.c makes "hw.igb.num_queues" report the correct auto-tuned value
for a running system, which is not the case right now.

Thanks to markj@ for help/pointers.

Please let me know if the diffs look okay.

cheers,
Hiren


Re: Network stack changes

2013-09-22 Thread Alexander V. Chernikov

On 29.08.2013 02:24, Andre Oppermann wrote:

On 28.08.2013 20:30, Alexander V. Chernikov wrote:

Hello list!


Hello Alexander,

Hello Andre!
I'm very sorry to answer so late.


you sent quite a few things in the same email.  I'll try to respond
as much as I can right now.  Later you should split it up to have
more in-depth discussions on the individual parts.

If you could make it to the EuroBSDcon 2013 DevSummit that would be
even more awesome.  Most of the active network stack people will be
there too.
I've sent a presentation describing nearly the same things to devsummit@,
so I hope this can be discussed in the networking group.

I hope to attend DevSummit & EuroBSDcon.


There are constantly recurring discussions related to networking
stack performance/changes.


I'll try to summarize current problems and possible solutions from my point of view.
(Generally this is one problem: the stack is slooow, but we need to know why and what to do.)


Compared to others it's not thaaat slow. ;)


Let's start with current IPv4 packet flow on a typical router:
http://static.ipfw.ru/images/freebsd_ipv4_flow.png

(I'm sorry I can't provide this as text since Visio doesn't have any
'ascii-art' exporter.)


Note that we are using a process-to-completion model, i.e. we process any packet
in the ISR until it is either consumed by the L4+ stack, dropped, or put on an
egress NIC queue.

(There is also a deferred ISR model implemented inside netisr, but it does not change much:
it can help to do more fine-grained hashing (for GRE or other similar traffic), but
1) it uses per-packet mutex locking, which kills all performance;
2) it currently does not have _any_ hashing functions (see the absence of flags in `netstat -Q`).
People using http://static.ipfw.ru/patches/netisr_ip_flowid.diff (or a modified PPPoE/GRE version)
report some benefit, but without fixing (1) it can't help much.)

So, let's start:

1) Ixgbe uses a mutex to protect each RX ring, which is perfectly fine since there
is nearly no contention (the only thing that can happen is driver reconfiguration,
which is rare and, more significant, we do this once for the whole batch of packets
received in a given interrupt). However, due to some (im)possible deadlocks, the
current code does a per-packet ring unlock/lock (see ixgbe_rx_input()).
There was a discussion that ended with nothing:
http://lists.freebsd.org/pipermail/freebsd-net/2012-October/033520.html
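
A simplified sketch of the pattern being criticized versus a batched alternative;
this is not the actual ixgbe code, and ixgbe_next_rx_mbuf()/ixgbe_collect_rx_mbufs()
are made-up helper names:

/* Current shape (simplified): the ring lock is dropped and retaken per packet. */
while ((m = ixgbe_next_rx_mbuf(rxr)) != NULL) {
        IXGBE_RX_UNLOCK(rxr);           /* released for every single packet... */
        (*ifp->if_input)(ifp, m);       /* ...so the stack runs unlocked */
        IXGBE_RX_LOCK(rxr);
}

/* Batched alternative: collect under the lock once, then hand the batch up. */
IXGBE_RX_LOCK(rxr);
n = ixgbe_collect_rx_mbufs(rxr, batch, MAX_BATCH);
IXGBE_RX_UNLOCK(rxr);
for (i = 0; i < n; i++)
        (*ifp->if_input)(ifp, batch[i]);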

1*) Possible BPF users. Here we take one rlock if there are any readers present
(and a mutex for any matching packets), but this is more or less OK. Additionally,
there is WIP to implement multiqueue BPF, and there is a chance that we can reduce
lock contention there.


Rlock to rmlock?

Yes, probably.



There is also an "optimize_writers" hack permitting applications
like CDP to use BPF as writers without registering them as receivers
(which would imply the rlock).


I believe longer term we should solve this with a protocol type 
"ethernet"

so that one can send/receive ethernet frames through a normal socket.

Yes. AF_LINK or any similar.


2/3) Virtual interfaces (laggs, vlans over lagg, and other similar constructions).
Currently we simply take an rlock to do s/ix0/lagg0/ and, what is much funnier, we
use a complex vlan_hash with another rlock to get the vlan interface from the
underlying one.

This is definitely not how things should be done, and it can be changed more or less easily.
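
To make the intent concrete, a hedged sketch of the kind of data-plane shortcut
meant here: the parent keeps a per-tag array of vlan ifnets so the RX path resolves
the vlan interface with one indexed load instead of walking vlan_hash under a
separate rlock. Structure and function names are illustrative, not the actual
if_vlan code:

/* Illustrative data-plane lookup: the parent caches vlan ifnets by tag. */
struct vlan_trunk {
        struct ifnet    *vlans[4096];   /* tag -> vlan ifnet, NULL if unused */
};

static inline struct ifnet *
vlan_rx_dispatch(struct vlan_trunk *trunk, uint16_t tag)
{
        /* Single dependent load; updates happen on the control plane only. */
        return (trunk->vlans[tag & 0x0fff]);
}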


Indeed.

There are some useful terms/techniques in the world of software/hardware routing:
they have a clear 'control plane' and 'data plane' separation.
The former deals with control traffic (IGP, MLD, IGMP snooping, lagg hellos,
ARP/NDP, etc.) and with some data traffic (packets with TTL=1, with options,
destined to hosts without an ARP/NDP record, and similar). The latter is done in
hardware (or in an efficient software implementation).
The control plane is responsible for providing the data needed for efficient data
plane operations. This is the point we are missing nearly everywhere.


ACK.

What I want to say is: lagg is pure control-plane stuff and vlan is nearly the same.
We can't apply this approach to complex cases like
lagg-over-vlans-over-vlans-over-(pppoe_ng0-and_wifi0),
but we definitely can do this for the most common setups like igb* or ix* in a lagg,
with or without vlans on top of the lagg.


ACK.

We already have some capabilities like VLANHWFILTER/VLANHWTAG, and we can add some
more. We even have per-driver hooks to program HW filtering.


We could.  Though for vlan it looks like it would be easier to remove the
hardware vlan tag stripping and insertion.  It only adds complexity in all
drivers for no gain.

No. Actually, as far as I understand, it helps the driver perform TSO.
Anyway, IMO we should use HW capabilities if we can.
(This probably does not add much speed on 1G, but on 10/20/40G it can help much more.)


One small step is to hand the packet to the vlan interface directly (P1);
proof of concept (working in production):
http://lists.freebsd.org/piperma

Re: Network stack changes

2013-09-22 Thread Alexander V. Chernikov

On 29.08.2013 05:32, Slawa Olhovchenkov wrote:

On Thu, Aug 29, 2013 at 12:24:48AM +0200, Andre Oppermann wrote:


..
while Intel DPDK claims 80 Mpps (and 6WINDGate talks about 160 or so) on the
same class of hardware, with _userland_ forwarding.

Those numbers sound a bit far out.  Maybe if the packet isn't touched
or looked at at all in a pure netmap interface to interface bridging
scenario.  I don't believe these numbers.

80 Mpps * 64 bytes * 8 bits = 40.96 Gb/s.
Maybe DCA? And using a CPU with 40 PCIe lanes and 4 memory channels.
Intel introduced DDIO instead of DCA:
http://www.intel.com/content/www/us/en/io/direct-data-i-o.html

(and it seems DCA does not help much):
https://www.myricom.com/software/myri10ge/790-how-do-i-enable-intel-direct-cache-access-dca-with-the-linux-myri10ge-driver.html
https://www.myricom.com/software/myri10ge/783-how-do-i-get-the-best-performance-with-my-myri-10g-network-adapters-on-a-host-that-supports-intel-data-direct-i-o-ddio.html

(However, the DPDK paper notes that DDIO is a significant helper.)


Re: Network stack changes

2013-09-22 Thread Alexander V. Chernikov

On 29.08.2013 15:49, Adrian Chadd wrote:

Hi,

Hello Adrian!
I'm very sorry for the looong reply.



There's a lot of good stuff to review here, thanks!

Yes, the ixgbe RX lock needs to die in a fire. It's kinda pointless to 
keep locking things like that on a per-packet basis. We should be able 
to do this in a cleaner way - we can defer RX into a CPU pinned 
taskqueue and convert the interrupt handler to a fast handler that 
just schedules that taskqueue. We can ignore the ithread entirely here.
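
A hedged sketch of the shape described above: a fast interrupt filter that only
schedules a taskqueue, with the RX work done in the task. The softc layout and
names are made up, and pinning the taskqueue thread to a CPU is left out:

static void rx_task_fn(void *arg, int pending);  /* does the actual RX processing */

/* Fast interrupt filter: no work in interrupt context, just kick the taskqueue. */
static int
rx_intr_filter(void *arg)
{
        struct my_softc *sc = arg;              /* hypothetical softc */

        taskqueue_enqueue(sc->rx_tq, &sc->rx_task);
        return (FILTER_HANDLED);
}

/* Attach-time setup (simplified): filter only, no ithread handler. */
TASK_INIT(&sc->rx_task, 0, rx_task_fn, sc);
sc->rx_tq = taskqueue_create_fast("rx_tq", M_NOWAIT,
    taskqueue_thread_enqueue, &sc->rx_tq);
taskqueue_start_threads(&sc->rx_tq, 1, PI_NET, "%s rxq",
    device_get_nameunit(sc->dev));
bus_setup_intr(sc->dev, sc->irq_res, INTR_TYPE_NET | INTR_MPSAFE,
    rx_intr_filter, NULL, sc, &sc->intr_tag);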


What do you think?
Well, it sounds good :) But performance numbers and Jack's opinion are more
important :)


Are you going to Malta?


Totally pie in the sky handwaving at this point:

* create an array of mbuf pointers for completed mbufs;
* populate the mbuf array;
* pass the array up to ether_demux().

For vlan handling, it may end up populating its own list of mbufs to
push up to ether_demux(). So maybe we should extend the API to have a
bitmap of packets to actually handle from the array, so we can pass up
a larger array of mbufs, note which ones are for the destination, and
then the upcall can mark which frames it has consumed.
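
A hedged sketch of what such a batched API could look like; ether_demux_batch()
and struct mbuf_batch are inventions for illustration, not an existing interface,
and the body still calls the regular per-packet ether_demux() internally:

/* Hypothetical batched demux entry point: the caller passes an array of
 * completed mbufs plus a bitmap of which slots it wants handled; the
 * callee sets bits in 'consumed' for the frames it took ownership of. */
struct mbuf_batch {
        struct mbuf     *pkts[64];
        uint64_t         want;          /* bit i set => pkts[i] is for us */
        uint64_t         consumed;      /* filled in by the callee        */
        int              count;
};

void
ether_demux_batch(struct ifnet *ifp, struct mbuf_batch *b)
{
        int i;

        for (i = 0; i < b->count; i++) {
                if ((b->want & (1ULL << i)) == 0)
                        continue;
                ether_demux(ifp, b->pkts[i]);   /* per-packet path for now */
                b->consumed |= 1ULL << i;
        }
}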


I specifically wonder how much work/benefit we may see by doing:

* batching packets into lists so various steps can batch process 
things rather than run to completion;
* batching the processing of a list of frames under a single lock 
instance - eg, if the forwarding code could do the forwarding lookup 
for 'n' packets under a single lock, then pass that list of frames up 
to inet_pfil_hook() to do the work under one lock, etc, etc.
I'm thinking the same way, but we're stuck with the 'forwarding lookup' due to the
problem with the egress interface pointer, as I mentioned earlier. However, it is
interesting to see how much it helps, regardless of locking.


Currently I'm thinking that we should try to change the radix code to something
different (it seems that this can be evaluated quickly) and see what happens.
Luigi's performance numbers for our radix are too awful, and there is a
patch implementing an alternative trie:

http://info.iet.unipi.it/~luigi/papers/20120601-dxr.pdf
http://www.nxlab.fer.hr/dxr/stable_8_20120824.diff




Here, the processing would look less like "grab lock and process to 
completion" and more like "mark and sweep" - ie, we have a list of 
frames that we mark as needing processing and mark as having been 
processed at each layer, so we know where to next dispatch them.


I still have some tool coding to do with PMC before I even think about 
tinkering with this as I'd like to measure stuff like per-packet 
latency as well as top-level processing overhead (ie, 
CPU_CLK_UNHALTED.THREAD_P / lagg0 TX bytes/pkts, RX bytes/pkts, NIC 
interrupts on that core, etc.)

That will be great to see!


Thanks,



-adrian





Re: kern/182297: [cm] ArcNet driver fails to detect the link address - and does not work at all

2013-09-22 Thread linimon
Old Synopsis: ArcNet driver fails to detect the link address - and does not 
work at all
New Synopsis: [cm] ArcNet driver fails to detect the link address - and does 
not work at all

Responsible-Changed-From-To: freebsd-i386->freebsd-net
Responsible-Changed-By: linimon
Responsible-Changed-When: Sun Sep 22 22:03:01 UTC 2013
Responsible-Changed-Why: 
Reclassify and assign.

Note to submitter: I haven't heard of any ArcNet cards in a long time.
Unfortunately, you may be the only person who is in a position to debug
and fix this.

http://www.freebsd.org/cgi/query-pr.cgi?pr=182297


Re: Network stack changes

2013-09-22 Thread Slawa Olhovchenkov
On Mon, Sep 23, 2013 at 12:01:17AM +0400, Alexander V. Chernikov wrote:

> On 29.08.2013 05:32, Slawa Olhovchenkov wrote:
> > On Thu, Aug 29, 2013 at 12:24:48AM +0200, Andre Oppermann wrote:
> >
> >>> ..
> >>> while Intel DPDK claims 80MPPS (and 6windgate talks about 160 or so) on 
> >>> the same-class hardware and
> >>> _userland_ forwarding.
> >> Those numbers sound a bit far out.  Maybe if the packet isn't touched
> >> or looked at at all in a pure netmap interface to interface bridging
> >> scenario.  I don't believe these numbers.
> > 80*64*8 = 40.960 Gb/s
> > May be DCA? And use CPU with 40 PCIe lane and 4 memory chanell.
> Intel introduces DDIO instead of DCA: 
> http://www.intel.com/content/www/us/en/io/direct-data-i-o.html
> (and it seems DCA does not help much):
> https://www.myricom.com/software/myri10ge/790-how-do-i-enable-intel-direct-cache-access-dca-with-the-linux-myri10ge-driver.html
> https://www.myricom.com/software/myri10ge/783-how-do-i-get-the-best-performance-with-my-myri-10g-network-adapters-on-a-host-that-supports-intel-data-direct-i-o-ddio.html
> 
> (However, DPDK paper notes DDIO is of signifficant helpers)

Ha, the Intel paper says SMT/HT is significantly better. In the real world --
same shit.

For a network application, if the buffering needs more than the L3 cache, what
happens? Maybe some bad things...


Re: Network stack changes

2013-09-22 Thread Alexander V. Chernikov

On 14.09.2013 22:49, Olivier Cochard-Labbé wrote:

On Sat, Sep 14, 2013 at 4:28 PM, Luigi Rizzo  wrote:

IXIA ? For the timescales we need to address we don't need an IXIA,
a netmap sender is more than enough


The great netmap generates only one IP flow (same src/dst IP and same
src/dst port).
This doesn't permit testing a multi-queue NIC (or an SMP packet filter) on a
simple lab like this:
netmap sender => freebsd router => netmap receiver
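
As a rough illustration of what a multi-flow generator has to do so that RSS
spreads the load across queues, the sketch below rotates the UDP source port per
packet; this is not netmap's pkt-gen code, just the general idea:

#include <sys/types.h>
#include <stdint.h>
#include <netinet/in.h>
#include <netinet/udp.h>
#include <arpa/inet.h>

/* Rotate the source port over 'nflows' values so the NIC's RSS hash
 * distributes the generated traffic across its RX queues. */
static void
set_flow(struct udphdr *uh, unsigned long seq, unsigned int nflows)
{
        uh->uh_sport = htons(10000 + (uint16_t)(seq % nflows));
        uh->uh_dport = htons(9);        /* discard service */
}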
I've got a variant which is capable of doing line-rate pcap replays on a
single queue.

(However, this is true for small pcaps only.)


Regards,

Olivier




Programmatically forwarding packets to outgoing interface

2013-09-22 Thread Ihsan Junaidi Ibrahim
Hi folks,

I'm trying to learn building a VPN-type application on FreeBSD, and I'm
currently stuck at trying to route packets to the outgoing interface.

I've managed to push/pop IP packets in a tun(4) interface, but now that I can
read the inner packet header, I need to route the payload out of the box. I'm
not quite sure which API I need to use to achieve this.

The inner packets can be either IPv4 or IPv6.

Thanks.
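
For the IPv4 case, one common approach (shown here only as a hedged sketch) is to
reinject the decapsulated packet through a raw socket with IP_HDRINCL and let the
kernel's routing table pick the outgoing interface; the IPv6 side needs a
different mechanism and is not shown. A real implementation would open the socket
once rather than per packet:

#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/in_systm.h>
#include <netinet/ip.h>
#include <string.h>
#include <unistd.h>

/* Reinject a complete IPv4 packet read from the tun(4) device.
 * The kernel routes it like any locally originated datagram.
 * Note: depending on the FreeBSD release, ip_len/ip_off may need to be
 * in host byte order for raw sockets; see ip(4) for the release in use. */
static int
reinject_ipv4(const void *pkt, size_t len)
{
        const struct ip *ip = pkt;
        struct sockaddr_in dst;
        int s, on = 1, rc;

        s = socket(AF_INET, SOCK_RAW, IPPROTO_RAW);
        if (s == -1)
                return (-1);
        setsockopt(s, IPPROTO_IP, IP_HDRINCL, &on, sizeof(on));

        memset(&dst, 0, sizeof(dst));
        dst.sin_family = AF_INET;
        dst.sin_len = sizeof(dst);
        dst.sin_addr = ip->ip_dst;      /* routing decision is made on this */

        rc = sendto(s, pkt, len, 0, (struct sockaddr *)&dst, sizeof(dst));
        close(s);
        return (rc == -1 ? -1 : 0);
}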


Re: Programmatically forwarding packets to outgoing interface

2013-09-22 Thread Julian Elischer

On 9/23/13 11:55 AM, Ihsan Junaidi Ibrahim wrote:

Hi folks,

I'm trying to learn building a VPN-type application on FreeBSD and I'm 
currently stuck at trying to route packets to outgoing interface.

I've managed to push/pop IP packets in a tun(4) interface but now that I can 
read the inner packet header, I need to route the payload out of the box. I'm 
not quite sure which API I need to use to achieve this.

The inner packets can be of either IPv4 or IPv6.

Thanks.



You can try using ipfw and its 'fwd' option to reroute packets.
I'm not sure if fwd works with IPv6; I've never tried.


Re: Network stack changes

2013-09-22 Thread Adrian Chadd
Hi!



On 22 September 2013 13:12, Alexander V. Chernikov wrote:


>  I'm thinking the same way, but we're stuck with 'forwarding lookup' due
> to problem with egress interface pointer, as I mention earlier. However it
> is interesting to see how much it helps, regardless of locking.
>
> Currently I'm thinking that we should try to change radix to something
> different (it seems that it can be checked fast) and see what happened.
> Luigi's performance numbers for our radix are too awful, and there is a
> patch implementing alternative trie:
> http://info.iet.unipi.it/~luigi/papers/20120601-dxr.pdf
> http://www.nxlab.fer.hr/dxr/stable_8_20120824.diff
>
>
So, I can make educated guesses about why this is better for forwarding
workloads. I'd like to characterize it, though. So, what's it doing that's
better? Better locking? Better caching behaviour? Fewer memory lookups? Etc.

Thanks,



-adrian


Re: Programmatically forwarding packets to outgoing interface

2013-09-22 Thread Ihsan Junaidi Ibrahim
Thanks. Is there a specific C API I can use to call this?

On Sep 23, 2013, at 12:10 PM, Julian Elischer  wrote:

> On 9/23/13 11:55 AM, Ihsan Junaidi Ibrahim wrote:
>> Hi folks,
>> 
>> I'm trying to learn building a VPN-type application on FreeBSD and I'm 
>> currently stuck at trying to route packets to outgoing interface.
>> 
>> I've managed to push/pop IP packets in a tun(4) interface but now that I can 
>> read the inner packet header, I need to route the payload out of the box. 
>> I'm not quite sure which API I need to use to achieve this.
>> 
>> The inner packets can be of either IPv4 or IPv6.
>> 
>> Thanks.
>> 
>> 
> you can try use ipfw and its 'fwd' option to reroute packets
> not sure if fwd works with ipv6.. I've never tried..



Re: Network stack changes

2013-09-22 Thread Luigi Rizzo
On Mon, Sep 23, 2013 at 6:42 AM, Adrian Chadd  wrote:

> Hi!
>
>
>
> On 22 September 2013 13:12, Alexander V. Chernikov <
> melif...@yandex-team.ru> wrote:
>
>
>>  I'm thinking the same way, but we're stuck with 'forwarding lookup' due
>> to problem with egress interface pointer, as I mention earlier. However it
>> is interesting to see how much it helps, regardless of locking.
>>
>> Currently I'm thinking that we should try to change radix to something
>> different (it seems that it can be checked fast) and see what happened.
>> Luigi's performance numbers for our radix are too awful, and there is a
>> patch implementing alternative trie:
>> http://info.iet.unipi.it/~luigi/papers/20120601-dxr.pdf
>> http://www.nxlab.fer.hr/dxr/stable_8_20120824.diff
>>
>>
> So, I can make educated guesses about why this is better for forwarding
> workloads. I'd like to characterize it though. So, what's it doing that's
> better? better locking? better caching behaviour? less memory lookups? etc.
>
>
locking affects scalability; but dxr and similar algorithms have much fewer
memory lookups, not to mention the huge memory footprint of
the freebsd radix tree code.
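
For a rough picture of why the memory-access count drops, here is a hedged sketch
of a DXR-style two-stage lookup; the structure layout is illustrative, not the
packed format from the paper or the patch:

#include <stdint.h>

struct dxr_range {
        uint16_t        start;          /* low 16 bits where this range begins */
        uint32_t        nexthop;
};

struct dxr_entry {
        uint32_t        base;           /* index of the first range for this chunk */
        uint16_t        nranges;        /* 1 => the chunk resolves directly */
};

/* Lookup: one load in the direct table indexed by the top 16 bits of the
 * destination, then (usually) a short binary search over ranges that fit
 * in a cache line or two. */
static uint32_t
dxr_lookup(const struct dxr_entry *direct, const struct dxr_range *ranges,
    uint32_t dst)
{
        const struct dxr_entry *e = &direct[dst >> 16];
        uint32_t lo = e->base, hi = e->base + e->nranges - 1;
        uint16_t key = dst & 0xffff;

        /* Find the last range whose start is <= key. */
        while (lo < hi) {
                uint32_t mid = (lo + hi + 1) / 2;
                if (ranges[mid].start <= key)
                        lo = mid;
                else
                        hi = mid - 1;
        }
        return (ranges[lo].nexthop);
}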

Anyways, I'd really encourage you to read the DXR paper; it is short
and hopefully can give you a better idea of the details (with data
supporting them) than these short notes.

cheers
luigi


Re: Network stack changes

2013-09-22 Thread Adrian Chadd
On 22 September 2013 21:52, Luigi Rizzo  wrote:


> locking affects scalability; but dxr and similar algorithms have much fewer
> memory lookups, not to mention the huge memory footprint of
> the freebsd radix tree code.
>
> Anyways i'd really encourage you to read the dxr paper, it is short
> and hopefully can give you a better idea of the details (and with data
> supporting them) than these short notes.
>
>
I read the paper. :-)

I believe it! It's not the first paper that I've read that packed a FIB
into a sensibly cacheable structure. I'm just as interested however in
making sure that we actually give people the tools to inspect this stuff
for themselves, rather than all of us hacking up something from scratch
every time we want to profile this kind of thing.

The other side of this coin is locking, and the paper didn't go into that.
Eliminating the radix tree overhead is great; now we just have to avoid
grabbing all those locks all the damned time for each frame..



-adrian