Strange behavior of packet scheduling in ipfw3

2010-10-05 Thread Николай Дмуха
Hello!

The system is:
FreeBSD mysystem 8.0-STABLE-201005 FreeBSD 8.0-STABLE-201005 #0: Wed Jul 28
12:04:29 MSD 2010 r...@mysystem:/usr/src/sys/amd64/compile/MYKERNEL
amd64

We use the "ipfw3" firewall from Luigi Rizzo with packet scheduling.
Below is part of the firewall config (the tariff with 1Mbit/s speed, as an
example; the rules for the other speeds are the same):
$IPFW pipe 11 config bw 1040Kbit/s mask dst-ip 0x
$IPFW pipe 12 config bw 1040Kbit/s mask src-ip 0x
pipe 11
$IPFW sched 11 config type QFQ mask dst-ip 0xff00
$IPFW queue 111 config sched 11 weight 10
$IPFW queue 112 config sched 11 weight 8
$IPFW queue 113 config sched 11 weight 4
$IPFW queue 114 config sched 11 weight 1
$IPFW add queue 111 ip from any to table\(10\) via igb0 out proto udp src-port 5060
$IPFW add queue 112 ip from any to table\(10\) via igb0 out proto tcp src-port 80,443,8080
$IPFW add queue 113 ip from any to table\(10\) via igb0 out proto tcp src-port 5223, 2009, 2106, 3724, 6112, 6881-6999, , 27000-27050, 42292
$IPFW add queue 113 ip from any to table\(10\) via igb0 out proto icmp
$IPFW add queue 114 ip from any to table\(10\) via igb0 out
$IPFW add queue 111 ip from any to table\(10\) via igb2 out proto udp src-port 5060
$IPFW add queue 112 ip from any to table\(10\) via igb2 out proto tcp src-port 80,443,8080
$IPFW add queue 113 ip from any to table\(10\) via igb2 out proto tcp src-port 5223, 2009, 2106, 3724, 6112, 6881-6999, , 27000-27050, 42292
$IPFW add queue 113 ip from any to table\(10\) via igb2 out proto icmp
$IPFW add queue 114 ip from any to table\(10\) via igb2 out
pipe 12
$IPFW sched 12 config type QFQ mask src-ip 0xff00
$IPFW queue 121 config sched 12 weight 10
$IPFW queue 122 config sched 12 weight 8
$IPFW queue 123 config sched 12 weight 4
$IPFW queue 124 config sched 12 weight 1
$IPFW add queue 1210 ip from table\(11\) to any via igb1 out proto udp dst-port 5060
$IPFW add queue 122 ip from table\(11\) to any via igb1 out proto tcp dst-port 80,443,8080
$IPFW add queue 123 ip from table\(11\) to any via igb1 out proto tcp dst-port 5223, 2009, 2106, 3724, 6112, 6881-6999, , 27000-27050, 42292
$IPFW add queue 123 ip from table\(11\) to any via igb1 out proto icmp
$IPFW add queue 124 ip from table\(11\) to any via igb1 out
$IPFW add queue 121 ip from table\(11\) to any via igb3 out proto udp dst-port 5060
$IPFW add queue 122 ip from table\(11\) to any via igb3 out proto tcp dst-port 80,443,8080
$IPFW add queue 123 ip from table\(11\) to any via igb3 out proto tcp dst-port 5223, 2009, 2106, 3724, 6112, 6881-6999, , 27000-27050, 42292
$IPFW add queue 123 ip from table\(11\) to any via igb3 out proto icmp
$IPFW add queue 124 ip from table\(11\) to any via igb3 out
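
For reference, the queue weights in this config turn into bandwidth shares only while the queues are backlogged. Here is a minimal sketch of that arithmetic, assuming the textbook proportional-share formula (share = bw * weight / sum of weights) that QFQ approximates. The names queue_weights and ideal_share_bps are made up for illustration; this is not dummynet code.

```c
#include <assert.h>
#include <stddef.h>

/* Weights of queues 111..114 as configured on "sched 11". */
static const unsigned int queue_weights[] = { 10, 8, 4, 1 };
#define NQUEUES (sizeof(queue_weights) / sizeof(queue_weights[0]))

/*
 * Ideal share in bit/s of queue `idx` when every queue in `weights`
 * is backlogged. Assumption: plain proportional sharing; a real
 * scheduler only approximates this and gives idle queues' bandwidth
 * to the busy ones.
 */
static double
ideal_share_bps(const unsigned int *weights, size_t n, size_t idx,
    double pipe_bps)
{
	unsigned int total = 0;
	size_t i;

	assert(weights != NULL && idx < n);
	for (i = 0; i < n; i++)
		total += weights[i];
	assert(total > 0);
	return (pipe_bps * (double)weights[idx] / (double)total);
}
```

With all four queues of one subscriber busy on the 1040Kbit/s pipe, this gives roughly 452, 362, 181 and 45 Kbit/s for weights 10, 8, 4 and 1.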

At first we tested the firewall ourselves. We got no bad results and saw no
problems, or perhaps our synthetic tests simply did not show them. After that
we put this firewall into production. A few months later we received a
message from a subscriber on the 1Mbit/s tariff. He had problems with an
online game (long response delays from the server). We spent a lot of time on
this problem and finally solved it. The cause was packet scheduling:
1. we tried giving the subscriber another channel (4Mbit/s) with packet
scheduling: the problem did not occur;
2. we tried turning off packet scheduling on the 1Mbit/s channel: the
problem did not occur.

The utilization of the subscriber's channel was always 0.4Mbit/s. Yet the
traffic from this subscriber was still handled by the packet scheduling
rules. That is very strange because:
1. net.inet.ip.dummynet.io_fast=1;
2. the subscriber's channel utilization is 0.4Mbit/s.

As far as I know, with this option, this firewall config and this channel
utilization (0.4Mbit/s), traffic should bypass the pipe without packet
scheduling.

Why does the subscriber's traffic, under all these conditions, not pass
through the pipe without any delay? Why was his traffic subject to the packet
scheduling rules?
Thanks.
___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"


Re: mbuf changes

2010-10-05 Thread Karim Fodil-Lemelin

On 03/10/2010 9:13 AM, Luigi Rizzo wrote:
> On Sun, Oct 03, 2010 at 12:29:21AM +0100, Rui Paulo wrote:
>> On 2 Oct 2010, at 21:35, Juli Mallett wrote:
>>> On Sat, Oct 2, 2010 at 12:07, Rui Paulo wrote:
>>>> On 2 Oct 2010, at 16:29, Robert Watson wrote:
>>>>> On Thu, 30 Sep 2010, Julian Elischer wrote:
>>>>>> On 9/30/10 10:49 AM, Ryan Stone wrote:
>>>>>>> It's not a big thing but it would be nice to replace the m_next and
>>>>>>> m_nextpkt fields with queue.h macros.
>>>>>>
>>>>>> funny, I've never even thought of that..
>>>>>
>>>>> I have, and it's a massive change touching code all over the kernel in
>>>>> vast quantities.  While in principle it's a good idea (consistently
>>>>> avoid hand-crafted linked lists), it's something I'd discourage on the
>>>>> basis that it probably won't significantly reduce the kernel bug count,
>>>>> but will make it even harder for vendors with large local changes to
>>>>> the network stack to keep up.
>>>>
>>>> I think it could also increase the kernel bug count. Unfortunately, we
>>>> can't do this incrementally.
>>>
>>> Can't we?  What about a union, so that we can gradually convert things
>>> but keep ABI and API compatibility?  I mean, as long as we use the
>>> right queue.h type, anyway, it should be consistent?  STAILQ,
>>> presumably.
>>
>> Well, I don't have the layout of the mbuf struct offhand, but it's an idea
>> worth investigating.
>
> what is the point of refactoring part of a struct that no new code is
> touching ?
>
> I'd like to keep this discussion on the original topics,
> i.e. performance-related issues (make room to embed mtags and other
> metadata such as FIB; have flexible per-socket initial padding so
> we don't always waste 100+ bytes just because ipv6+ipsec is compiled
> in; and so on).
> Please open another thread if you want to propose cosmetics or
> code refactoring or other unrelated changes


Hi,

I will share some of the experience I had embedding mtags. Hopefully
it's relevant :)


The idea of carrying a certain amount of mbuf tags within the mbuf
structure is somewhat similar to, but imo much cleaner than, Linux's skbuff
char cb[40 - 48] (it was 40 bytes in 2.4.x ...). Now this idea is not new,
although as you know the devil is in the details...


What we did for BSD is create a container in the mbuf and extend the API
with functions we (pompously) called m_tag_fast_alloc() and
m_tag_fast_free(). This means the standard m_tag_alloc() is still
supported across the system and the old behavior is unchanged (a list of
allocated structs attached to the packet header). What's different is the
availability of a 'fast' call that directly uses the container within
the mbuf, effectively avoiding those mallocs and cache misses. I'll
explain below how we effectively support calling m_tag_delete on a
'fast' tag.


The trick to save CPU cycles was also to quickly revert back to the
standard tag mechanism if some component in the system is manipulating
the tag list by deleting elements. Effectively, m_tag_fast_free is a
NOP and fast tags are not deleted once allocated (unless m_free is
called on the mbuf, of course). When m_tag_delete is called, the container
simply becomes invalid for further 'fast tag' additions. This is not
flexible but has the merit of reducing the overall number of operations,
given that almost no components delete tags without deleting the
mbuf (loopback does, but it's a special case).
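
To make the scheme above concrete, here is a toy user-space model of it. It is only a sketch under stated assumptions: every name (toy_mbuf, toy_tag, toy_tag_fast_alloc, ...) and the fixed slot count and payload size are invented for illustration, and it is not the proprietary patch or the kernel's mbuf_tags API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define FAST_SLOTS	4	/* hypothetical container capacity */
#define TAG_PAYLOAD	16	/* hypothetical fixed payload size */

struct toy_tag {
	struct toy_tag *next;	/* hand-rolled tag list, as in the stack */
	int type;
	int is_fast;		/* 1: storage lives inside the toy_mbuf */
	unsigned char payload[TAG_PAYLOAD];
};

struct toy_mbuf {
	struct toy_tag *tags;	/* head of the tag list */
	int fast_valid;		/* cleared by the first tag delete */
	size_t fast_used;	/* slots handed out so far */
	struct toy_tag fast_slots[FAST_SLOTS];	/* embedded container */
};

static void
toy_mbuf_init(struct toy_mbuf *m)
{
	m->tags = NULL;
	m->fast_valid = 1;
	m->fast_used = 0;
}

/* Standard path: one heap allocation per tag. */
static struct toy_tag *
toy_tag_alloc(struct toy_mbuf *m, int type)
{
	struct toy_tag *t = malloc(sizeof(*t));

	if (t == NULL)
		return (NULL);
	t->type = type;
	t->is_fast = 0;
	t->next = m->tags;
	m->tags = t;
	return (t);
}

/* Fast path: use an embedded slot, else fall back to the heap. */
static struct toy_tag *
toy_tag_fast_alloc(struct toy_mbuf *m, int type)
{
	struct toy_tag *t;

	if (!m->fast_valid || m->fast_used >= FAST_SLOTS)
		return (toy_tag_alloc(m, type));
	t = &m->fast_slots[m->fast_used++];
	t->type = type;
	t->is_fast = 1;
	t->next = m->tags;
	m->tags = t;
	return (t);
}

/* Delete unlinks the tag and disables further fast allocations. */
static void
toy_tag_delete(struct toy_mbuf *m, struct toy_tag *t)
{
	struct toy_tag **pp;

	assert(t != NULL);
	for (pp = &m->tags; *pp != NULL; pp = &(*pp)->next) {
		if (*pp == t) {
			*pp = t->next;
			break;
		}
	}
	m->fast_valid = 0;
	if (!t->is_fast)	/* fast slots are never freed one by one */
		free(t);
}

/* Freeing the mbuf releases heap tags; fast slots go with the mbuf. */
static void
toy_mbuf_free(struct toy_mbuf *m)
{
	struct toy_tag *t, *next;

	for (t = m->tags; t != NULL; t = next) {
		next = t->next;
		if (!t->is_fast)
			free(t);
	}
	toy_mbuf_init(m);
}
```

In this model a component that never deletes tags pays no malloc at all for up to FAST_SLOTS tags, while the first delete makes every later toy_tag_fast_alloc() behave exactly like toy_tag_alloc().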


One last thing we did was run various operational tests to come up
with the statistically best container size. Now this is much
easier to do on a proprietary system than for a general-purpose OS, but
it's certainly possible.


Finally, we did see a speed increase for our application, and if someone is
interested I could provide a patch, although I would have to rewrite it
without the proprietary bits in it.


Best regards,

Karim.


ndis: fix ugly code

2010-10-05 Thread Paul B Mahol
Hi,

If clang had not complained, I would probably never have spotted it.

Patch attached.
diff --git a/sys/compat/ndis/subr_ntoskrnl.c b/sys/compat/ndis/subr_ntoskrnl.c
index ba1e49f..714fcd8 100644
--- a/sys/compat/ndis/subr_ntoskrnl.c
+++ b/sys/compat/ndis/subr_ntoskrnl.c
@@ -274,7 +274,6 @@ ntoskrnl_libinit()
kdpc_queue  *kq;
callout_entry   *e;
int i;
-   char            name[64];
 
mtx_init(&ntoskrnl_dispatchlock,
"ntoskrnl dispatch lock", MTX_NDIS_LOCK, MTX_DEF|MTX_RECURSE);
@@ -321,9 +320,8 @@ ntoskrnl_libinit()
 #endif
kq = kq_queues + i;
kq->kq_cpu = i;
-   sprintf(name, "Windows DPC %d", i);
error = kproc_create(ntoskrnl_dpc_thread, kq, &p,
-   RFHIGHPID, NDIS_KSTACK_PAGES, name);
+   RFHIGHPID, NDIS_KSTACK_PAGES, "Windows DPC %d", i);
if (error)
panic("failed to launch DPC thread");
}
@@ -334,9 +332,8 @@ ntoskrnl_libinit()
 
for (i = 0; i < WORKITEM_THREADS; i++) {
kq = wq_queues + i;
-   sprintf(name, "Windows Workitem %d", i);
error = kproc_create(ntoskrnl_workitem_thread, kq, &p,
-   RFHIGHPID, NDIS_KSTACK_PAGES, name);
+   RFHIGHPID, NDIS_KSTACK_PAGES, "Windows Workitem %d", i);
if (error)
panic("failed to launch workitem thread");
}
@@ -3382,7 +3379,6 @@ PsCreateSystemThread(handle, reqaccess, objattrs, phandle,
void            *thrctx;
 {
int error;
-   char            tname[128];
thread_context  *tc;
struct proc *p;
 
@@ -3393,9 +3389,8 @@ PsCreateSystemThread(handle, reqaccess, objattrs, phandle,
tc->tc_thrctx = thrctx;
tc->tc_thrfunc = thrfunc;
 
-   sprintf(tname, "windows kthread %d", ntoskrnl_kth);
error = kproc_create(ntoskrnl_thrfunc, tc, &p,
-   RFHIGHPID, NDIS_KSTACK_PAGES, tname);
+   RFHIGHPID, NDIS_KSTACK_PAGES, "Windows Kthread %d", ntoskrnl_kth);
 
if (error) {
free(tc, M_TEMP);
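
The patch works because kproc_create() takes a printf-style format plus arguments for the name, so the caller needs no scratch buffer. Below is a minimal user-space sketch of that calling convention. worker_create() and struct worker are hypothetical stand-ins, not the kernel API.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

#define WORKER_NAME_MAX	32	/* hypothetical name limit */

struct worker {
	char name[WORKER_NAME_MAX];
};

/*
 * Hypothetical creator with a printf-style name, modelled on the
 * "const char *fmt, ..." tail of kproc_create(). The callee formats
 * the name itself, with a bounded write.
 */
static int
worker_create(struct worker *w, const char *fmt, ...)
{
	va_list ap;
	int n;

	assert(w != NULL && fmt != NULL);
	va_start(ap, fmt);
	n = vsnprintf(w->name, sizeof(w->name), fmt, ap);
	va_end(ap);
	return (n < 0 ? -1 : 0);
}
```

Calling worker_create(&w, "Windows DPC %d", i) replaces the sprintf-into-a-buffer step, and it avoids passing a pre-formatted string where a format is expected, in which any '%' would be interpreted a second time.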

Re: ndis: fix ugly code

2010-10-05 Thread Julian Elischer

On 10/5/10 1:19 PM, Paul B Mahol wrote:
> Hi,
>
> If clang had not complained, I would probably never have spotted it.
>
> Patch attached.

personally I think you could use kproc_kthread_add so that a single
NDIS process had three threads.






Re: ndis: fix ugly code

2010-10-05 Thread Andrew Thompson
On 6 October 2010 09:19, Paul B Mahol wrote:
> Hi,
>
> If clang had not complained, I would probably never have spotted it.
>
> Patch attached.

Committed.


Re: ndis: fix ugly code

2010-10-05 Thread Paul B Mahol
On 10/5/10, Julian Elischer wrote:
> On 10/5/10 1:19 PM, Paul B Mahol wrote:
>> Hi,
>>
>> If clang had not complained, I would probably never have spotted it.
>>
>> Patch attached.
>
> personally I think you could use kproc_kthread_add so that a single
> NDIS process had three threads.

Patch attached. Now we have a single "ndis" kernel process with its own threads.
diff --git a/sys/compat/ndis/subr_ntoskrnl.c b/sys/compat/ndis/subr_ntoskrnl.c
index 714fcd8..eafbb7c 100644
--- a/sys/compat/ndis/subr_ntoskrnl.c
+++ b/sys/compat/ndis/subr_ntoskrnl.c
@@ -254,6 +254,7 @@ static int32_t KeDelayExecutionThread(uint8_t, uint8_t, int64_t *);
 static int32_t KeSetPriorityThread(struct thread *, int32_t);
 static void dummy(void);
 
+static struct proc *ndisproc;
 static struct mtx ntoskrnl_dispatchlock;
 static struct mtx ntoskrnl_interlock;
 static kspin_lock ntoskrnl_cancellock;
@@ -270,7 +271,7 @@ ntoskrnl_libinit()
 {
image_patch_table   *patch;
int error;
-   struct proc *p;
+   struct thread   *t;
kdpc_queue  *kq;
callout_entry   *e;
int i;
@@ -320,8 +321,9 @@ ntoskrnl_libinit()
 #endif
kq = kq_queues + i;
kq->kq_cpu = i;
-   error = kproc_create(ntoskrnl_dpc_thread, kq, &p,
-   RFHIGHPID, NDIS_KSTACK_PAGES, "Windows DPC %d", i);
+   error = kproc_kthread_add(ntoskrnl_dpc_thread, kq,
+   &ndisproc, &t, RFHIGHPID, NDIS_KSTACK_PAGES, "ndis",
+   "Windows DPC %d", i);
if (error)
panic("failed to launch DPC thread");
}
@@ -332,8 +334,9 @@ ntoskrnl_libinit()
 
for (i = 0; i < WORKITEM_THREADS; i++) {
kq = wq_queues + i;
-   error = kproc_create(ntoskrnl_workitem_thread, kq, &p,
-   RFHIGHPID, NDIS_KSTACK_PAGES, "Windows Workitem %d", i);
+   error = kproc_kthread_add(ntoskrnl_workitem_thread, kq,
+   &ndisproc, &t, RFHIGHPID, NDIS_KSTACK_PAGES, "ndis",
+   "Windows Workitem %d", i);
if (error)
panic("failed to launch workitem thread");
}
@@ -2701,7 +2704,7 @@ ntoskrnl_workitem_thread(arg)
 #if __FreeBSD_version < 502113
mtx_lock(&Giant);
 #endif
-   kproc_exit(0);
+   kthread_exit();
return; /* notreached */
 }
 
@@ -3380,7 +3383,7 @@ PsCreateSystemThread(handle, reqaccess, objattrs, phandle,
 {
int error;
thread_context  *tc;
-   struct proc *p;
+   struct thread   *t;
 
tc = malloc(sizeof(thread_context), M_TEMP, M_NOWAIT);
if (tc == NULL)
@@ -3389,15 +3392,16 @@ PsCreateSystemThread(handle, reqaccess, objattrs, phandle,
tc->tc_thrctx = thrctx;
tc->tc_thrfunc = thrfunc;
 
-   error = kproc_create(ntoskrnl_thrfunc, tc, &p,
-   RFHIGHPID, NDIS_KSTACK_PAGES, "Windows Kthread %d", ntoskrnl_kth);
+   error = kproc_kthread_add(ntoskrnl_thrfunc, tc, &ndisproc, &t,
+   RFHIGHPID, NDIS_KSTACK_PAGES, "ndis",
+   "Windows Kthread %d", ntoskrnl_kth);
 
if (error) {
free(tc, M_TEMP);
return (STATUS_INSUFFICIENT_RESOURCES);
}
 
-   *handle = p;
+   *handle = t;
ntoskrnl_kth++;
 
return (STATUS_SUCCESS);
@@ -3432,7 +3436,7 @@ PsTerminateSystemThread(status)
 #if __FreeBSD_version < 502113
mtx_lock(&Giant);
 #endif
-   kproc_exit(0);
+   kthread_exit();
return (0); /* notreached */
 }
 
@@ -3740,7 +3744,7 @@ ntoskrnl_dpc_thread(arg)
 #if __FreeBSD_version < 502113
mtx_lock(&Giant);
 #endif
-   kproc_exit(0);
+   kthread_exit();
return; /* notreached */
 }
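
The behavior this patch relies on is that kproc_kthread_add() creates the process when the process pointer it is handed is NULL, and otherwise adds a thread to the existing process. Here is a toy user-space model of that contract. All names (toy_proc, toy_thread, toy_proc_thread_add) and the fixed limits are hypothetical, and it only mirrors the bookkeeping, not real thread creation.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

#define TOY_MAX_THREADS	8	/* hypothetical limits */
#define TOY_NAME_MAX	32

struct toy_thread {
	char name[TOY_NAME_MAX];
};

struct toy_proc {
	char name[TOY_NAME_MAX];
	int nthreads;
	struct toy_thread threads[TOY_MAX_THREADS];
};

/*
 * If *procptr is NULL, create the process named `procname` first;
 * then add one thread whose name is built from `fmt` and its args.
 * Returns 0 on success, -1 on failure.
 */
static int
toy_proc_thread_add(struct toy_proc **procptr, struct toy_thread **tdptr,
    const char *procname, const char *fmt, ...)
{
	struct toy_proc *p;
	struct toy_thread *td;
	va_list ap;

	assert(procptr != NULL && tdptr != NULL);
	p = *procptr;
	if (p == NULL) {
		p = calloc(1, sizeof(*p));
		if (p == NULL)
			return (-1);
		snprintf(p->name, sizeof(p->name), "%s", procname);
		*procptr = p;	/* later calls reuse this process */
	}
	if (p->nthreads >= TOY_MAX_THREADS)
		return (-1);
	td = &p->threads[p->nthreads++];
	va_start(ap, fmt);
	vsnprintf(td->name, sizeof(td->name), fmt, ap);
	va_end(ap);
	*tdptr = td;
	return (0);
}
```

This is why the patch keeps a single static ndisproc pointer: the first call fills it in, and every later DPC, workitem or kthread call lands in the same "ndis" process.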
 

Re: ndis: fix ugly code

2010-10-05 Thread Julian Elischer

On 10/5/10 5:27 PM, Paul B Mahol wrote:
> On 10/5/10, Julian Elischer wrote:
>> On 10/5/10 1:19 PM, Paul B Mahol wrote:
>>> Hi,
>>>
>>> If clang had not complained, I would probably never have spotted it.
>>>
>>> Patch attached.
>>
>> personally I think you could use kproc_kthread_add so that a single
>> NDIS process had three threads.
>
> Patch attached. Now we have a single "ndis" kernel process with its own threads.

I don't know how ndis works. Is it possible that each ndis driver
would have its own process? Or would each instance?
I don't even know if it's possible to run two different ndis drivers
in the same kernel.
If that were the case we'd want to have a different name for each one
so you can tell them apart,
but I just don't know enough about it.




Re: ndis: fix ugly code

2010-10-05 Thread Paul B Mahol
On 10/6/10, Julian Elischer wrote:
> On 10/5/10 5:27 PM, Paul B Mahol wrote:
>> On 10/5/10, Julian Elischer wrote:
>>> On 10/5/10 1:19 PM, Paul B Mahol wrote:
>>>> Hi,
>>>>
>>>> If clang had not complained, I would probably never have spotted it.
>>>>
>>>> Patch attached.
>>> personally I think you could use kproc_kthread_add so that a single
>>> NDIS process had three threads.
>> Patch attached. Now we have a single "ndis" kernel process with its own threads.
> I don't know how ndis works. Is it possible that each ndis driver
> would have its own process? Or would each instance?
> I don't even know if it's possible to run two different ndis drivers
> in the same kernel.
> If that were the case we'd want to have a different name for each one
> so you can tell them apart,
> but I just don't know enough about it.

Nothing has changed in functionality. We are just using kernel threads
instead of kernel processes.