Re: Giant-free polling [PATCH]

2005-03-11 Thread Gleb Smirnoff
  Collegues,

On Tue, Mar 01, 2005 at 04:29:49PM -0800, Luigi Rizzo wrote:
L> [cc-ing [EMAIL PROTECTED] to get more opinions]

this good, that you have CC'ed net@, otherwise we would continue
working in parallel without collaboration.

Here is attached patch made in a collaboration by ru, pjd and me.

We use separate mutex for polling, and we drop it before calling the
poll handler to avoid LOR and contention on this mutex.

We have definite evidence that now idle_poll and ISR poll can run
in parallel: if we remove PRF_RUNNING flag check, we got panics
because poll handlers of many drivers (e.g. if_em) are not reentrable.

I think this patch is a step forward in direction of parallel polling
on SMP, since we now have two polling instances (ISR + idle) working
in parallel.

-- 
Totus tuus, Glebius.
GLEBIUS-RIPN GLEB-RIPE
Index: kern_poll.c
===
RCS file: /home/ncvs/src/sys/kern/kern_poll.c,v
retrieving revision 1.19
diff -u -r1.19 kern_poll.c
--- kern_poll.c 25 Feb 2005 22:07:50 -  1.19
+++ kern_poll.c 11 Mar 2005 10:49:47 -
@@ -33,6 +33,7 @@
 #include 
 #include /* needed by net/if.h   
*/
 #include 
+#include 
 
 #include /* for IFF_* flags  */
 #include /* for NETISR_POLL  
*/
@@ -144,7 +145,7 @@
 SYSCTL_INT(_kern_polling, OID_AUTO, residual_burst, CTLFLAG_RW,
&residual_burst, 0, "# of residual cycles in burst");
 
-static u_int32_t poll_handlers; /* next free entry in pr[]. */
+static u_int32_t poll_handlers;
 SYSCTL_UINT(_kern_polling, OID_AUTO, handlers, CTLFLAG_RD,
&poll_handlers, 0, "Number of registered poll handlers");
 
@@ -169,20 +170,33 @@
&idlepoll_sleeping, 0, "idlepoll is sleeping");
 
 
-#define POLL_LIST_LEN  128
 struct pollrec {
poll_handler_t  *handler;
struct ifnet*ifp;
+#definePRF_RUNNING 0x1
+#definePRF_PINNED  0x2
+   uint32_tflags;
+   LIST_ENTRY(pollrec) next;
 };
 
-static struct pollrec pr[POLL_LIST_LEN];
+#definePR_VALID(pr)((pr)->handler != NULL &&   
\
+   !((pr)->flags & PRF_RUNNING) && \
+   ((pr)->ifp->if_flags & (IFF_UP|IFF_RUNNING)) == \
+   (IFF_UP|IFF_RUNNING))
+
+static LIST_HEAD(, pollrec) poll_list = LIST_HEAD_INITIALIZER(&poll_list);
+
+static struct mtx  poll_mtx;
 
 static void
 init_device_poll(void)
 {
 
-   netisr_register(NETISR_POLL, (netisr_t *)netisr_poll, NULL, 0);
-   netisr_register(NETISR_POLLMORE, (netisr_t *)netisr_pollmore, NULL, 0);
+   mtx_init(&poll_mtx, "polling", NULL, MTX_DEF);
+   netisr_register(NETISR_POLL, (netisr_t *)netisr_poll, NULL,
+   NETISR_MPSAFE);
+   netisr_register(NETISR_POLLMORE, (netisr_t *)netisr_pollmore, NULL,
+   NETISR_MPSAFE);
 }
 SYSINIT(device_poll, SI_SUB_CLOCKS, SI_ORDER_MIDDLE, init_device_poll, NULL)
 
@@ -244,17 +258,26 @@
 void
 ether_poll(int count)
 {
-   int i;
-
-   mtx_lock(&Giant);
+   struct pollrec *pr, *pr1;
 
if (count > poll_each_burst)
count = poll_each_burst;
-   for (i = 0 ; i < poll_handlers ; i++)
-   if (pr[i].handler && (IFF_UP|IFF_RUNNING) ==
-   (pr[i].ifp->if_flags & (IFF_UP|IFF_RUNNING)) )
-   pr[i].handler(pr[i].ifp, 0, count); /* quick check */
-   mtx_unlock(&Giant);
+
+   mtx_lock(&poll_mtx);
+   LIST_FOREACH_SAFE(pr, &poll_list, next, pr1) {
+   if (PR_VALID(pr)) {
+   pr->flags |= PRF_RUNNING;
+   if (pr1 != NULL)
+   pr1->flags |= PRF_PINNED;
+   mtx_unlock(&poll_mtx);
+   pr->handler(pr->ifp, 0, count);
+   mtx_lock(&poll_mtx);
+   if (pr1 != NULL)
+   pr1->flags &= ~PRF_PINNED;
+   pr->flags &= ~PRF_RUNNING;
+   }
+   }
+   mtx_unlock(&poll_mtx);
 }
 
 /*
@@ -320,17 +343,17 @@
 
 /*
  * netisr_poll is scheduled by schednetisr when appropriate, typically once
- * per tick. It is called at splnet() so first thing to do is to upgrade to
- * splimp(), and call all registered handlers.
+ * per tick.
  */
 static void
 netisr_poll(void)
 {
+   struct pollrec *pr, *pr1;
static int reg_frac_count;
-   int i, cycles;
+   int cycles;
enum poll_cmd arg = POLL_ONLY;
-   mtx_lock(&Giant);
 
+   mtx_lock(&poll_mtx);
phase = 3;
if (residual_burst == 0) { /* first call in this tick */
microuptime(&poll_start_t);
@@ -372,25 +395,36 @@
residual_burst -= cycles;
 
if (polling) {
-   for (i = 0 ; i < poll_handlers ; i++)
-   if (pr[i].handler && (IFF_UP|IFF_RUNNI

Re: FreeBSD 4.x and OS-X tcp performance

2005-03-11 Thread Noritoshi Demizu
> The server is not telling the client that a packet has been lost.
> The first two ACKs are correct duplicate ACKs, but the remaining
> ACKs coming from he server have window adjustments, so the
> client does not treat them as duplicate ACKs coming from a packet
> loss.

I made a list of ACKs with ack=4195629532.  I added differences with
their previous window sizes in parenthesises for each line. (See below)

I guess if the difference < 2 * 1448, it would be an ACK sent in reply
to an out-of-order data segment (i.e., it really is a dup ACK).
Otherwise, it would be a pure window update.  I added my guess for each
line below. There are 12 dup ACKs, which is equal to the number of data
segment beyond the lost packet.  And there are 12 window updates.

  ack 4195629532 win 5792 (-)   <- Original ACK
  ack 4195629532 win 6576 (+784)<- dup   ACK (with window update)
  ack 4195629532 win 6576 (0)   <- dup ACK
  ack 4195629532 win 7240 (+664)<- dup   ACK (with window update)
  ack 4195629532 win 10672 (+3432)  <- window update
  ack 4195629532 win 12720 (+2048)  <- dup   ACK (with window update)
  ack 4195629532 win 14768 (+2048)  <- dup   ACK (with window update)
  ack 4195629532 win 14768 (0)  <- dup ACK
  ack 4195629532 win 16816 (+2048)  <- dup   ACK (with window update)
  ack 4195629532 win 18864 (+2048)  <- dup   ACK (with window update)
  ack 4195629532 win 18864 (0)  <- dup ACK
  ack 4195629532 win 20912 (+2048)  <- dup   ACK (with window update)
  ack 4195629532 win 22960 (+2048)  <- dup   ACK (with window update)
  ack 4195629532 win 22960 (0)  <- dup ACK
  ack 4195629532 win 26032 (+3072)  <- window update
  ack 4195629532 win 29104 (+3072)  <- window update
  ack 4195629532 win 32176 (+3072)  <- window update
  ack 4195629532 win 35248 (+3072)  <- window update
  ack 4195629532 win 38320 (+3072)  <- window update
  ack 4195629532 win 41392 (+3072)  <- window update
  ack 4195629532 win 44464 (+3072)  <- window update
  ack 4195629532 win 47536 (+3072)  <- window update
  ack 4195629532 win 50608 (+3072)  <- window update
  ack 4195629532 win 53680 (+3072)  <- window update
  ack 4195629532 win 56752 (+3072)  <- window update

There are four dup ACKs whose window field is the same as its previous
ACK.  If the sender counted them as dup ACKs, the sender was able to do
FastRetransmit.  But the sender did not.  The reason would be that the
sender might clear t_dupacks when receiving a window update.
In FreeBSD current, line 1909 of tcp_input.c rev 1.268 does this.
I am not sure it is a correct procedure.

By the way, the receiver application seems to call read() with bufsize=1024.
If the bufsize was larger than or equal to 2 * 1448, e.g., 4096, window
updates would be sent independently from dup ACKs.  Since the number of
dup ACKs would be increased, the chance the receiver sends consecutive
three dup ACKs might be increaced.

Or if the receiver machine was replaced with a faster machine, the
receiver application ran faster and the receive buffer would be always
almost empty.  In this case, the number of window updates would be
decreased and the chance the receiver sends consecutive three dup ACKs
would be increaced.

These may improve the TCP performance in this particular case.

Regards,
Noritoshi Demizu


ps. In tcpdump.osx-fbsd, I saw six retransmission timeouts.
Retransmission timeout values are from 1 second to 1.5 seconds.
They occupies almost all the total time of the transfer (8.6 seconds).
Outstanding window seems less than 10 MSS, and in most cases, it is
less than 6 MSS due to advertized window size.  So, if Fast Retransmit
worked, the TCP performance would be drastically improved in this
particular case.
___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re[2]: Giant-free polling [PATCH]

2005-03-11 Thread dima
>   Collegues,
> 
> On Tue, Mar 01, 2005 at 04:29:49PM -0800, Luigi Rizzo wrote:
> L> [cc-ing [EMAIL PROTECTED] to get more opinions]
> 
> this good, that you have CC'ed net@, otherwise we would continue
> working in parallel without collaboration.

It's a pity you didn't read the complete thread,
thus the original version of my patch isn't valid anymore.

> 
> Here is attached patch made in a collaboration by ru, pjd and me.

I thought about using list also, but considered it to bring
too much overhead to the code. The original idea of handling arrays
seems to be very elegant.

> 
> We use separate mutex for polling, and we drop it before calling the
> poll handler to avoid LOR and contention on this mutex.

The recent version of the patch uses sx lock to protect pr[] and
the array of per-interface mutexes iface_locks[].
So, all CPUs can poll different interfaces at the same time now.
See the complete thread for details.

> 
> We have definite evidence that now idle_poll and ISR poll can run
> in parallel: if we remove PRF_RUNNING flag check, we got panics
> because poll handlers of many drivers (e.g. if_em) are not reentrable.
> 
> I think this patch is a step forward in direction of parallel polling
> on SMP, since we now have two polling instances (ISR + idle) working
> in parallel.

I also mentioned that poll_idle() isn't called from anywhere in the kernel.
Thus we have only 2 possible sources for polling: hardclock (per-CPU) and traps.
The polling code in trap handler should also be changed a bit to check for
the trap's source.

I attach the most recent version of the patch.
The only difference is locking in ether_poll_register() suggested by Luigi.

> 
> -- 
> Totus tuus, Glebius.
> GLEBIUS-RIPN GLEB-RIPE



kern_poll.patch
Description: Binary data
___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: Giant-free polling [PATCH]

2005-03-11 Thread Gleb Smirnoff
On Fri, Mar 11, 2005 at 04:55:25PM +0300, dima wrote:
d> I also mentioned that poll_idle() isn't called from anywhere in the kernel.
d> Thus we have only 2 possible sources for polling: hardclock (per-CPU) and 
traps.

poll_idle is a process created on startup, so it is not called from anywhere. 
But
it is source of polling.

P.S. More comments later. Thanks!

-- 
Totus tuus, Glebius.
GLEBIUS-RIPN GLEB-RIPE
___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: Giant-free polling [PATCH]

2005-03-11 Thread Pawel Jakub Dawidek
On Fri, Mar 11, 2005 at 04:55:25PM +0300, dima wrote:
+> I thought about using list also, but considered it to bring
+> too much overhead to the code. The original idea of handling arrays
+> seems to be very elegant.

Overhead? Did you run any benchmarks to prove it?
I find list-version much more elegant that using an array.

I also don't like the idea of calling handler method with two locks
held (one sx and one mutex)...

There is still an unresolved problem (in your and our patch as well) of
using ifnet structure fields without synchronization, as we don't have
access tointerface's internal mutex, which protects those fields.

-- 
Pawel Jakub Dawidek   http://www.wheel.pl
[EMAIL PROTECTED]   http://www.FreeBSD.org
FreeBSD committer Am I Evil? Yes, I Am!


pgptRhaa0bjn4.pgp
Description: PGP signature


Re: Giant-free polling [PATCH]

2005-03-11 Thread Gleb Smirnoff
On Fri, Mar 11, 2005 at 03:14:50PM +0100, Pawel Jakub Dawidek wrote:
P> On Fri, Mar 11, 2005 at 04:55:25PM +0300, dima wrote:
P> +> I thought about using list also, but considered it to bring
P> +> too much overhead to the code. The original idea of handling arrays
P> +> seems to be very elegant.
P> 
P> Overhead? Did you run any benchmarks to prove it?
P> I find list-version much more elegant that using an array.

It is also a small cookie for future. Now we have IFF_POLLING flag and
IFCAP_POLLING, which indicate whether interface support polling and whether
it actually does polling. This is not nice, from my viewpoint. I'd like
to see only IFCAP_POLLING present and turning polling on/off for particular
interface should be done by inserting/removing iface from polling list.

This will also remove an extra unlocked check of interface flags (?).

P> I also don't like the idea of calling handler method with two locks
P> held (one sx and one mutex)...

I agree with Pawel. We have LOR here between sx lock and driver lock:

normal polling: (get sx shared) -> (get driver mutex)
driver stop:(get driver mutex) -> (get sx exclusive)

We will have deadlock if this two things process in parallel.

And the per-interface mutex protects only reentrancy of interface poll
method, is that right?

P> There is still an unresolved problem (in your and our patch as well) of
P> using ifnet structure fields without synchronization, as we don't have
P> access tointerface's internal mutex, which protects those fields.

This is unresolved in our patch, too, and I believe throughout many
other places in kernel.

-- 
Totus tuus, Glebius.
GLEBIUS-RIPN GLEB-RIPE
___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re[2]: Giant-free polling [PATCH]

2005-03-11 Thread dima
> On Fri, Mar 11, 2005 at 04:55:25PM +0300, dima wrote:
> +> I thought about using list also, but considered it to bring
> +> too much overhead to the code. The original idea of handling arrays
> +> seems to be very elegant.
> 
> Overhead? Did you run any benchmarks to prove it?
> I find list-version much more elegant that using an array.

It was an assumption in fact.
So, I didn't try to implement a list-based version.
We should merge our efforts anyway and both versions
should naturally be tested and benchmarked.

> 
> I also don't like the idea of calling handler method with two locks
> held (one sx and one mutex)...

This gives the highest possible granularity though...

> 
> There is still an unresolved problem (in your and our patch as well) of
> using ifnet structure fields without synchronization, as we don't have
> access tointerface's internal mutex, which protects those fields.

I guess iface_locks[] should be removed then. We actually can get
the interface's internal mutex as ifp->ifq_mtx; I will think about
that on weekend.

> 
> -- 
> Pawel Jakub Dawidek   http://www.wheel.pl
> [EMAIL PROTECTED]   http://www.FreeBSD.org
> FreeBSD committer Am I Evil? Yes, I Am!
> 
> ATTACHMENT: application/pgp-signature
> 
___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: Giant-free polling [PATCH]

2005-03-11 Thread Gleb Smirnoff
On Fri, Mar 11, 2005 at 05:31:16PM +0300, dima wrote:
d> We should merge our efforts anyway and both versions
d> should naturally be tested and benchmarked.

Yes, I'll benchmark both patches in next 3-4 days.

-- 
Totus tuus, Glebius.
GLEBIUS-RIPN GLEB-RIPE
___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: FreeBSD 4.x and OS-X tcp performance

2005-03-11 Thread Mark Tinguely
on Fri, 11 Mar 2005 20:20:02 +0900 (JST), Noritoshi Demizu said:

>ack 4195629532 win 5792 (-)<- Original ACK
>ack 4195629532 win 6576 (+784) <- dup   ACK (with window update)
>ack 4195629532 win 6576 (0)<- dup ACK
>ack 4195629532 win 7240 (+664) <- dup   ACK (with window update)
>ack 4195629532 win 10672 (+3432)   <- window update

The Apple machine may be rate limiting their transmissions. The Apple is
sending only 2 packets per round trip time. After the packet is lost,
that gap allows the FreeBSD to consume some data before sending the 3rd
duplicate ACKs and so it has a window changes.

--Mark.
___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: Giant-free polling [PATCH]

2005-03-11 Thread Luigi Rizzo
while i am happy to see interest on having this supported, and while
I understand the excitement in firing the editor and writing code,
i don't think this approach of hacking up some patch that allows
multiple poll* instances to run without panicing the box, or
discussing the performance of lists vs arrays vs IFF_* flags will
lead to anything.

As I asked this already, consider this msg a rant, for archival
purposes, calling for a sane approach to the problem.

Polling in the UP case is beneficial because it has some features
listed e.g. in

http://docs.freebsd.org/cgi/getmsg.cgi?fetch=128950+0+/usr/local/www/db/text/2005/freebsd-net/20050306.freebsd-net

For the SMP case there are the basic issues to handle (proper locking,
preventing deadlocks) but also performance issues e.g. to avoid
that multiple polling instances just fight for the lock but do
nothing useful, see e.g. the case described in

http://docs.freebsd.org/cgi/getmsg.cgi?fetch=71980+0+/usr/local/www/db/text/2005/freebsd-net/20050306.freebsd-net


Now, if you want to approach the issue seriously, you need to:

1) _design_ a locking strategy for SMP polling -- design meaning
   provide a description in english, even a sketchy one,
   which will eventually end up in a manpage such as polling(4),
   or in the source file as a comment.
   Certainly a design is not C code or patches without comments from which
   one should figure out the overall picture.

2) evaluate how the design compares in terms of features (e.g. those listed
   in the first reference above) to the UP case. I am not
   saying one should implement all of them, but at least it must
   be clear how the two things are different, if they are.

3) evaluate how the design addresses (in terms of performance)
   the contention problems, of which the second reference above is
   probably just one but at least a very obvious one;

4) implement it -- and if you want it to be reviewed, please post
   the full code, not just differences -- the code will be terribly
   different to be human readable in terms of a diff;

5) optimise it -- e.g. deal with arrays vs lists, implicit vs
   explicit flags, etc.

It seems to me that this whole discussion is proceding from step 4
and ignoring the first 3 steps. Please, step back from your keyboards
and move to the whiteboard, if nothing else to prove me that you
have already addressed all the issues.

cheers
luigi

On Fri, Mar 11, 2005 at 02:02:34PM +0300, Gleb Smirnoff wrote:

>   Collegues,
> 
> On Tue, Mar 01, 2005 at 04:29:49PM -0800, Luigi Rizzo wrote:
> L> [cc-ing [EMAIL PROTECTED] to get more opinions]
> 
> this good, that you have CC'ed net@, otherwise we would continue
> working in parallel without collaboration.
> 
> Here is attached patch made in a collaboration by ru, pjd and me.
> 
> We use separate mutex for polling, and we drop it before calling the
> poll handler to avoid LOR and contention on this mutex.
> 
> We have definite evidence that now idle_poll and ISR poll can run
> in parallel: if we remove PRF_RUNNING flag check, we got panics
> because poll handlers of many drivers (e.g. if_em) are not reentrable.
> 
> I think this patch is a step forward in direction of parallel polling
> on SMP, since we now have two polling instances (ISR + idle) working
> in parallel.
> 
> -- 
> Totus tuus, Glebius.
> GLEBIUS-RIPN GLEB-RIPE

> Index: kern_poll.c
> ===
> RCS file: /home/ncvs/src/sys/kern/kern_poll.c,v
> retrieving revision 1.19
> diff -u -r1.19 kern_poll.c
> --- kern_poll.c   25 Feb 2005 22:07:50 -  1.19
> +++ kern_poll.c   11 Mar 2005 10:49:47 -
> @@ -33,6 +33,7 @@
>  #include 
>  #include   /* needed by net/if.h   
> */
>  #include 
> +#include 
>  
>  #include   /* for IFF_* flags  */
>  #include   /* for NETISR_POLL  
> */
> @@ -144,7 +145,7 @@
>  SYSCTL_INT(_kern_polling, OID_AUTO, residual_burst, CTLFLAG_RW,
>   &residual_burst, 0, "# of residual cycles in burst");
>  
> -static u_int32_t poll_handlers; /* next free entry in pr[]. */
> +static u_int32_t poll_handlers;
>  SYSCTL_UINT(_kern_polling, OID_AUTO, handlers, CTLFLAG_RD,
>   &poll_handlers, 0, "Number of registered poll handlers");
>  
> @@ -169,20 +170,33 @@
>   &idlepoll_sleeping, 0, "idlepoll is sleeping");
>  
>  
> -#define POLL_LIST_LEN  128
>  struct pollrec {
>   poll_handler_t  *handler;
>   struct ifnet*ifp;
> +#define  PRF_RUNNING 0x1
> +#define  PRF_PINNED  0x2
> + uint32_tflags;
> + LIST_ENTRY(pollrec) next;
>  };
>  
> -static struct pollrec pr[POLL_LIST_LEN];
> +#define  PR_VALID(pr)((pr)->handler != NULL &&   
> \
> + !((pr)->flags & PRF_RUNNING) && \
> + ((pr)->ifp->if_flags & (IFF_UP|IFF_RUNNING)) == \
> + (

Re: FreeBSD 4.x and OS-X tcp performance

2005-03-11 Thread Noritoshi Demizu
> The Apple machine may be rate limiting their transmissions.
> The Apple is sending only 2 packets per round trip time.

I think (acknowledgment number + advertized window) of ACKs sent by
the FreeBSD machine limits how much data the Mac can inject into the
network.

Regards,
Noritoshi Demizu
___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: Giant-free polling [PATCH]

2005-03-11 Thread Gleb Smirnoff
On Fri, Mar 11, 2005 at 07:44:48AM -0800, Luigi Rizzo wrote:
L> while i am happy to see interest on having this supported, and while
L> I understand the excitement in firing the editor and writing code,
L> i don't think this approach of hacking up some patch that allows
L> multiple poll* instances to run without panicing the box, or
L> discussing the performance of lists vs arrays vs IFF_* flags will
L> lead to anything.
L> 
L> As I asked this already, consider this msg a rant, for archival
L> purposes, calling for a sane approach to the problem.
L> 
L> Polling in the UP case is beneficial because it has some features
L> listed e.g. in
L> 
L> 
http://docs.freebsd.org/cgi/getmsg.cgi?fetch=128950+0+/usr/local/www/db/text/2005/freebsd-net/20050306.freebsd-net
L> 
L> For the SMP case there are the basic issues to handle (proper locking,
L> preventing deadlocks) but also performance issues e.g. to avoid
L> that multiple polling instances just fight for the lock but do
L> nothing useful, see e.g. the case described in
L> 
L> 
http://docs.freebsd.org/cgi/getmsg.cgi?fetch=71980+0+/usr/local/www/db/text/2005/freebsd-net/20050306.freebsd-net
L> 
L> 
L> Now, if you want to approach the issue seriously, you need to:
L> 
L> 1) _design_ a locking strategy for SMP polling -- design meaning
L>provide a description in english, even a sketchy one,
L>which will eventually end up in a manpage such as polling(4),
L>or in the source file as a comment.
L>Certainly a design is not C code or patches without comments from which
L>one should figure out the overall picture.
L> 
L> 2) evaluate how the design compares in terms of features (e.g. those listed
L>in the first reference above) to the UP case. I am not
L>saying one should implement all of them, but at least it must
L>be clear how the two things are different, if they are.
L> 
L> 3) evaluate how the design addresses (in terms of performance)
L>the contention problems, of which the second reference above is
L>probably just one but at least a very obvious one;
L> 
L> 4) implement it -- and if you want it to be reviewed, please post
L>the full code, not just differences -- the code will be terribly
L>different to be human readable in terms of a diff;
L> 
L> 5) optimise it -- e.g. deal with arrays vs lists, implicit vs
L>explicit flags, etc.
L> 
L> It seems to me that this whole discussion is proceding from step 4
L> and ignoring the first 3 steps. Please, step back from your keyboards
L> and move to the whiteboard, if nothing else to prove me that you
L> have already addressed all the issues.

Luigi,

at this moment we are doing step 0 - removing Giant lock from polling,
to make its locking independent from other Giant-locked subsystems.
There is no code for SMP polling yet, and we will take into account all
what you say as soon as we approach SMP polling.

In both our patches polling logic and implementation is not touched, it is
only surrounded with locking.

-- 
Totus tuus, Glebius.
GLEBIUS-RIPN GLEB-RIPE
___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


natd core dumps on FreeBSD 5.3-stable

2005-03-11 Thread hydros
Hello, All.
Natd dumping its core without any additional messages\logs exclude
this
Mar 11 22:31:54 gw kernel: pid 217 (natd), uid 0: exited on signal 11 (core 
dumped)-/var/log/messages
I`m using following configuration
FreeBSD 5.3-stable
3 interfaces: ed0 to my ISP
rl0-to my internal LAN-1 and LAN-2
LAN-2 has real ip adresses

rl1 to my LAN-3

- /etc/rc.conf --
# External interface address to my ISP
ifconfig_ed0="inet 213.x.x.38  netmask 255.255.255.252"

#Internal for my LAN-1 and LAN-2 rl0
ifconfig_rl0="inet 192.168.1.193 netmask 255.255.255.0" #LAN-1
ifconfig_rl0_alias0="inet 194.x.x.193 netmask 255.255.255.240"#LAN-2

#Internal for my LAN-3 rl1
ifconfig_rl1="inet 194.x.x.222  netmask 255.255.255.240"

defaultrouter="213.x.x.37"
hostname="gw"


moused_enable="YES"
sshd_enable="YES"
gateway_enable="YES"
firewall_enable="YES"
firewall_type="OPEN"
firewall_script="/etc/rc.firewall"
natd_enable="YES"
natd_program="/sbin/natd"
natd_interface="rl0" #i`m using internal interface
natd_flags="-reverse -unregistered_only -log"


--- ifconfig -a--
ed0: flags=108843 mtu 1500
inet 213.x.x.38 netmask 0xfffc broadcast 213.x.x.39
inet6 fe80::x:x:fede:683e%ed0 prefixlen 64 scopeid 0x1
ether 00:80:48:de:68:3e
rl0: flags=8843 mtu 1500
options=8
inet 192.168.1.193 netmask 0xff00 broadcast 192.168.1.255
inet6 fe80::x:x:x:ed02%rl0 prefixlen 64 scopeid 0x2
inet 194.x.x.193 netmask 0xfff0 broadcast 194.x.x.207
ether 00:80:48:14:ed:02
media: Ethernet autoselect (100baseTX )
status: active
rl1: flags=8843 mtu 1500
options=8
inet 194.x.x.222 netmask 0xfff0 broadcast 194.x.x.223
inet6 fe80::x:x:fe14:dfe9%rl1 prefixlen 64 scopeid 0x3
ether 00:80:48:14:df:e9
media: Ethernet autoselect (none)
status: no carrier
plip0: flags=108810 mtu 1500
lo0: flags=8049 mtu 16384
inet 127.0.0.1 netmask 0xff00
inet6 ::1 prefixlen 128
inet6 fe80::1%lo0 prefixlen 64 scopeid 0x5


 ipfw show 
00050  157   10434 divert 8668 ip from 192.168.1.0/24 to any via rl0
000550   0 divert 8668 ip from any to 192.168.1.0/24 via rl0
001000   0 allow ip from any to any via lo0
002000   0 deny ip from any to 127.0.0.0/8
003000   0 deny ip from 127.0.0.0/8 to any
65000 8533 1392765 allow ip from any to any
655350   0 allow ip from any to any
---



Feel free to answer directly to my e-mail.
-- 
Best regards,
 hydros  mailto:[EMAIL PROTECTED]

___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: Giant-free polling [PATCH]

2005-03-11 Thread Julian Elischer

Gleb Smirnoff wrote:
On Fri, Mar 11, 2005 at 03:14:50PM +0100, Pawel Jakub Dawidek wrote:
P> On Fri, Mar 11, 2005 at 04:55:25PM +0300, dima wrote:
P> +> I thought about using list also, but considered it to bring
P> +> too much overhead to the code. The original idea of handling arrays
P> +> seems to be very elegant.
P> 
P> Overhead? Did you run any benchmarks to prove it?
P> I find list-version much more elegant that using an array.

It is also a small cookie for future. Now we have IFF_POLLING flag and
IFCAP_POLLING, which indicate whether interface support polling and whether
it actually does polling. This is not nice, from my viewpoint. I'd like
to see only IFCAP_POLLING present and turning polling on/off for particular
interface should be done by inserting/removing iface from polling list.
This will also remove an extra unlocked check of interface flags (?).
P> I also don't like the idea of calling handler method with two locks
P> held (one sx and one mutex)...
I agree with Pawel. We have LOR here between sx lock and driver lock:
normal polling: (get sx shared) -> (get driver mutex)
driver stop:(get driver mutex) -> (get sx exclusive)
We will have deadlock if this two things process in parallel.
And the per-interface mutex protects only reentrancy of interface poll
method, is that right?
P> There is still an unresolved problem (in your and our patch as well) of
P> using ifnet structure fields without synchronization, as we don't have
P> access tointerface's internal mutex, which protects those fields.
 

you need to add an interface method that has access to it..
This is unresolved in our patch, too, and I believe throughout many
other places in kernel.
 

___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: Giant-free polling [PATCH]

2005-03-11 Thread Pawel Jakub Dawidek
On Fri, Mar 11, 2005 at 01:14:38PM -0800, Julian Elischer wrote:
+> >P> There is still an unresolved problem (in your and our patch as well) of
+> >P> using ifnet structure fields without synchronization, as we don't have
+> >P> access tointerface's internal mutex, which protects those fields.
+> > 
+> >
+> 
+> you need to add an interface method that has access to it..

I was thinking more about moving interface mutex into ifnet structure,
but Robert has some objections IIRC.

-- 
Pawel Jakub Dawidek   http://www.wheel.pl
[EMAIL PROTECTED]   http://www.FreeBSD.org
FreeBSD committer Am I Evil? Yes, I Am!


pgp1t7daT1OoP.pgp
Description: PGP signature


Re: Giant-free polling [PATCH]

2005-03-11 Thread Sam Leffler
Pawel Jakub Dawidek wrote:
On Fri, Mar 11, 2005 at 01:14:38PM -0800, Julian Elischer wrote:
+> >P> There is still an unresolved problem (in your and our patch as well) of
+> >P> using ifnet structure fields without synchronization, as we don't have
+> >P> access tointerface's internal mutex, which protects those fields.
+> > 
+> >
+> 
+> you need to add an interface method that has access to it..

I was thinking more about moving interface mutex into ifnet structure,
but Robert has some objections IIRC.
I don't know what Robert's objections are but I've considered doing it 
for a while to deal with some locking issues in net80211-based drivers. 
 The only issue I can see is if this mutex boxes drivers into a locking 
model that interlocks the rx+tx paths.

Sam
___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Problems stopping pptp...

2005-03-11 Thread Eric Schuele
Hello,
This was originally posted to -Questions.  I thought it would be a no 
brainer and get an immediate response.  But I haven't heard from anyone 
yet, so I thought I'd post here too.

I have some keybindings on my laptop that allow me to easily start and
stop a pptp connection to my office.  The binding are of no concern 
other than I would like to be able to start/stop pptp as a non-root user.

It looks something like:
  Control Shift V opens the vpn using
 sudo pptp x.x.x.x OFFICE
  Alt Shift V closes the connection
 sudo killall -TERM ppp
When I do this (stopping pptp) I get a pptp.core in my home dir.  In
fact no matter what I try... I allways end up with a core.
I have tried:
  # as myself
  sudo killall -TERM ppp
  sudo kill -TERM `cat /var/run/tun0.pid`
  sudo killall -TERM pptp
Also tried those as root (without sudo).
I have tried all of the above after issuing the pptp command from CLI as
root.  Still no luck
I have also tried other signals (QUIT, ABRT).
So... the question is.. How am I supposed to shut down a pptp
connection?  I would like to be able to do it with sudo, or at least
some way to bind it to keys of non-root users.
Thanks,
--
Regards,
Eric
___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "[EMAIL PROTECTED]"