Re: re0: Unknown H/W revision: 0x28000000 device_attach: re0 attach returned 6

2008-12-01 Thread Andrew

2008/12/1 Pyun YongHyeon <[EMAIL PROTECTED]>:
> On Sun, Nov 30, 2008 at 03:18:41AM +0000, Andrew Tulloch wrote:
>  > I've just installed from the FreeBSD 7.1-BETA1 iso and get the
>  > following when the re driver attempts to attach to the two onboard
>  > NICs found on a Gigabyte GA-EX58-UD5 motherboard:
>  >
>  > re0:   > Ethernet> port 0x9e00-0x9eff mem
>  > 0xfd3ff000-0xfd3f,0xfd3f8000-0xfd3fbfff irq 16 at device 0.0 on
>  > pci8
>  > re0: Chip rev. 0x2800
>  > re0: MAC rev. 0x0010
>  > re0: Unknown H/W revision: 0x2800
>  > device_attach: re0 attach returned 6
>  > pcib9:  irq 17 at device 28.5 on pci0
>  > pci9:  on pcib9
>  > re1:   > Ethernet> port 0x8e00-0x8eff mem
>  > 0xfd1ff000-0xfd1f,0xfd1f8000-0xfd1fbfff irq 17 at device 0.0 on
>  > pci9
>  > re1: Chip rev. 0x2800
>  > re1: MAC rev. 0x0010
>  > re1: Unknown H/W revision: 0x2800
>  > device_attach: re1 attach returned 6
>  >
>  > pciconf -lvc extract:
>  > [EMAIL PROTECTED]:8:0:0:  class=0x02 card=0xe0001458 
> chip=0x816810ec rev=0x03 hdr=0x00
>  > vendor = 'Realtek Semiconductor'
>  > device = 'RTL8168/8111 PCI-E Gigabit Ethernet NIC'
>  > class  = network
>  > subclass   = ethernet
>  > cap 01[40] = powerspec 3  supports D0 D1 D2 D3  current D0
>  > cap 05[50] = MSI supports 1 message, 64 bit
>  > cap 10[70] = PCI-Express 2 endpoint IRQ 0
>  > cap 11[ac] = MSI-X supports 4 messages in map 0x20
>  > cap 03[cc] = VPD
>  > [EMAIL PROTECTED]:9:0:0:  class=0x02 card=0xe0001458 
> chip=0x816810ec rev=0x03 hdr=0x00
>  > vendor = 'Realtek Semiconductor'
>  > device = 'RTL8168/8111 PCI-E Gigabit Ethernet NIC'
>  > class  = network
>  > subclass   = ethernet
>  > cap 01[40] = powerspec 3  supports D0 D1 D2 D3  current D0
>  > cap 05[50] = MSI supports 1 message, 64 bit
>  > cap 10[70] = PCI-Express 2 endpoint IRQ 0
>  > cap 11[ac] = MSI-X supports 4 messages in map 0x20
>  > cap 03[cc] = VPD
>  >
>  >
>  > Is there any simple patch I can apply to get the driver to attach,
>  > assuming it should work?
>  >
>
> This controller seems to support MSI-X with 4 messages.
> Unfortunately previous PCIe controllers from RealTek were notorious
> for MSI issues so it's hard to know this revision really works with
> MSI-X. I guess it was added to support RSS(receive-side scaling of
> MS NDIS 6.0).
> As sephe said if the controller configuration is the same as 8168C
> family, the attached patch would make re(4) work as expected.
>
> --
> Regards,
> Pyun YongHyeon
>

Pyun,
I applied the patch, but the driver did not attach at first. An entry
for this revision seemed to be missing from re_hwrevs, so I added one;
the driver then attached and seems to function (as far as a quick ping
test and make update go). The changes I made to if_re.c are attached.
If you have anything to try for MSI-X, I can probably test it.

Thanks,
Andrew
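
The failure described above ("Unknown H/W revision", attach returning 6,
which is ENXIO on FreeBSD) is what a table-driven revision lookup produces
when no entry matches. A minimal sketch of that idea follows. It is not
the if_re.c code: the struct, the function names and the first table
value are hypothetical, and only 0x28000000 is taken from the subject
line of this thread.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical revision table entry; not the real struct from if_re.c. */
struct hwrev_entry {
	uint32_t	rev;	/* masked hardware revision */
	const char	*desc;	/* human-readable description */
};

/* Illustrative table; only the second value comes from this thread. */
static const struct hwrev_entry hwrevs[] = {
	{ 0x3C000000, "example chip (made-up value)" },
	{ 0x28000000, "revision reported in this thread" },
};

/* Return the matching entry, or NULL when the revision is unknown. */
static const struct hwrev_entry *
hwrev_lookup(uint32_t rev)
{
	size_t i;

	for (i = 0; i < sizeof(hwrevs) / sizeof(hwrevs[0]); i++) {
		if (hwrevs[i].rev == rev)
			return (&hwrevs[i]);
	}
	return (NULL);
}

/* Model of an attach routine: an unknown revision fails with ENXIO. */
static int
model_attach(uint32_t rev)
{
	if (hwrev_lookup(rev) == NULL)
		return (ENXIO);
	return (0);
}
```

With the second entry removed from the table, model_attach(0x28000000)
would return ENXIO, which mirrors the behaviour seen before the extra
entry was added.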


re.patch
Description: Binary data
___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "[EMAIL PROTECTED]"

Re: ipv6 connection hash function wanted ...

2006-11-14 Thread Andrew


On Tue, 14 Nov 2006 17:09:20 +0100
 Max Laier <[EMAIL PROTECTED]> wrote:

> Hello,
>
> this one is something for people who know their math.
>
> Input: 2x128bit of address (lower ~80bit selectable by user) and
> 2x16bit of ports (more or less selectable by user).  Note that the
> "flow_id" is not useable as several broken stack implementations do
> not set it consistently - and it is user settable as well.
>
> Output: "int" hash value - by default we use the lower 8bit of it.
>
> Problems: Most of the input can be selected by a user meaning it is
> easy to produce collisions.  For legal connections, the lower 64bit
> are the one with the highest entropy - in fact the upper 64bit might
> be the same for many connections coming from/going to the same
> subnet.  This function will be used for every packet that is passed
> to a dynamic IPFW rule, so efficiency is a concern.
>
> Any ideas?  Any papers that deal with this problem?
>
> ref: sys/netinet/ip_fw2.c::hash_packet6


 Maybe the rsync algorithm is partially suitable.

Here is the description: http://samba.anu.edu.au/rsync/tech_report/

 Andrew.
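
One common way to approach the collision concern in the question is to
mix a secret seed into a cheap byte-wise hash, so that a user who picks
addresses and ports cannot predict the bucket. A minimal sketch follows,
using Bob Jenkins' one-at-a-time hash. It is neither hash_packet6 from
ip_fw2.c nor the rsync rolling checksum; all names are hypothetical, and
it assumes both directions of a connection should land in the same
bucket. One-at-a-time is not a cryptographic hash, so this only raises
the cost of producing collisions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical endpoint: one 128-bit address and one 16-bit port. */
struct endpoint6 {
	uint8_t		addr[16];
	uint16_t	port;
};

/* Jenkins one-at-a-time mixing step over a byte buffer. */
static uint32_t
oaat_mix(uint32_t h, const uint8_t *p, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++) {
		h += p[i];
		h += h << 10;
		h ^= h >> 6;
	}
	return (h);
}

/* Order endpoints by address, then by port. */
static int
endpoint_cmp(const struct endpoint6 *a, const struct endpoint6 *b)
{
	int c = memcmp(a->addr, b->addr, sizeof(a->addr));

	if (c != 0)
		return (c);
	return ((a->port > b->port) - (a->port < b->port));
}

static uint32_t
mix_endpoint(uint32_t h, const struct endpoint6 *e)
{
	uint8_t port[2];

	port[0] = (uint8_t)(e->port >> 8);
	port[1] = (uint8_t)(e->port & 0xff);
	h = oaat_mix(h, e->addr, sizeof(e->addr));
	return (oaat_mix(h, port, sizeof(port)));
}

/*
 * Hash a flow with a secret seed.  The endpoints are put into a
 * canonical order first, so swapping source and destination gives
 * the same result.  Only the low 8 bits are returned (256 buckets).
 */
static unsigned int
flow6_hash(const struct endpoint6 *a, const struct endpoint6 *b,
    uint32_t seed)
{
	const struct endpoint6 *lo = a, *hi = b;
	uint32_t h;

	assert(a != NULL && b != NULL);
	if (endpoint_cmp(a, b) > 0) {
		lo = b;
		hi = a;
	}
	h = mix_endpoint(seed, lo);
	h = mix_endpoint(h, hi);
	/* Final avalanche of the one-at-a-time hash. */
	h += h << 3;
	h ^= h >> 11;
	h += h << 15;
	return (h & 0xff);
}
```

The seed would be chosen randomly once (for example at boot) and kept
private; the per-packet cost is one pass over 36 bytes.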

Re: Packet Capturing on GWs but don't let them go out.

2002-11-22 Thread Andrew



On Fri, 22 Nov 2002, soheil soheil wrote:

> just i want that all of the packet from the sockets that are created by me
> travels through my server

Have you looked at ipfw divert?

Andrew


To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-net" in the body of the message

[patch] [1/6] sfxge: fix mbuf leak if it does not fit in software queue

2014-03-15 Thread Andrew Rybchenko



sfxge: fix mbuf leak if it does not fit in software queue

mbuf should be owned by if_transmit function in any case.

Submitted-by:   Andrew Rybchenko 
Sponsored by:   Solarflare Communications, Inc.

diff -r e2bc8f64f1b2 -r ff9f5d3dbafe src/driver/freebsd/sfxge_tx.c
--- a/head/sys/dev/sfxge/sfxge_tx.c	Tue Mar 04 13:13:05 2014 +0400
+++ b/head/sys/dev/sfxge/sfxge_tx.c	Tue Mar 04 13:15:13 2014 +0400
@@ -536,6 +536,7 @@
 	return (0);
 
 fail:
+	m_freem(m);
 	return (rc);
 	
 }
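
The ownership rule stated in the commit message can be modelled in
userland. This is only a sketch under assumptions: struct pkt stands in
for an mbuf, free() stands in for m_freem(), and all names are
hypothetical rather than taken from the driver.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Stand-in for struct mbuf; hypothetical, for illustration only. */
struct pkt {
	int	len;
};

static int pkts_freed;	/* counts frees so the behaviour is observable */

static void
pkt_free(struct pkt *p)
{
	free(p);
	pkts_freed++;
}

/*
 * Model of a transmit entry point that owns the packet in every case:
 * on success the queue keeps it, on failure it is freed before the
 * error is returned, so the caller never has to clean up.
 */
static int
model_transmit(struct pkt *p, int queue_has_room, struct pkt **queue_slot)
{
	int rc;

	assert(p != NULL && queue_slot != NULL);
	if (!queue_has_room) {
		rc = ENOBUFS;
		goto fail;
	}
	*queue_slot = p;
	return (0);

fail:
	pkt_free(p);
	return (rc);
}
```

A caller of model_transmit() must not touch the packet after the call,
whatever the return value.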

[PATCH 1/6] sfxge: fix mbuf leak if it does not fit in software queue

2014-03-18 Thread Andrew Rybchenko



sfxge: fix mbuf leak if it does not fit in software queue

mbuf should be owned by if_transmit function in any case.

Submitted-by:   Andrew Rybchenko 
Sponsored by:   Solarflare Communications, Inc.

diff -r e2bc8f64f1b2 -r ff9f5d3dbafe src/driver/freebsd/sfxge_tx.c
--- a/head/sys/dev/sfxge/sfxge_tx.c	Tue Mar 04 13:13:05 2014 +0400
+++ b/head/sys/dev/sfxge/sfxge_tx.c	Tue Mar 04 13:15:13 2014 +0400
@@ -536,6 +536,7 @@
 return (0);

 fail:
+m_freem(m);
 return (rc);

 }


[PATCH 2/6] sfxge: limit software Tx queue size

2014-03-18 Thread Andrew Rybchenko


sfxge: limit software Tx queue size

Previous implementation limits put queue size only (when Tx lock can't
be acquired), but get queue may grow unboundedly which results in mbuf
pools exhaustion and latency growth.

Submitted-by:   Andrew Rybchenko 
Sponsored by:   Solarflare Communications, Inc.

diff -r ff9f5d3dbafe -r 7632a3355224 src/driver/freebsd/sfxge_tx.c
--- a/head/sys/dev/sfxge/sfxge_tx.c	Tue Mar 04 13:15:13 2014 +0400
+++ b/head/sys/dev/sfxge/sfxge_tx.c	Wed Mar 05 09:06:01 2014 +0400
@@ -461,6 +461,9 @@

 sfxge_tx_qdpl_swizzle(txq);

+if (stdp->std_count >= SFXGE_TX_DPL_GET_PKT_LIMIT_DEFAULT)
+return ENOBUFS;
+
 *(stdp->std_getp) = mbuf;
 stdp->std_getp = &mbuf->m_nextpkt;
 stdp->std_count++;
@@ -480,7 +483,7 @@
 old_len = mp->m_pkthdr.csum_data;
 } else
 old_len = 0;
-if (old_len >= SFXGE_TX_MAX_DEFERRED)
+if (old_len >= SFXGE_TX_DPL_PUT_PKT_LIMIT_DEFAULT)
 return ENOBUFS;
 mbuf->m_pkthdr.csum_data = old_len + 1;
 mbuf->m_nextpkt = (void *)old;
@@ -507,12 +510,9 @@
  */
 locked = mtx_trylock(&txq->lock);

-/*
- * Can only fail if we weren't able to get the lock.
- */
 if (sfxge_tx_qdpl_put(txq, m, locked) != 0) {
-KASSERT(!locked,
-("sfxge_tx_qdpl_put() failed locked"));
+if (locked)
+mtx_unlock(&txq->lock);
 rc = ENOBUFS;
 goto fail;
 }
diff -r ff9f5d3dbafe -r 7632a3355224 src/driver/freebsd/sfxge_tx.h
--- a/head/sys/dev/sfxge/sfxge_tx.h	Tue Mar 04 13:15:13 2014 +0400
+++ b/head/sys/dev/sfxge/sfxge_tx.h	Wed Mar 05 09:06:01 2014 +0400
@@ -75,7 +75,8 @@
 enum sfxge_tx_buf_flags	flags;
 };

-#define SFXGE_TX_MAX_DEFERRED 64
+#define SFXGE_TX_DPL_GET_PKT_LIMIT_DEFAULT	64
+#define SFXGE_TX_DPL_PUT_PKT_LIMIT_DEFAULT	64

 /*
  * Deferred packet list.
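
The effect of the new get-list check can be shown with a small userland
model. This is a sketch only: the list type and function names are
hypothetical, and the limit of 64 simply mirrors the default defined in
the patch above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define GET_PKT_LIMIT	64	/* mirrors the default value in the patch */

/* Hypothetical list node; stands in for an mbuf linked by m_nextpkt. */
struct node {
	struct node	*next;
};

/* Singly linked list with a tail pointer and an element count. */
struct get_list {
	struct node	*head;
	struct node	**tailp;	/* address of the last next pointer */
	unsigned int	count;
};

static void
get_list_init(struct get_list *gl)
{
	gl->head = NULL;
	gl->tailp = &gl->head;
	gl->count = 0;
}

/* Append a node, refusing with ENOBUFS once the limit is reached. */
static int
get_list_append(struct get_list *gl, struct node *n)
{
	assert(gl != NULL && n != NULL);
	if (gl->count >= GET_PKT_LIMIT)
		return (ENOBUFS);

	n->next = NULL;
	*gl->tailp = n;
	gl->tailp = &n->next;
	gl->count++;
	return (0);
}
```

Without the count check the list would accept every node offered to it,
which corresponds to the unbounded growth described in the commit
message.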


[PATCH 3/6] sfxge: return error when packet is dropped because of link down

2014-03-18 Thread Andrew Rybchenko


sfxge: return error when packet is dropped because of link down

Submitted-by:   Boris Misenov 
Sponsored by:   Solarflare Communications, Inc.

diff -r d292c9f51d36 -r 53935db50f8a src/driver/freebsd/sfxge_tx.c
--- a/head/sys/dev/sfxge/sfxge_tx.c	Thu Mar 06 13:38:55 2014 +
+++ b/head/sys/dev/sfxge/sfxge_tx.c	Mon Mar 10 11:37:12 2014 +0400
@@ -589,7 +589,7 @@

 if (!SFXGE_LINK_UP(sc)) {
 m_freem(m);
-return (0);
+return (ENETDOWN);
 }

 /* Pick the desired transmit queue. */


[PATCH 4/6] sfxge: add counter for Tx errors returned from if_transmit

2014-03-18 Thread Andrew Rybchenko


sfxge: add counter for Tx errors returned from if_transmit

Submitted-by:   Boris Misenov 
Sponsored by:   Solarflare Communications, Inc.

diff -r 53935db50f8a -r af2586a023d8 src/driver/freebsd/sfxge_tx.c
--- a/head/sys/dev/sfxge/sfxge_tx.c	Mon Mar 10 11:37:12 2014 +0400
+++ b/head/sys/dev/sfxge/sfxge_tx.c	Mon Mar 10 11:37:12 2014 +0400
@@ -503,6 +503,11 @@
 int locked;
 int rc;

+if (!SFXGE_LINK_UP(txq->sc)) {
+rc = ENETDOWN;
+goto fail;
+}
+
 /*
  * Try to grab the txq lock.  If we are able to get the lock,
  * the packet will be appended to the "get list" of the deferred
@@ -537,6 +542,7 @@

 fail:
 m_freem(m);
+atomic_add_long(&txq->early_drops, 1);
 return (rc);

 }
@@ -587,11 +593,6 @@

 KASSERT(ifp->if_flags & IFF_UP, ("interface not up"));

-if (!SFXGE_LINK_UP(sc)) {
-m_freem(m);
-return (ENETDOWN);
-}
-
 /* Pick the desired transmit queue. */
 if (m->m_pkthdr.csum_flags & (CSUM_DELAY_DATA | CSUM_TSO)) {
 int index = 0;
@@ -1391,6 +1392,7 @@
 SFXGE_TX_STAT(tso_long_headers, tso_long_headers),
 SFXGE_TX_STAT(tx_collapses, collapses),
 SFXGE_TX_STAT(tx_drops, drops),
+SFXGE_TX_STAT(tx_early_drops, early_drops),
 };

 static int
diff -r 53935db50f8a -r af2586a023d8 src/driver/freebsd/sfxge_tx.h
--- a/head/sys/dev/sfxge/sfxge_tx.h	Mon Mar 10 11:37:12 2014 +0400
+++ b/head/sys/dev/sfxge/sfxge_tx.h	Mon Mar 10 11:37:12 2014 +0400
@@ -160,6 +160,7 @@
 unsigned long	tso_long_headers;
 unsigned long	collapses;
 unsigned long	drops;
+unsigned long	early_drops;

 /* The following fields change more often, and are used mostly
  * on the completion path


[PATCH 5/6] sfxge: access statistics buffers under port lock

2014-03-18 Thread Andrew Rybchenko


sfxge: access statistics buffers under port lock

Allow access to statistics data not only from sysctl handlers.

Submitted-by:   Boris Misenov 
Sponsored by:   Solarflare Communications, Inc.

diff -r af2586a023d8 -r 7f58b1a5ea60 src/driver/freebsd/sfxge_port.c
--- a/head/sys/dev/sfxge/sfxge_port.c	Mon Mar 10 11:37:12 2014 +0400
+++ b/head/sys/dev/sfxge/sfxge_port.c	Mon Mar 10 11:37:12 2014 +0400
@@ -48,7 +48,7 @@
 unsigned int count;
 int rc;

-mtx_lock(&port->lock);
+mtx_assert(&port->lock, MA_OWNED);

 if (port->init_state != SFXGE_PORT_STARTED) {
 rc = 0;
@@ -82,7 +82,6 @@

 rc = ETIMEDOUT;
 out:
-mtx_unlock(&port->lock);
 return rc;
 }

@@ -93,12 +92,16 @@
 unsigned int id = arg2;
 int rc;

+mtx_lock(&sc->port.lock);
 if ((rc = sfxge_mac_stat_update(sc)) != 0)
-return rc;
+goto out;

-return SYSCTL_OUT(req,
+rc = SYSCTL_OUT(req,
   (uint64_t *)sc->port.mac_stats.decode_buf + id,
   sizeof(uint64_t));
+out:
+mtx_unlock(&sc->port.lock);
+return rc;
 }

 static void
@@ -442,7 +445,7 @@
 unsigned int count;
 int rc;

-mtx_lock(&port->lock);
+mtx_assert(&port->lock, MA_OWNED);

 if (port->init_state != SFXGE_PORT_STARTED) {
 rc = 0;
@@ -476,7 +479,6 @@

 rc = ETIMEDOUT;
 out:
-mtx_unlock(&port->lock);
 return rc;
 }

@@ -487,12 +489,16 @@
 unsigned int id = arg2;
 int rc;

+mtx_lock(&sc->port.lock);
 if ((rc = sfxge_phy_stat_update(sc)) != 0)
-return rc;
+goto out;

-return SYSCTL_OUT(req,
+rc = SYSCTL_OUT(req,
   (uint32_t *)sc->port.phy_stats.decode_buf + id,
   sizeof(uint32_t));
+out:
+mtx_unlock(&sc->port.lock);
+return rc;
 }

 static void


[PATCH 6/6] sfxge: implement interface statistics shown by netstat

2014-03-18 Thread Andrew Rybchenko


sfxge: implement interface statistics shown by netstat

netstat directly reads interface statistics collected
in the ifnet structure members: if_ipackets, if_ierrors, if_iqdrops,
if_opackets, if_oerrors, if_collisions.
The if_oerrors counter should include both errors reported by hardware
and errors that happened in software before posting to hardware.
Since statistics are retrieved periodically, these counters could be
smaller than the counters provided by the kernel for IP addresses.
Report IFCAP_HWSTATS capability since driver manages if_ibytes.

Submitted-by:   Boris Misenov 
Sponsored by:   Solarflare Communications, Inc.

diff -r 7f58b1a5ea60 -r a3ab0749ffa3 src/driver/freebsd/sfxge.c
--- a/head/sys/dev/sfxge/sfxge.c	Mon Mar 10 11:37:12 2014 +0400
+++ b/head/sys/dev/sfxge/sfxge.c	Mon Mar 10 11:37:12 2014 +0400
@@ -60,10 +60,10 @@
 #define SFXGE_CAP (IFCAP_VLAN_MTU | \
IFCAP_HWCSUM | IFCAP_VLAN_HWCSUM | IFCAP_TSO |\
IFCAP_JUMBO_MTU | IFCAP_LRO |\
-   IFCAP_VLAN_HWTSO | IFCAP_LINKSTATE)
+   IFCAP_VLAN_HWTSO | IFCAP_LINKSTATE | IFCAP_HWSTATS)
 #define SFXGE_CAP_ENABLE SFXGE_CAP
 #define SFXGE_CAP_FIXED (IFCAP_VLAN_MTU | IFCAP_HWCSUM | IFCAP_VLAN_HWCSUM | \
- IFCAP_JUMBO_MTU | IFCAP_LINKSTATE)
+ IFCAP_JUMBO_MTU | IFCAP_LINKSTATE | IFCAP_HWSTATS)

 MALLOC_DEFINE(M_SFXGE, "sfxge", "Solarflare 10GigE driver");

@@ -274,10 +274,23 @@
 }

 static void
+sfxge_tick(void *arg)
+{
+struct sfxge_softc *sc = arg;
+
+sfxge_port_update_stats(sc);
+sfxge_tx_update_stats(sc);
+
+callout_reset(&sc->tick_callout, SFXGE_CALLOUT_TICKS, sfxge_tick, sc);
+}
+
+static void
 sfxge_ifnet_fini(struct ifnet *ifp)
 {
 struct sfxge_softc *sc = ifp->if_softc;

+callout_drain(&sc->tick_callout);
+
 sx_xlock(&sc->softc_lock);
 sfxge_stop(sc);
 sx_xunlock(&sc->softc_lock);
@@ -321,9 +334,12 @@
 mtx_init(&sc->tx_lock, "txq", NULL, MTX_DEF);
 #endif

+callout_init(&sc->tick_callout, TRUE);
+
 if ((rc = sfxge_port_ifmedia_init(sc)) != 0)
 goto fail;

+callout_reset(&sc->tick_callout, SFXGE_CALLOUT_TICKS, sfxge_tick, sc);
 return 0;

 fail:
diff -r 7f58b1a5ea60 -r a3ab0749ffa3 src/driver/freebsd/sfxge.h
--- a/head/sys/dev/sfxge/sfxge.h	Mon Mar 10 11:37:12 2014 +0400
+++ b/head/sys/dev/sfxge/sfxge.h	Mon Mar 10 11:37:12 2014 +0400
@@ -61,6 +61,9 @@
 #ifndef IFCAP_VLAN_HWTSO
 #define IFCAP_VLAN_HWTSO 0
 #endif
+#ifndef IFCAP_HWSTATS
+#define IFCAP_HWSTATS 0
+#endif
 #ifndef IFM_10G_T
 #define IFM_10G_T IFM_UNKNOWN
 #endif
@@ -100,6 +103,8 @@

 #define	SFXGE_EV_BATCH	16384

+#define	SFXGE_CALLOUT_TICKS 10
+
 struct sfxge_evq {
 struct sfxge_softc	*sc  __aligned(CACHE_LINE_SIZE);
 struct mtx	lock __aligned(CACHE_LINE_SIZE);
@@ -241,6 +246,7 @@
 #ifndef SFXGE_HAVE_MQ
 struct mtx	tx_lock __aligned(CACHE_LINE_SIZE);
 #endif
+struct callout	tick_callout;
 };

 #define SFXGE_LINK_UP(sc) ((sc)->port.link_mode != EFX_LINK_DOWN)
@@ -298,6 +304,7 @@
 efx_link_mode_t mode);
 extern int sfxge_mac_filter_set(struct sfxge_softc *sc);
 extern int sfxge_port_ifmedia_init(struct sfxge_softc *sc);
+extern void sfxge_port_update_stats(struct sfxge_softc *sc);

 #define SFXGE_MAX_MTU (9 * 1024)

diff -r 7f58b1a5ea60 -r a3ab0749ffa3 src/driver/freebsd/sfxge_port.c
--- a/head/sys/dev/sfxge/sfxge_port.c	Mon Mar 10 11:37:12 2014 +0400
+++ b/head/sys/dev/sfxge/sfxge_port.c	Mon Mar 10 11:37:12 2014 +0400
@@ -85,6 +85,40 @@
 return rc;
 }

+void
+sfxge_port_update_stats(struct sfxge_softc *sc)
+{
+struct ifnet *ifp;
+uint64_t *mac_stats;
+
+mtx_lock(&sc->port.lock);
+/* Ignore error and use old values */
+(void)sfxge_mac_stat_update(sc);
+
+ifp = sc->ifnet;
+mac_stats = (uint64_t *)sc->port.mac_stats.decode_buf;
+
+ifp->if_ipackets = mac_stats[EFX_MAC_RX_PKTS];
+ifp->if_ierrors = mac_stats[EFX_MAC_RX_ERRORS];
+ifp->if_opackets = mac_stats[EFX_MAC_TX_PKTS];
+ifp->if_oerrors = mac_stats[EFX_MAC_TX_ERRORS];
+ifp->if_collisions =
+mac_stats[EFX_MAC_TX_SGL_COL_PKTS] +
+mac_stats[EFX_MAC_TX_MULT_COL_PKTS] +
+mac_stats[EFX_MAC_TX_EX_COL_PKTS] +
+mac_stats[EFX_MAC_TX_LATE_COL_PKTS];
+ifp->if_ibytes = mac_stats[EFX_MAC_RX_OCTETS];
+ifp->if_obytes = mac_stats[EFX_MAC_TX_OCTETS];
+/* if_imcasts is maintained in net/if_ethersubr.c */
+ifp->if_omcasts =
+mac_stats[EFX_MAC_TX_MULTICST_PKTS] +
+mac_stats[EFX_MAC_TX_BRDCST_PKTS];
+/* if_iqdrops is maintained in net/if_ethersubr.c */
+/* if_noproto is maintained in net/if_ethersubr.c */
+
+mtx_unlock(&sc->port.lock);
+}
+
 static int
 sfxge_mac_stat_handler(SYSCTL_HANDLER_ARGS)
 {
diff -r 7f58b1a5ea60 -r a3ab0749ffa3 src/driver/freebsd/sfxge_tx.c
--- a/head/sys/dev/sfxge/sfxge_tx.c	Mon Mar 10 11:37:12 2014 +0400
+++ b/head/sys/dev/sfxge/sfxge_tx.c	Mon Mar 10 11:37:12 2014 +0400
@@ -1436,6 +1436,28 @@

Re: [PATCH 1/6] sfxge: fix mbuf leak if it does not fit in software queue

2014-03-18 Thread Andrew Rybchenko


Gleb,

On 03/18/2014 04:46 PM, Gleb Smirnoff wrote:

>    Andrew,
>
> On Tue, Mar 18, 2014 at 01:11:15PM +0400, Andrew Rybchenko wrote:
> A>
> A> sfxge: fix mbuf leak if it does not fit in software queue
> A>
> A> mbuf should be owned by if_transmit function in any case.
> A>
> A> Submitted-by:   Andrew Rybchenko
> A> Sponsored by:   Solarflare Communications, Inc.
>
> Can we simplify the function while here?

One of the next patches (4/6) moves the link down check into the function
and uses the "fail" label to increment the early drops statistics and free
the mbuf. IMHO, it is really nice to have a single place to do it.


Thanks,
Andrew.

--
Andrew Rybchenko
OKTET Labs, St.-Petersburg, RussiaWeb: www.oktetlabs.ru
Office: +7 812 7832191  Fax: +7 812 7846591  Mobile: +7 921 7479683


Re: [PATCH 4/6] sfxge: add counter for Tx errors returned from if_transmit

2014-03-18 Thread Andrew Rybchenko


Gleb,

On 03/18/2014 04:59 PM, Gleb Smirnoff wrote:

>   Andrew, Boris,
>
> On Tue, Mar 18, 2014 at 01:58:40PM +0400, Andrew Rybchenko wrote:
> A> sfxge: add counter for Tx errors returned from if_transmit
> A>
> A> Submitted-by:   Boris Misenov 
> A> Sponsored by:   Solarflare Communications, Inc.
>
>   I'd suggest not to use atomic(9) increment there, since it locks
> memory bus and thus has performance impact. Using ++ would be fine,
> since we probably don't care about absolute precision of this
> counter.
We think that usage of atomic here is appropriate since it is not on 
fast path. CPU has already overfilled both HW and SW Tx queues.
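
The trade-off under discussion can be illustrated outside the kernel. A
minimal userland sketch follows, using C11 <stdatomic.h> as a stand-in
for atomic(9); that substitution is an assumption made so the example
compiles anywhere, and the struct and function names are hypothetical
rather than taken from sfxge.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical per-queue statistics; not the driver's structure. */
struct txq_stats {
	atomic_ulong	early_drops;	/* exact, even with several writers */
	unsigned long	approx_drops;	/* plain counter, single writer only */
};

static void
txq_stats_init(struct txq_stats *st)
{
	atomic_init(&st->early_drops, 0);
	st->approx_drops = 0;
}

/*
 * Atomic increment: never loses an update, at the cost of a locked
 * read-modify-write.  Relaxed ordering is enough for a counter.
 */
static void
count_drop_atomic(struct txq_stats *st)
{
	atomic_fetch_add_explicit(&st->early_drops, 1,
	    memory_order_relaxed);
}

/*
 * Plain increment: cheaper, but only precise when one thread writes.
 * With concurrent writers updates can be lost.
 */
static void
count_drop_plain(struct txq_stats *st)
{
	st->approx_drops++;
}
```

Which variant is appropriate depends on how often the path runs; the
argument made above is that the drop path is taken only after the
queues have already overflowed.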



> However, if you are interested in precision and also in performance,
> I'd suggest you to convert all statistics in sfxge(4) that are shared
> between CPUs to the counter(9) framework.
Yes, we have seen the framework. It looks really good, and we would use it
for a fast path counter. Another problem is that it is available
since 10.0 only.



> More info on it can be found in manual page and some measurements
> are available here:
>
> http://lists.freebsd.org/pipermail/freebsd-arch/2013-April/014204.html
Thanks a lot. We'll have a look.


> If you insist, I can apply your patch as is. The only problem is that
> when you send patches inlined into email, your MUA mangles all TABs
> to spaces, so I can't apply your patches. If it possible, next time
> send them as attachments.
I didn't know that. I guess you have seen that I sent the first patch
twice (I'm sorry for that). The first one has the patch as an attachment,
but it is not shown in the mailman web interface and I need to download
the patch to have a look; I think that is very inconvenient. I can
include both an inline and an attached patch, but it will double the mail
size. Is there any best practice for sending patches to the mailing list?


Thanks a lot,
Andrew.


Re: [PATCH 2/6] sfxge: limit software Tx queue size

2014-03-22 Thread Andrew Rybchenko


Gleb,

On 03/18/2014 05:24 PM, Gleb Smirnoff wrote:

>    Andrew,
>
> On Tue, Mar 18, 2014 at 01:55:01PM +0400, Andrew Rybchenko wrote:
> A> sfxge: limit software Tx queue size
> A>
> A> Previous implementation limits put queue size only (when Tx lock can't
> A> be acquired),
> A> but get queue may grow unboundedly which results in mbuf pools
> A> exhaustion and
> A> latency growth.
> A>
> A> Submitted-by:   Andrew Rybchenko
> A> Sponsored by:   Solarflare Communications, Inc.
>
> The interaction between sfxge_tx_qdpl_put() and sfxge_tx_packet_add()
> is quite complex and I couldn't resist from suggesting you to
> simplify the code.
>
> Can you please look into attached patch?
>
> - Inline sfxge_tx_qdpl_put() into sfxge_tx_packet_add().
> - Simplify the 'locked' logic.
> - Add your PATCH 1/6, the mbuf leak fix.
> - Add your PATCH 2/6, the SFXGE_TX_DPL_GET_PKT_LIMIT_DEFAULT check.

I don't like the "locked" flag passed to the qdpl_put() function either.
However, I prefer to keep patches granular and avoid mixing different
changes in a single patch. If the initial patch is OK, please submit it
to the repository. Then I'll rebase your patch, discuss it locally and
come back to you.


BTW, I see that many drivers use drbr for the software Tx queue. What do
you think, would it be beneficial to use it instead of the list
implemented here?


Please find the initial patch, with a few minor fixes (TAB after #define
and @->" at " in the suggested commit message), attached.


Thanks a lot,
Andrew.

sfxge: limit software Tx queue size

Previous implementation limits put queue size only (when Tx lock can't
be acquired), but get queue may grow unboundedly which results in mbuf
pools exhaustion and latency growth.

Submitted-by:   Andrew Rybchenko 
Sponsored by:   Solarflare Communications, Inc.

diff -r ff9f5d3dbafe -r 7632a3355224 src/driver/freebsd/sfxge_tx.c
--- a/head/sys/dev/sfxge/sfxge_tx.c	Tue Mar 04 13:15:13 2014 +0400
+++ b/head/sys/dev/sfxge/sfxge_tx.c	Wed Mar 05 09:06:01 2014 +0400
@@ -461,6 +461,9 @@
 
 		sfxge_tx_qdpl_swizzle(txq);
 
+		if (stdp->std_count >= SFXGE_TX_DPL_GET_PKT_LIMIT_DEFAULT)
+			return ENOBUFS;
+
 		*(stdp->std_getp) = mbuf;
 		stdp->std_getp = &mbuf->m_nextpkt;
 		stdp->std_count++;
@@ -480,7 +483,7 @@
 old_len = mp->m_pkthdr.csum_data;
 			} else
 old_len = 0;
-			if (old_len >= SFXGE_TX_MAX_DEFERRED)
+			if (old_len >= SFXGE_TX_DPL_PUT_PKT_LIMIT_DEFAULT)
 return ENOBUFS;
 			mbuf->m_pkthdr.csum_data = old_len + 1;
 			mbuf->m_nextpkt = (void *)old;
@@ -507,12 +510,9 @@
 	 */
 	locked = mtx_trylock(&txq->lock);
 
-	/*
-	 * Can only fail if we weren't able to get the lock.
-	 */
 	if (sfxge_tx_qdpl_put(txq, m, locked) != 0) {
-		KASSERT(!locked,
-		("sfxge_tx_qdpl_put() failed locked"));
+		if (locked)
+			mtx_unlock(&txq->lock);
 		rc = ENOBUFS;
 		goto fail;
 	}
diff -r ff9f5d3dbafe -r 7632a3355224 src/driver/freebsd/sfxge_tx.h
--- a/head/sys/dev/sfxge/sfxge_tx.h	Tue Mar 04 13:15:13 2014 +0400
+++ b/head/sys/dev/sfxge/sfxge_tx.h	Wed Mar 05 09:06:01 2014 +0400
@@ -75,7 +75,8 @@
 	enum sfxge_tx_buf_flags	flags;
 };
 
-#define SFXGE_TX_MAX_DEFERRED 64
+#define	SFXGE_TX_DPL_GET_PKT_LIMIT_DEFAULT	64
+#define	SFXGE_TX_DPL_PUT_PKT_LIMIT_DEFAULT	64
 
 /*
  * Deferred packet list.

[PATCH 2/3] sfxge: TXQ index (not label) comes from FW in flush done event

2014-04-12 Thread Andrew Rybchenko


sfxge: TXQ index (not label) comes from FW in flush done event

Change the second argument name of the efx_txq_flush_done_ev_t prototype to
highlight that TXQ index (not label) comes from FW in flush done event.

Submitted by:   Andrew Rybchenko 
Sponsored by:   Solarflare Communications, Inc.

diff -r 74ea9e0f7842 -r 42f27b037ebb sys/dev/sfxge/common/efx.h
--- a/sys/dev/sfxge/common/efx.h	Thu Apr 10 14:23:34 2014 +0400
+++ b/sys/dev/sfxge/common/efx.h	Thu Apr 10 14:23:36 2014 +0400
@@ -1389,7 +1389,7 @@
 typedef	__checkReturn	boolean_t
 (*efx_txq_flush_done_ev_t)(
 	__in_opt	void *arg,
-	__in		uint32_t label);
+	__in		uint32_t txq_index);
 
 typedef	__checkReturn	boolean_t
 (*efx_software_ev_t)(
diff -r 74ea9e0f7842 -r 42f27b037ebb sys/dev/sfxge/common/efx_ev.c
--- a/sys/dev/sfxge/common/efx_ev.c	Thu Apr 10 14:23:34 2014 +0400
+++ b/sys/dev/sfxge/common/efx_ev.c	Thu Apr 10 14:23:36 2014 +0400
@@ -406,16 +406,16 @@
 
 	switch (EFX_QWORD_FIELD(*eqp, FSF_AZ_DRIVER_EV_SUBCODE)) {
 	case FSE_AZ_TX_DESCQ_FLS_DONE_EV: {
-		uint32_t label;
+		uint32_t txq_index;
 
 		EFX_EV_QSTAT_INCR(eep, EV_DRIVER_TX_DESCQ_FLS_DONE);
 
-		label = EFX_QWORD_FIELD(*eqp, FSF_AZ_DRIVER_EV_SUBDATA);
+		txq_index = EFX_QWORD_FIELD(*eqp, FSF_AZ_DRIVER_EV_SUBDATA);
 
-		EFSYS_PROBE1(tx_descq_fls_done, uint32_t, label);
+		EFSYS_PROBE1(tx_descq_fls_done, uint32_t, txq_index);
 
 		EFSYS_ASSERT(eecp->eec_txq_flush_done != NULL);
-		should_abort = eecp->eec_txq_flush_done(arg, label);
+		should_abort = eecp->eec_txq_flush_done(arg, txq_index);
 
 		break;
 	}
diff -r 74ea9e0f7842 -r 42f27b037ebb sys/dev/sfxge/sfxge_ev.c
--- a/sys/dev/sfxge/sfxge_ev.c	Thu Apr 10 14:23:34 2014 +0400
+++ b/sys/dev/sfxge/sfxge_ev.c	Thu Apr 10 14:23:36 2014 +0400
@@ -260,16 +260,17 @@
 }
 
 static boolean_t
-sfxge_ev_txq_flush_done(void *arg, uint32_t label)
+sfxge_ev_txq_flush_done(void *arg, uint32_t txq_index)
 {
 	struct sfxge_evq *evq;
 	struct sfxge_softc *sc;
 	struct sfxge_txq *txq;
+	unsigned int label;
 	uint16_t magic;
 
 	evq = (struct sfxge_evq *)arg;
 	sc = evq->sc;
-	txq = sc->txq[label];
+	txq = sc->txq[txq_index];
 
 	KASSERT(txq != NULL, ("txq == NULL"));
 	KASSERT(txq->init_state == SFXGE_TXQ_INITIALIZED,
@@ -278,6 +279,7 @@
 	/* Resend a software event on the correct queue */
 	evq = sc->evq[txq->evq_index];
 
+	label = txq_index;
 	KASSERT((label & SFXGE_MAGIC_DMAQ_LABEL_MASK) == label,
 	("(label & SFXGE_MAGIC_DMAQ_LABEL_MASK) != label"));
 	magic = SFXGE_MAGIC_TX_QFLUSH_DONE | label;

[PATCH 3/3] sfxge: use TXQ type as label to support more than 32 TXQs

2014-04-12 Thread Andrew Rybchenko


sfxge: use TXQ type as label to support more than 32 TXQs

There are 3 TXQs in event queue 0 and 1 TXQ (with TCP/UDP checksum offload)
in all other event queues.

Submitted by:   Andrew Rybchenko 
Sponsored by:   Solarflare Communications, Inc.

diff -r 42f27b037ebb -r ab60166f0df9 sys/dev/sfxge/sfxge_ev.c
--- a/sys/dev/sfxge/sfxge_ev.c	Thu Apr 10 14:23:36 2014 +0400
+++ b/sys/dev/sfxge/sfxge_ev.c	Fri Apr 11 15:40:47 2014 +0400
@@ -218,18 +218,27 @@
 	return (B_FALSE);
 }
 
+static struct sfxge_txq *
+sfxge_get_txq_by_label(struct sfxge_evq *evq, enum sfxge_txq_type label)
+{
+	unsigned int index;
+
+	KASSERT((evq->index == 0 && label < SFXGE_TXQ_NTYPES) ||
+	(label == SFXGE_TXQ_IP_TCP_UDP_CKSUM), ("unexpected txq label"));
+	index = (evq->index == 0) ? label : (evq->index - 1 + SFXGE_TXQ_NTYPES);
+	return evq->sc->txq[index];
+}
+
 static boolean_t
 sfxge_ev_tx(void *arg, uint32_t label, uint32_t id)
 {
 	struct sfxge_evq *evq;
-	struct sfxge_softc *sc;
 	struct sfxge_txq *txq;
 	unsigned int stop;
 	unsigned int delta;
 
 	evq = (struct sfxge_evq *)arg;
-	sc = evq->sc;
-	txq = sc->txq[label];
+	txq = sfxge_get_txq_by_label(evq, label);
 
 	KASSERT(txq != NULL, ("txq == NULL"));
 	KASSERT(evq->index == txq->evq_index,
@@ -279,7 +288,7 @@
 	/* Resend a software event on the correct queue */
 	evq = sc->evq[txq->evq_index];
 
-	label = txq_index;
+	label = txq->type;
 	KASSERT((label & SFXGE_MAGIC_DMAQ_LABEL_MASK) == label,
 	("(label & SFXGE_MAGIC_DMAQ_LABEL_MASK) != label"));
 	magic = SFXGE_MAGIC_TX_QFLUSH_DONE | label;
@@ -336,7 +345,7 @@
 		break;
 	}
 	case SFXGE_MAGIC_TX_QFLUSH_DONE: {
-		struct sfxge_txq *txq = sc->txq[label];
+		struct sfxge_txq *txq = sfxge_get_txq_by_label(evq, label);
 
 		KASSERT(txq != NULL, ("txq == NULL"));
 		KASSERT(evq->index == txq->evq_index,
diff -r 42f27b037ebb -r ab60166f0df9 sys/dev/sfxge/sfxge_tx.c
--- a/sys/dev/sfxge/sfxge_tx.c	Thu Apr 10 14:23:36 2014 +0400
+++ b/sys/dev/sfxge/sfxge_tx.c	Fri Apr 11 15:40:47 2014 +0400
@@ -27,6 +27,21 @@
  * SUCH DAMAGE.
  */
 
+/* Theory of operation:
+ *
+ * Tx queues allocation and mapping
+ *
+ * One Tx queue with enabled checksum offload is allocated per Rx channel
+ * (event queue).  Also 2 Tx queues (one without checksum offload and one
+ * with IP checksum offload only) are allocated and bound to event queue 0.
+ * sfxge_txq_type is used as Tx queue label.
+ *
+ * So, event queue plus label mapping to Tx queue index is:
+ *	if event queue index is 0, TxQ-index = TxQ-label * [0..SFXGE_TXQ_NTYPES)
+ *	else TxQ-index = SFXGE_TXQ_NTYPES + EvQ-index - 1
+ * See sfxge_get_txq_by_label() sfxge_ev.c
+ */
+
 #include <sys/cdefs.h>
 __FBSDID("$FreeBSD$");
 
@@ -1179,7 +1194,7 @@
 	}
 
 	/* Create the common code transmit queue. */
-	if ((rc = efx_tx_qcreate(sc->enp, index, index, esmp,
+	if ((rc = efx_tx_qcreate(sc->enp, index, txq->type, esmp,
 	SFXGE_NDESCS, txq->buf_base_id, flags, evq->common,
 	&txq->common)) != 0)
 		goto fail;
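
The mapping described in the "Theory of operation" comment above can be
checked in isolation. The following is a standalone sketch that follows
the logic of sfxge_get_txq_by_label() in this patch, but returns the
index instead of the queue pointer; the enum and function names here
are illustrative stand-ins, not the driver's definitions.

```c
#include <assert.h>

/* Queue types in label order; names are illustrative stand-ins. */
enum txq_type {
	TXQ_NON_CKSUM = 0,
	TXQ_IP_CKSUM,
	TXQ_IP_TCP_UDP_CKSUM,
	TXQ_NTYPES
};

/*
 * Map (event queue index, label) to a Tx queue index:
 * event queue 0 hosts one queue per type, every other event queue
 * hosts a single full-checksum queue placed after those.
 */
static unsigned int
txq_index_by_label(unsigned int evq_index, enum txq_type label)
{
	assert((evq_index == 0 && label < TXQ_NTYPES) ||
	    label == TXQ_IP_TCP_UDP_CKSUM);
	return ((evq_index == 0) ? (unsigned int)label :
	    evq_index - 1 + TXQ_NTYPES);
}
```

Because the label only ever takes one of three values, it stays within
the label mask regardless of how many Tx queues exist, which is the
point of the change.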

[PATCH 1/3] sfxge: RXQ index (not label) comes from FW in flush done/failed events

2014-04-12 Thread Andrew Rybchenko


sfxge: RXQ index (not label) comes from FW in flush done/failed events

Change the second argument name of the efx_rxq_flush_done_ev_t and
efx_rxq_flush_failed_ev_t prototypes to highlight that RXQ index (not label)
comes from FW in flush done and failed events.

Submitted by:   Andrew Rybchenko 
Sponsored by:   Solarflare Communications, Inc.

diff -r 303a89ceaefe -r 74ea9e0f7842 sys/dev/sfxge/common/efx.h
--- a/sys/dev/sfxge/common/efx.h	Wed Mar 19 14:41:05 2014 +
+++ b/sys/dev/sfxge/common/efx.h	Thu Apr 10 14:23:34 2014 +0400
@@ -1379,12 +1379,12 @@
 typedef	__checkReturn	boolean_t
 (*efx_rxq_flush_done_ev_t)(
 	__in_opt	void *arg,
-	__in		uint32_t label);
+	__in		uint32_t rxq_index);
 
 typedef	__checkReturn	boolean_t
 (*efx_rxq_flush_failed_ev_t)(
 	__in_opt	void *arg,
-	__in		uint32_t label);
+	__in		uint32_t rxq_index);
 
 typedef	__checkReturn	boolean_t
 (*efx_txq_flush_done_ev_t)(
diff -r 303a89ceaefe -r 74ea9e0f7842 sys/dev/sfxge/common/efx_ev.c
--- a/sys/dev/sfxge/common/efx_ev.c	Wed Mar 19 14:41:05 2014 +
+++ b/sys/dev/sfxge/common/efx_ev.c	Thu Apr 10 14:23:34 2014 +0400
@@ -420,10 +420,10 @@
 		break;
 	}
 	case FSE_AZ_RX_DESCQ_FLS_DONE_EV: {
-		uint32_t label;
+		uint32_t rxq_index;
 		uint32_t failed;
 
-		label = EFX_QWORD_FIELD(*eqp, FSF_AZ_DRIVER_EV_RX_DESCQ_ID);
+		rxq_index = EFX_QWORD_FIELD(*eqp, FSF_AZ_DRIVER_EV_RX_DESCQ_ID);
 		failed = EFX_QWORD_FIELD(*eqp, FSF_AZ_DRIVER_EV_RX_FLUSH_FAIL);
 
 		EFSYS_ASSERT(eecp->eec_rxq_flush_done != NULL);
@@ -432,15 +432,15 @@
 		if (failed) {
 			EFX_EV_QSTAT_INCR(eep, EV_DRIVER_RX_DESCQ_FLS_FAILED);
 
-			EFSYS_PROBE1(rx_descq_fls_failed, uint32_t, label);
+			EFSYS_PROBE1(rx_descq_fls_failed, uint32_t, rxq_index);
 
-			should_abort = eecp->eec_rxq_flush_failed(arg, label);
+			should_abort = eecp->eec_rxq_flush_failed(arg, rxq_index);
 		} else {
 			EFX_EV_QSTAT_INCR(eep, EV_DRIVER_RX_DESCQ_FLS_DONE);
 
-			EFSYS_PROBE1(rx_descq_fls_done, uint32_t, label);
+			EFSYS_PROBE1(rx_descq_fls_done, uint32_t, rxq_index);
 
-			should_abort = eecp->eec_rxq_flush_done(arg, label);
+			should_abort = eecp->eec_rxq_flush_done(arg, rxq_index);
 		}
 
 		break;
diff -r 303a89ceaefe -r 74ea9e0f7842 sys/dev/sfxge/sfxge_ev.c
--- a/sys/dev/sfxge/sfxge_ev.c	Wed Mar 19 14:41:05 2014 +
+++ b/sys/dev/sfxge/sfxge_ev.c	Thu Apr 10 14:23:34 2014 +0400
@@ -155,17 +155,18 @@
 }
 
 static boolean_t
-sfxge_ev_rxq_flush_done(void *arg, uint32_t label)
+sfxge_ev_rxq_flush_done(void *arg, uint32_t rxq_index)
 {
 	struct sfxge_evq *evq;
 	struct sfxge_softc *sc;
 	struct sfxge_rxq *rxq;
 	unsigned int index;
+	unsigned int label;
 	uint16_t magic;
 
 	evq = (struct sfxge_evq *)arg;
 	sc = evq->sc;
-	rxq = sc->rxq[label];
+	rxq = sc->rxq[rxq_index];
 
 	KASSERT(rxq != NULL, ("rxq == NULL"));
 
@@ -173,6 +174,7 @@
 	index = rxq->index;
 	evq = sc->evq[index];
 
+	label = rxq_index;
 	KASSERT((label & SFXGE_MAGIC_DMAQ_LABEL_MASK) == label,
 	("(label & SFXGE_MAGIC_DMAQ_LABEL_MASK) != level"));
 	magic = SFXGE_MAGIC_RX_QFLUSH_DONE | label;
@@ -185,17 +187,18 @@
 }
 
 static boolean_t
-sfxge_ev_rxq_flush_failed(void *arg, uint32_t label)
+sfxge_ev_rxq_flush_failed(void *arg, uint32_t rxq_index)
 {
 	struct sfxge_evq *evq;
 	struct sfxge_softc *sc;
 	struct sfxge_rxq *rxq;
 	unsigned int index;
+	unsigned int label;
 	uint16_t magic;
 
 	evq = (struct sfxge_evq *)arg;
 	sc = evq->sc;
-	rxq = sc->rxq[label];
+	rxq = sc->rxq[rxq_index];
 
 	KASSERT(rxq != NULL, ("rxq == NULL"));
 
@@ -203,6 +206,7 @@
 	index = rxq->index;
 	evq = sc->evq[index];
 
+	label = rxq_index;
 	KASSERT((label & SFXGE_MAGIC_DMAQ_LABEL_MASK) == label,
 	("(label & SFXGE_MAGIC_DMAQ_LABEL_MASK) != label"));
 	magic = SFXGE_MAGIC_RX_QFLUSH_FAILED | label;
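
A minimal sketch of the check both handlers perform once the firmware value is treated as an RXQ index: the index is reused as the software event label only if it fits in the label field. The MAGIC_* values and the helper name below are hypothetical, chosen for illustration; the real SFXGE_MAGIC_* constants are defined in the driver headers and are not shown in this patch.

```c
#include <stdint.h>

/* Hypothetical values for illustration only; not the driver's constants. */
#define MAGIC_DMAQ_LABEL_WIDTH	6
#define MAGIC_DMAQ_LABEL_MASK	((1u << MAGIC_DMAQ_LABEL_WIDTH) - 1)
#define MAGIC_RX_QFLUSH_DONE	(1u << MAGIC_DMAQ_LABEL_WIDTH)

/*
 * Build the software event "magic" for a flush-done event.  The RXQ index
 * reported by firmware is used as the label, so it must fit in the label
 * field.  Returns 0 on success, -1 if the index does not fit.
 */
static int
flush_done_magic(uint32_t rxq_index, uint16_t *magic)
{
	uint32_t label = rxq_index;

	if ((label & MAGIC_DMAQ_LABEL_MASK) != label)
		return (-1);
	*magic = (uint16_t)(MAGIC_RX_QFLUSH_DONE | label);
	return (0);
}
```

The driver asserts this condition with KASSERT; the sketch returns an error instead so that it can be exercised in userland.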

Re: Solarflare LACP bug?

2014-04-18 Thread Andrew Rybchenko


Hi,

I can reproduce the bug. I'll investigate and come back with a patch to fix it.

Thanks for the report, and sorry for the delay in replying,
Andrew.

On 04/16/2014 11:00 PM, aurfalien wrote:

Hi,

I’ve a Solarflare SFN5162F dual port 10Gb ethernet adapter.

While the card works fine as individual ports, upon configuring LACP the 
machine suddenly reboots.

Here are my commands;

ifconfig sfxge0 up
ifconfig sfxge1 up
ifconfig lagg0 create
* ifconfig lagg0 up laggproto lacp laggport sfxge0 laggport sfxge1 10.0.10.99/16

* This is where the system reboots.

I believe this to be a bug; should I post this on freebsd-b...@freebsd.org?

The only thing in /var/crash is minfree.

- aurf

"Janitorial Services"

--
Andrew Rybchenko
OKTET Labs, St.-Petersburg, Russia  Web: www.oktetlabs.ru
Office: +7 812 7832191  Fax: +7 812 7846591  Mobile: +7 921 7479683


Re: Solarflare LACP bug?

2014-04-19 Thread Andrew Rybchenko


Hi,

On 04/16/2014 11:00 PM, aurfalien wrote:

Hi,

I’ve a Solarflare SFN5162F dual port 10Gb ethernet adapter.

While the card works fine as individual ports, upon configuring LACP the 
machine suddenly reboots.

Here are my commands;

ifconfig sfxge0 up
ifconfig sfxge1 up
ifconfig lagg0 create
* ifconfig lagg0 up laggproto lacp laggport sfxge0 laggport sfxge1 10.0.10.99/16

* This is where the system reboots.

Please find the patch attached. It solves the problem for me.

I'll discuss it with Solarflare and then submit the patch to be pushed to
Subversion.


Regards,
Andrew.

I believe this to be a bug; should I post this on freebsd-b...@freebsd.org?

The only thing in /var/crash is minfree.

- aurf

"Janitorial Services"


sfxge: check that port is started when MAC filter is set

MAC filter set may be called without softc_lock held in the case of
SIOCADDMULTI and SIOCDELMULTI ioctls. ioctl handler checks IFF_DRV_RUNNING
flag which implies port started, but it is not guaranteed to remain.
softc_lock shared lock can't be held in the case of these ioctls processing,
since it results in failure where kernel complains that non-sleepable
lock is held in sleeping thread.

Both problems are repeatable on LAG with LACP proto bring up.

Submitted by:   Andrew Rybchenko 
Sponsored by:   Solarflare Communications, Inc.

diff -r 8dc01b10eb64 sys/dev/sfxge/sfxge_port.c
--- a/sys/dev/sfxge/sfxge_port.c	Tue Apr 15 10:32:43 2014 +0100
+++ b/sys/dev/sfxge/sfxge_port.c	Sat Apr 19 14:49:46 2014 +0400
@@ -357,10 +357,21 @@
 	struct sfxge_port *port = &sc->port;
 	int rc;
 
-	KASSERT(port->init_state == SFXGE_PORT_STARTED, ("port not started"));
-
 	mtx_lock(&port->lock);
-	rc = sfxge_mac_filter_set_locked(sc);
+	/*
+	 * The function may be called without softc_lock held in the
+	 * case of SIOCADDMULTI and SIOCDELMULTI ioctls. ioctl handler
+	 * checks IFF_DRV_RUNNING flag which implies port started, but
+	 * it is not guaranteed to remain. softc_lock shared lock can't
+	 * be held in the case of these ioctls processing, since it
+	 * results in failure where kernel complains that non-sleepable
+	 * lock is held in sleeping thread. Both problems are repeatable
+	 * on LAG with LACP proto bring up.
+	 */
+	if (port->init_state == SFXGE_PORT_STARTED)
+		rc = sfxge_mac_filter_set_locked(sc);
+	else
+		rc = 0;
 	mtx_unlock(&port->lock);
 	return rc;
 }
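
The pattern in this patch, re-checking the state under the lock instead of asserting it, can be sketched in userland as follows. This is only an illustration under stated assumptions: the types and function names are hypothetical stand-ins, and a pthread mutex takes the place of the driver's port lock.

```c
#include <pthread.h>

/* Hypothetical stand-ins for the driver's port state; not the sfxge types. */
enum port_state { PORT_INITIALIZED, PORT_STARTED };

struct port {
	pthread_mutex_t lock;
	enum port_state init_state;
	int filter_updates;	/* counts updates that reached the "hardware" */
};

/* Must be called with port->lock held. */
static int
mac_filter_set_locked(struct port *port)
{
	port->filter_updates++;
	return (0);
}

/*
 * A caller that saw the interface running may race with a concurrent stop,
 * so the state is checked again once the lock is held.  A stopped port is
 * not treated as an error: there is simply nothing to program.
 */
static int
mac_filter_set(struct port *port)
{
	int rc;

	pthread_mutex_lock(&port->lock);
	if (port->init_state == PORT_STARTED)
		rc = mac_filter_set_locked(port);
	else
		rc = 0;
	pthread_mutex_unlock(&port->lock);
	return (rc);
}
```

Calling mac_filter_set() on a port that is not started returns 0 and leaves filter_updates untouched; once init_state is PORT_STARTED the update goes through.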

sfxge: Do no allow EFSYS_MEM_ALLOC sleep

2014-05-20 Thread Andrew Rybchenko


sfxge: Do not allow EFSYS_KMEM_ALLOC to sleep

This solves a locking problem when EFSYS_KMEM_ALLOC is called in
a context that holds a mutex (and so is not allowed to sleep),
e.g. on interface bring-up or when multicast addresses are added.

Submitted by:   Andrew Rybchenko 
Sponsored by:   Solarflare Communications, Inc.

diff --git a/head/sys/dev/sfxge/common/efsys.h b/head/sys/dev/sfxge/common/efsys.h
--- a/head/sys/dev/sfxge/common/efsys.h
+++ b/head/sys/dev/sfxge/common/efsys.h
@@ -701,7 +701,11 @@
 #define	EFSYS_KMEM_ALLOC(_esip, _size, _p)\
 	do {\
 		(_esip) = (_esip);	\
-		(_p) = malloc((_size), M_SFXGE, M_WAITOK|M_ZERO);	\
+		/*			\
+		 * The macro is used in non-sleepable contexts, for	\
+		 * example, holding a mutex.\
+		 */			\
+		(_p) = malloc((_size), M_SFXGE, M_NOWAIT|M_ZERO);	\
 	_NOTE(CONSTANTCONDITION)	\
 	} while (B_FALSE)
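
One consequence of switching to M_NOWAIT is that the allocation can return NULL, which a sleeping M_WAITOK allocation never does, so callers have to check the result. A minimal userland sketch of that caller-side check follows; alloc_nowait_zeroed(), struct mcast_list, and mcast_list_create() are hypothetical names for illustration, not part of the driver.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical userland stand-in for a non-sleeping, zeroing allocation:
 * it returns zeroed memory or NULL and never blocks.  'fail' lets the
 * caller simulate memory pressure.
 */
static void *
alloc_nowait_zeroed(size_t size, int fail)
{
	if (fail)
		return (NULL);
	return (calloc(1, size));
}

struct mcast_list {
	size_t count;
	unsigned char addrs[8][6];
};

/*
 * With a non-sleeping allocation the result must be checked before use;
 * here the failure is reported to the caller as ENOMEM.
 */
static int
mcast_list_create(struct mcast_list **listp, int simulate_failure)
{
	struct mcast_list *list;

	list = alloc_nowait_zeroed(sizeof(*list), simulate_failure);
	if (list == NULL)
		return (ENOMEM);
	*listp = list;
	return (0);
}
```

On failure the output pointer is left untouched, so the caller can back out without having to clean up a half-built list.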
 

Re: What is the relationship between Intel and FreeBSD in regards to igb(4)?

2011-12-19 Thread Andrew Boyer

On Dec 19, 2011, at 7:03 AM, Michael Tuexen wrote:
> On Dec 19, 2011, at 12:01 PM, Tanel Rebane wrote:
> 
>> Kevin and Jack, thank you both for your replies. As you might have guessed,
>> I was asking because I'm looking into buying a whole bunch of NIC's and I
>> feel much more reassured now. Yet, there is one additional question, as I
>> mentioned, I couldn't find much in src/sys/dev/igb, is the source for igb
>> to be found elsewhere in the src?
> They are in
> src/sys/dev/e1000/
> 

In case this is not clear, the sys/dev/e1000 folder contains three drivers:
 if_lem.c : legacy 1G cards
 if_em.c: older 1G cards
 if_igb.c: newer 1G cards

The rest of the files are common code shared among the three drivers.

-Andrew

--
Andrew Boyer  abo...@averesystems.com





Bad interaction between 82599 hardware RSC and VLANs

2012-01-13 Thread Andrew Boyer

Hello Jack,
I'm seeing an issue on 82599 controllers.  When hardware RSC is used, large 
VLAN packets arrive without the VP bit set, even though the vtag in the 
descriptor is correct.  It totally kills the receive performance.  Turning off 
hardware RSC in the driver (falling back to software LRO) works fine, as does 
turning off LRO entirely.

I've worked around the problem for now by overriding the VP bit if 
ixgbe_rxeof() finds a valid vtag in the descriptor.

Have you seen this before?

It's not in the latest errata.  It almost seems to be the opposite of what Ryan 
reported in November 2010 ("82599 receiving packets with vlan tag=0 (vlan strip 
problem)?").

Thanks,
  Andrew

------
Andrew Boyer  abo...@averesystems.com





Re: Assigning multiple IPs in the same network to an interface

2012-02-16 Thread Andrew Boyer


On Feb 16, 2012, at 8:16 AM, Damien Fleuriot wrote:

> On 2/16/12 8:08 AM, M. V. wrote:
>> hi everybody,
>> 
>> i have a problem with setting multiple IPs in the same network in FreeBSD:
>> 
>> - suppose I assign two new IP addresses in the same network to eth0 with 
>> ifconfig:
>> #ifconfig eth0 add 192.168.10.1/24
>> #ifconfig eth0 add 192.168.10.2/24
>> 
>> - everything works fine and the output of "netstat -r" is like what it 
>> should be:
>> #netstat -r
>> 
>> 192.168.10.0   eth0
>> 192.168.10.1lo0
>> 192.168.10.2lo0
>> ...
>> 
>> - but now if I delete first IP address, connection to 192.168.10.0 network 
>> will be gone. and in output of "netstat -r" the route to 192.168.10.0 (via 
>> eth0) is gone:
>> #ifconfig eth0 delete 192.168.10.1
>> 
>> #netstat -r
>> 
>> 
>> 192.168.10.2lo0
>> .
>> 
>> - am i missing something here? shouldn't the route to the network remain in 
>> routing table (because we still have 192.168.10.2 assigned to interface)?
>> 
>> Thanks.
>> 
> 
> You shouldn't assign your secondary IP with a /24 mask, use /32.
> 
> You'll run into problems otherwise.
> 
> As a rule of thumb, your aliases = /32
> 

M.V. -
What you are doing should work fine.  There were a handful of routing table 
bugs fixed in the last few months that corrected this behavior.  The last two 
were just merged to stable/8 yesterday.  What release are you running?  

-Andrew

--
Andrew Boyer  abo...@averesystems.com





Re: Assigning multiple IPs in the same network to an interface

2012-02-16 Thread Andrew Boyer


On Feb 16, 2012, at 9:30 AM, Rainer Bredehorn wrote:

>> i have a problem with setting multiple IPs in the same network in FreeBSD:
>> 
>> - suppose I assign two new IP addresses in the same network to eth0 with 
>> ifconfig:
>> #ifconfig eth0 add 192.168.10.1/24
>> #ifconfig eth0 add 192.168.10.2/24
>> 
> Second address should be an alias address.
> 
> ;-)
> 


'ifconfig add' and 'ifconfig alias' are the same thing.

-Andrew

--
Andrew Boyer  abo...@averesystems.com





Re: netisr+lagg+fragments=80% packet loss

2012-02-24 Thread Andrew Thompson

2012/2/25 Eugene Grosbein :
> 25.02.2012 00:14, Eugene Grosbein wrote:
>> This problem occurs only when net.isr.direct=0/net.isr.direct_force=0.
>> And only when lagg1 has both ports up and running. And when I use
>> oversized pings.
>> At the same time, transit oversized pings go through this BRAS just fine,
>> no packet loss at all.
>
> Running two copies of tcpdump for igb0 and igb1 simultaneously,
> I see that fragments of the same ICMP echo-reply packet encapsulated within
> a PPPoE frame always go out through different ports of lagg1. Even when
> they arrive at the client in order, it seems this depends on the switching
> network between the PPPoE server and client.
>

If you are running a recent HEAD then you can try setting
net.link.lagg.0.use_flowid to zero.


Andrew

Re: Fwd: bridge interface type

2012-03-04 Thread Andrew Thompson

On 5 March 2012 18:46, Julian Elischer  wrote:
> On 3/4/12 1:36 PM, hiren panchasara wrote:
>>
>> Is this the correct mailer for such questions?
>
>
> probably n...@freebsd.org would be better.
>
> I do not understand why a bridge needs an interface type at all
> it seems a very odd way to implement it to me..  see how ng_bridge is done..
> that makes a lot more sense to me.

The bridge interface type is used in a few places in the network code
(e.g. in_arpinput() and in6_ifattach()), so it is needed.


Andrew

Re: bridge interface type

2012-03-04 Thread Andrew Thompson

> From: hiren panchasara 
>
> I created bridge1 this way:
>
> $ sudo ifconfig bridge create
> Password:
> bridge1
>
> $ ifconfig bridge1
> bridge1: flags=8802 metric 0 mtu 1500
>    ether 02:32:c8:92:b6:01
>    nd6 options=29
>    id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
>    maxage 20 holdcnt 6 proto rstp maxaddr 100 timeout 1200
>    root id 00:00:00:00:00:00 priority 0 ifcost 0 port 0
>
> but when I try to look at the interface via "struct sockaddr_dl",
> sdl = (struct sockaddr_dl *) ifa->ifa_addr;
>
> sdl->sdl_type is "IFT_ETHER" for that interface.
>
> Shouldn't it be "IFT_BRIDGE"? What am I missing here?

The address type is set in ether_ifattach() and the bridge does not
overwrite it, which means sdl_type will always be IFT_ETHER (see
if_ethersubr.c line 1003).

Here is a patch that changes it but I do not know what may break.



Index: if_bridge.c
===
--- if_bridge.c (revision 232321)
+++ if_bridge.c (working copy)
@@ -568,6 +568,7 @@ bridge_clone_create(struct if_clone *ifc, int unit
 {
struct bridge_softc *sc, *sc2;
struct ifnet *bifp, *ifp;
+   struct sockaddr_dl *sdl;
int fb, retry;
unsigned long hostid;

@@ -642,6 +643,8 @@ bridge_clone_create(struct if_clone *ifc, int unit
/* Now undo some of the damage... */
ifp->if_baudrate = 0;
ifp->if_type = IFT_BRIDGE;
+   sdl = (struct sockaddr_dl *)ifp->if_addr->ifa_addr;
+   sdl->sdl_type = IFT_BRIDGE;

mtx_lock(&bridge_list_mtx);
LIST_INSERT_HEAD(&bridge_list, sc, sc_list);

Re: STP on netgraph bridge node

2012-03-13 Thread Andrew Thompson

On 14 March 2012 09:40, Julian Elischer  wrote:
> On 3/11/12 1:06 AM, h bagade wrote:
>>
>> Hi all,
>>
>> Is there any way to add STP and RSTP protocols to bridge node on
>> netgraph? Should I implement it on the node or it has done before?
>>
> feel free.. I don't think it has been done,
>
> whether one adds it to that node or a separate node that hooks
> onto the side of it might be a reasonable question.. (but I don't
> know enough about STP to judge..)  I don't know how well the code
> in if_bridge will port over, or if it can be made a common module.

I split it off as a module 5 years ago and mentioned ng_bridge in the
commit message.

http://svnweb.freebsd.org/base?view=revision&revision=160704

Still waiting for someone to plumb it in :)


Andrew

Re: Assigning multiple IPs in the same network to an interface

2012-03-16 Thread Andrew Boyer


On Feb 18, 2012, at 5:39 AM, Damien Fleuriot wrote:

> On 2/16/12 3:39 PM, Andrew Boyer wrote:
>> 
>> On Feb 16, 2012, at 8:16 AM, Damien Fleuriot wrote:
>> 
>>> On 2/16/12 8:08 AM, M. V. wrote:
>>>> hi everybody,
>>>> 
>>>> i have a problem with setting multiple IPs in the same network in FreeBSD:
>>>> 
>>>> - suppose I assign two new IP addresses in the same network to eth0 with 
>>>> ifconfig:
>>>> #ifconfig eth0 add 192.168.10.1/24
>>>> #ifconfig eth0 add 192.168.10.2/24
>>>> 
>>>> - everything works fine and the output of "netstat -r" is like what it 
>>>> should be:
>>>> #netstat -r
>>>> 
>>>> 192.168.10.0   eth0
>>>> 192.168.10.1lo0
>>>> 192.168.10.2lo0
>>>> ...
>>>> 
>>>> - but now if I delete first IP address, connection to 192.168.10.0 network 
>>>> will be gone. and in output of "netstat -r" the route to 192.168.10.0 (via 
>>>> eth0) is gone:
>>>> #ifconfig eth0 delete 192.168.10.1
>>>> 
>>>> #netstat -r
>>>> 
>>>> 
>>>> 192.168.10.2lo0
>>>> .
>>>> 
>>>> - am i missing something here? shouldn't the route to the network remain 
>>>> in routing table (because we still have 192.168.10.2 assigned to 
>>>> interface)?
>>>> 
>>>> Thanks.
>>>> 
>>> 
>>> You shouldn't assign your secondary IP with a /24 mask, use /32.
>>> 
>>> You'll run into problems otherwise.
>>> 
>>> As a rule of thumb, your aliases = /32
>>> 
>> 
>> M.V. -
>> What you are doing should work fine.  There were a handful of routing table 
>> bugs fixed in the last few months that corrected this behavior.  The last 
>> two were just merged to stable/8 yesterday.  What release are you running?  
>> 
>> -Andrew
>> 
> 
> This is of interest to me.
> 
> Do these fixes allow one to use say /24 aliases instead of /32 without
> running into problems ?
> 


Sorry for the long delay.  I'm not aware of any restriction on how many IPs or 
subnets you can install, as long as the subnets don't conflict.

I haven't tried IPv6, though...

-Andrew

--
Andrew Boyer  abo...@averesystems.com





LACP kernel panics: /* unlocking is safe here */

2012-03-30 Thread Andrew Boyer

While investigating a LACP issue, I turned on LACP_DEBUG on a debug kernel.  In 
this configuration it's easy to panic the kernel - just run 'ifconfig lagg0 
laggproto lacp' on a lagg that's already in LACP mode and receiving LACP 
messages.

The problem is that lagg_lacp_detach() drops the lagg wlock (with the comment 
in the title), which allows incoming LACP messages to get through lagg_input() 
while the structure is being destroyed in lacp_detach().

There's a very simple fix, but I don't know if it's the best way to fix it.  
Resetting the protocol before calling sc_detach causes any further incoming 
packets to be dropped until the lagg gets reconfigured.  Thoughts?

Is it safe to just hold on to the lagg wlock across the callout_drain() calls 
in lacp_detach()?  That's what OpenBSD does.

-Andrew

Index: sys/net/if_lagg.c
===
--- sys/net/if_lagg.c   (revision 233707)
+++ sys/net/if_lagg.c   (working copy)
@@ -952,9 +952,10 @@
}
if (sc->sc_proto != LAGG_PROTO_NONE) {
LAGG_WLOCK(sc);
+   /* Reset protocol */
+   sc->sc_proto = LAGG_PROTO_NONE;
error = sc->sc_detach(sc);
-   /* Reset protocol and pointers */
-   sc->sc_proto = LAGG_PROTO_NONE;
+   /* Reset pointers */
sc->sc_detach = NULL;
sc->sc_start = NULL;
sc->sc_input = NULL;

--
Andrew Boyer  abo...@averesystems.com





Re: LACP kernel panics: /* unlocking is safe here */

2012-04-07 Thread Andrew Thompson

On 3 April 2012 00:35, John Baldwin  wrote:
> On Friday, March 30, 2012 6:04:24 pm Andrew Boyer wrote:
>> While investigating a LACP issue, I turned on LACP_DEBUG on a debug kernel.
>> In this configuration it's easy to panic the kernel - just run 'ifconfig lagg0
>> laggproto lacp' on a lagg that's already in LACP mode and receiving LACP
>> messages.
>>
>> The problem is that lagg_lacp_detach() drops the lagg wlock (with the
>> comment in the title), which allows incoming LACP messages to get through
>> lagg_input() while the structure is being destroyed in lacp_detach().
>>
>> There's a very simple fix, but I don't know if it's the best way to fix it.
>> Resetting the protocol before calling sc_detach causes any further incoming
>> packets to be dropped until the lagg gets reconfigured.  Thoughts?
>
> This looks sensible.

Changing the order also needs an additional check as LAGG_PROTO_NONE
no longer means the detach is finished. If one ioctl sleeps then we
may nullify all the pointers upon wake that have already been set by
the other ioctl.

Does this look ok?

Index: if_lagg.c
===
--- if_lagg.c   (revision 233252)
+++ if_lagg.c   (working copy)
@@ -950,11 +950,11 @@ lagg_ioctl(struct ifnet *ifp, u_long cmd, caddr_t
error = EPROTONOSUPPORT;
break;
}
+   LAGG_WLOCK(sc);
if (sc->sc_proto != LAGG_PROTO_NONE) {
-   LAGG_WLOCK(sc);
+   /* Reset protocol first in case detach unlocks */
+   sc->sc_proto = LAGG_PROTO_NONE;
error = sc->sc_detach(sc);
-   /* Reset protocol and pointers */
-   sc->sc_proto = LAGG_PROTO_NONE;
sc->sc_detach = NULL;
sc->sc_start = NULL;
sc->sc_input = NULL;
@@ -966,8 +966,11 @@ lagg_ioctl(struct ifnet *ifp, u_long cmd, caddr_t
sc->sc_lladdr = NULL;
sc->sc_req = NULL;
sc->sc_portreq = NULL;
-   LAGG_WUNLOCK(sc);
+   } else if (sc->sc_input != NULL) {
+   /* Still detaching */
+   error = EBUSY;
}
+   LAGG_WUNLOCK(sc);
if (error != 0)
break;
for (int i = 0; i < (sizeof(lagg_protos) /
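
The extra check can be sketched as a small state test. With the new ordering, sc_proto is cleared before the (possibly sleeping) detach and the handler pointers are cleared after it, so "protocol is NONE but sc_input is still set" identifies a detach that is still in flight. The types and function names below are hypothetical, simplified stand-ins rather than the real lagg(4) softc, and the caller is assumed to hold the lagg write lock.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the real lagg(4) structures. */
enum proto { PROTO_NONE, PROTO_LACP };

struct softc {
	enum proto sc_proto;
	int (*sc_input)(void);	/* non-NULL while a protocol is attached */
};

static int
dummy_input(void)
{
	return (0);
}

/*
 * Start a protocol change.  Returns 0 if the caller may proceed (with
 * *need_detach set if it must run the detach itself), or EBUSY if another
 * request is still part-way through its detach.
 */
static int
proto_change_begin(struct softc *sc, int *need_detach)
{
	*need_detach = 0;
	if (sc->sc_proto != PROTO_NONE) {
		/* Reset the protocol first so the input path drops packets. */
		sc->sc_proto = PROTO_NONE;
		*need_detach = 1;
		return (0);
	}
	if (sc->sc_input != NULL)
		return (EBUSY);		/* still detaching */
	return (0);
}

/* Called once the detach has completed: clear the handler pointers. */
static void
proto_change_finish_detach(struct softc *sc)
{
	sc->sc_input = NULL;
}
```

A second request that arrives between proto_change_begin() and proto_change_finish_detach() gets EBUSY instead of clearing pointers that the first request is about to set.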

Re: LACP kernel panics: /* unlocking is safe here */

2012-04-09 Thread Andrew Boyer

Makes sense to me.

-Andrew

On Apr 7, 2012, at 4:02 AM, Andrew Thompson wrote:

> On 3 April 2012 00:35, John Baldwin  wrote:
>> On Friday, March 30, 2012 6:04:24 pm Andrew Boyer wrote:
>>> While investigating a LACP issue, I turned on LACP_DEBUG on a debug kernel.
>>> In this configuration it's easy to panic the kernel - just run 'ifconfig lagg0
>>> laggproto lacp' on a lagg that's already in LACP mode and receiving LACP
>>> messages.
>>>
>>> The problem is that lagg_lacp_detach() drops the lagg wlock (with the
>>> comment in the title), which allows incoming LACP messages to get through
>>> lagg_input() while the structure is being destroyed in lacp_detach().
>>>
>>> There's a very simple fix, but I don't know if it's the best way to fix it.
>>> Resetting the protocol before calling sc_detach causes any further incoming
>>> packets to be dropped until the lagg gets reconfigured.  Thoughts?
>> 
>> This looks sensible.
> 
> Changing the order also needs an additional check as LAGG_PROTO_NONE
> no longer means the detach is finished. If one ioctl sleeps then we
> may nullify all the pointers upon wake that have already been set by
> the other ioctl.
> 
> Does this look ok?
> 
> Index: if_lagg.c
> ===
> --- if_lagg.c (revision 233252)
> +++ if_lagg.c (working copy)
> @@ -950,11 +950,11 @@ lagg_ioctl(struct ifnet *ifp, u_long cmd, caddr_t
>   error = EPROTONOSUPPORT;
>   break;
>   }
> + LAGG_WLOCK(sc);
>   if (sc->sc_proto != LAGG_PROTO_NONE) {
> - LAGG_WLOCK(sc);
> + /* Reset protocol first in case detach unlocks */
> + sc->sc_proto = LAGG_PROTO_NONE;
>   error = sc->sc_detach(sc);
> - /* Reset protocol and pointers */
> - sc->sc_proto = LAGG_PROTO_NONE;
>   sc->sc_detach = NULL;
>   sc->sc_start = NULL;
>   sc->sc_input = NULL;
> @@ -966,8 +966,11 @@ lagg_ioctl(struct ifnet *ifp, u_long cmd, caddr_t
>   sc->sc_lladdr = NULL;
>   sc->sc_req = NULL;
>   sc->sc_portreq = NULL;
> - LAGG_WUNLOCK(sc);
> + } else if (sc->sc_input != NULL) {
> + /* Still detaching */
> + error = EBUSY;
>   }
> + LAGG_WUNLOCK(sc);
>   if (error != 0)
>   break;
>   for (int i = 0; i < (sizeof(lagg_protos) /

--
Andrew Boyer  abo...@averesystems.com





getifaddrs & ipv6 scope

2012-04-12 Thread Andrew Thompson

Hi,


I have noticed that getifaddrs() does not set sin6_scope_id to the
interface index for link-local addresses of type AF_INET6.  Running
the following program gives different results on FreeBSD and Linux:

FreeBSD:

dev: bge0 address:  scope 0
dev: xl0  address:  scope 0
dev: lo0  address: <::1> scope 0
dev: lo0  address:  scope 0

Linux:

dev: lo   address: <::1> scope 0
dev: eth1 address: <2404:130:0:1000:204:75ff:febc:b8f0> scope 0
dev: eth1 address:  scope 2
dev: eth0 address:  scope 3


Should FreeBSD be setting sin6_scope_id?

Andrew



#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

#include <ifaddrs.h>
#include <netdb.h>
#include <stdio.h>
#include <stdlib.h>

int
main(int argc, char *argv[])
{
	struct ifaddrs *ifaddr, *ifa;
	struct sockaddr_in6 *in6;
	char host[NI_MAXHOST];
	int rc;

	if (getifaddrs(&ifaddr) == -1) {
		perror("getifaddrs");
		exit(EXIT_FAILURE);
	}

	for (ifa = ifaddr; ifa != NULL; ifa = ifa->ifa_next) {
		/* Skip entries without an address and non-IPv6 entries. */
		if (ifa->ifa_addr == NULL)
			continue;
		if (ifa->ifa_addr->sa_family != AF_INET6)
			continue;
		in6 = (struct sockaddr_in6 *)ifa->ifa_addr;

		rc = getnameinfo(ifa->ifa_addr, sizeof(struct sockaddr_in6),
		    host, NI_MAXHOST, NULL, 0, NI_NUMERICHOST);
		if (rc != 0) {
			printf("getnameinfo() failed: %s\n", gai_strerror(rc));
			exit(EXIT_FAILURE);
		}
		/* sin6_scope_id is an unsigned 32-bit value. */
		printf("dev: %-8s address: <%s> scope %u\n",
		    ifa->ifa_name, host, (unsigned int)in6->sin6_scope_id);
	}

	freeifaddrs(ifaddr);
	exit(EXIT_SUCCESS);
}

Re: getifaddrs & ipv6 scope

2012-04-13 Thread Andrew Thompson

On 13 April 2012 18:41, Rainer Bredehorn  wrote:
> Hi!
>
>> I have noticed that getifaddrs() does not have sin6_scope_id set to
>> the interface id for link local addresses on AF_INET6 types. Running
>> the following program gives different results on Linux
>
> ifconfig shows the scopeid according to the interface:
>
> inet6 fe80::208:9bff:fe13:784e%fxp1 prefixlen 64 scopeid 0x2
>
> Are you talking about the scope value of an multicast address or
> the scopeid for link local addresses?

I am talking about the scopeid for link local addresses which (as far
as I understand) is the interface index.


Andrew

Re: getifaddrs & ipv6 scope

2012-04-15 Thread Andrew Thompson

On 14 April 2012 06:03, Hajimu UMEMOTO  wrote:
> Hi,
>
>>>>>> On Fri, 13 Apr 2012 20:01:39 +1200
>>>>>> Andrew Thompson  said:
>
> thompsa> On 13 April 2012 18:41, Rainer Bredehorn  wrote:
>> Hi!
>>
>>> I have noticed that getifaddrs() does not have sin6_scope_id set to
>>> the interface id for link local addresses on AF_INET6 types. Running
>>> the following program gives different results on Linux
>>
>> ifconfig shows the scopeid according to the interface:
>>
>> inet6 fe80::208:9bff:fe13:784e%fxp1 prefixlen 64 scopeid 0x2
>>
>> Are you talking about the scope value of an multicast address or
>> the scopeid for link local addresses?
>
> thompsa> I am talking about the scopeid for link local addresses which (as far
> thompsa> as I understand) is the interface index.
>
> The issue you mentioned comes from an implementation decision of the
> KAME IPv6 stack.
> The attached patch should address it.  However, it may break the
> applications which expect that getifaddrs() returns a link-local
> address with KAME's embedded scopeid representation.  I'm not sure
> there are such applications, for now.

This now works as I expected. From my original test app:

dev: bge0 address:  scope 2
dev: lo0  address: <::1> scope 0
dev: lo0  address:  scope 5
dev: tun5 address:  scope 6
dev: tun3 address:  scope 7
dev: tun0 address:  scope 8
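
For context on the "embedded scopeid representation" mentioned in the quoted reply: as far as I understand the KAME convention, the kernel stores the zone (interface) index of a link-local address in the second 16-bit word of the address itself, and sin6_scope_id is left at 0. A minimal sketch of recovering it is shown below. recover_embedded_scope() is a hypothetical helper written for illustration and is not the attached patch.

```c
#include <stdint.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/*
 * If a link-local address carries an index in bytes 2-3 (assumed here to
 * be the KAME embedded form) and sin6_scope_id is unset, move the index
 * into sin6_scope_id and clear it from the address.
 */
static void
recover_embedded_scope(struct sockaddr_in6 *sin6)
{
	uint16_t embedded;

	if (!IN6_IS_ADDR_LINKLOCAL(&sin6->sin6_addr))
		return;
	embedded = (uint16_t)((sin6->sin6_addr.s6_addr[2] << 8) |
	    sin6->sin6_addr.s6_addr[3]);
	if (embedded == 0 || sin6->sin6_scope_id != 0)
		return;
	sin6->sin6_scope_id = embedded;
	sin6->sin6_addr.s6_addr[2] = 0;
	sin6->sin6_addr.s6_addr[3] = 0;
}
```

Under that assumption, an address stored as fe80:2::1 with scope 0 comes out as fe80::1 with scope 2, which matches the kind of output shown above.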


regards,
Andrew

Re: lagg(4) MAC address selection proposal

2012-04-17 Thread Andrew Thompson

On 18 April 2012 12:39, Ed Maste  wrote:
> When a new lagg(4) interface is created the link layer address from the
> first port in the group is assigned to the lagg and to all other lagg
> port members.  This means the address assigned to the lagg is different
> if specified as, for example, "laggport em0 laggport em1" vs
> "laggport em1 laggport em0".
>
> The code in lagg_port_create(), in if_lagg.c that chooses the first
> l2 address:
>
>	if (SLIST_EMPTY(&sc->sc_ports)) {
>		sc->sc_primary = lp;
>		lagg_lladdr(sc, IF_LLADDR(ifp));
>	} else {
>		/* Update link layer address for this port */
>		lagg_port_lladdr(lp, IF_LLADDR(sc->sc_ifp));
>	}
>
> For the current modes lagg supports this probably doesn't matter much,
> but we have some improvements in the pipeline for which this behaviour
> is undesirable.  (The first of which is an interface for choosing a
> different master; this allows a failover lagg to be set to transmit on a
> new port, without changing link states.  With the current behaviour this
> causes all ports in the lagg to then change their l2 address.)
>
> In looking into potential solutions I found that the bridgestp code in
> bridge(4) searches the list of associated MAC addresses and uses the
> lowest one when it needs to select one from a group.  I'd like to
> propose using the same logic for lagg's MAC address selection.  Can
> anyone foresee an issue with this change?  (I'm not aware of any lagg
> use cases that rely on the current behaviour.)

I do not foresee any issues. What we also need is an event trigger for
the various pseudo-interfaces when the MAC address or primary interface
changes; this would allow ARP/ND6 to rebroadcast.
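
The "lowest MAC address" rule proposed in the quoted message makes the chosen address independent of the order in which ports are added. A minimal sketch, assuming plain 6-byte Ethernet addresses held in an array (lowest_mac_index() is a hypothetical helper, not code from if_lagg.c or bridgestp):

```c
#include <stddef.h>
#include <string.h>

#define MAC_LEN	6	/* Ethernet address length in bytes */

/*
 * Return the index of the numerically lowest address in 'macs', or -1 if
 * the list is empty.  memcmp() compares bytes as unsigned char, so for
 * equal-length addresses it orders them most-significant byte first.
 */
static int
lowest_mac_index(const unsigned char macs[][MAC_LEN], size_t n)
{
	size_t i, best = 0;

	if (n == 0)
		return (-1);
	for (i = 1; i < n; i++) {
		if (memcmp(macs[i], macs[best], MAC_LEN) < 0)
			best = i;
	}
	return ((int)best);
}
```

Given the same two ports in either order, the function selects the same address, which is the property the proposal is after.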


Andrew
___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"

Re: Bad interaction between 82599 hardware RSC and VLANs

2012-04-25 Thread Andrew Boyer

Any update on this?

-Andrew

On Jan 13, 2012, at 6:04 PM, Jack Vogel wrote:

> Hey Andrew,
> 
> Not heard of this before, but I'll check around.
> 
> Jack
> 
> 
> On Fri, Jan 13, 2012 at 3:01 PM, Andrew Boyer  wrote:
> Hello Jack,
> I'm seeing an issue on 82599 controllers.  When hardware RSC is used, large 
> VLAN packets arrive without the VP bit set, even though the vtag in the 
> descriptor is correct.  It totally kills the receive performance.  Turning 
> off hardware RSC in the driver (falling back to software LRO) works fine, as 
> does turning off LRO entirely.
> 
> I've worked around the problem for now by overriding the VP bit if 
> ixgbe_rxeof() finds a valid vtag in the descriptor.
> 
> Have you seen this before?
> 
> It's not in the latest errata.  It almost seems to be the opposite of what 
> Ryan reported in November 2010 ("82599 receiving packets with vlan tag=0 
> (vlan strip problem)?").
> 
> Thanks,
>  Andrew
> 
> --
> Andrew Boyer    abo...@averesystems.com
> 
> 
> 
> 
> 

--
Andrew Boyer    abo...@averesystems.com





Re: Major performance hit with ToS setting

2012-05-25 Thread Andrew Gallatin


On 05/24/12 18:55, Kevin Oberman wrote:



This is, of course, on a 10G interface. On 7.3 there is little


Hi Kevin,


What you're seeing looks almost like a checksum is bad, or
there is some other packet damage.  Do you see any
error counters increasing if you run netstat -s before
and after the test & compare the results?

Thinking that, perhaps, this was a bug in my mxge(4), I attempted
to reproduce it this morning between 8.3 and 9.0 boxes and
failed to see the bad behavior.

% nuttcp-6.1.2 -c32t -t diablo1-m < /dev/zero
 9161.7500 MB /  10.21 sec = 7526.5792 Mbps 53 %TX 97 %RX 0 host-retrans 0.11 msRTT

% nuttcp-6.1.2 -t diablo1-m < /dev/zero
 9140.6180 MB /  10.21 sec = 7509.8270 Mbps 53 %TX 97 %RX 0 host-retrans 0.11 msRTT



However, I don't have any 8.2-r box handy, so I cannot
exactly repro your experiment...


Drew

Re: [please review] TSO mbuf chain length limiting patch

2012-05-30 Thread Andrew Gallatin


On 05/30/12 10:59, Colin Percival wrote:

Hi all,

The Xen virtual network interface has an issue (ok, really the issue is with
the linux back-end, but that's what most people are using) where it can't
handle scatter-gather writes with lots of pieces, aka. long mbuf chains.
This currently bites us hard with TSO enabled, since it produces said long
mbuf chains.


Colin,

Thanks for pointing me at this.  I've been talking about this
with bz@ a little.

I've never been clear about what the max TSO size supported by FreeBSD
is.  The NIC I maintain (mxge) is limited to 64K - epsilon for both
IPv4 *AND* IPv6.  Up until now, this has been enforced by the 16-bit
ip length limit of IPv4 and we have not had IPv6 TSO until this week.
With IPv6, I'm worried that FreeBSD may now send packets down larger
than I could handle.  In my case, however, the problem is not s/g list
length, but rather it is internal limits in the NIC which limit us to
64K - epsilon for IPv6 as well.  I think there may be other NICs in
the same boat for IPv6 (and maybe even some which cannot handle the
full 64K for IPv4).

Your approach would not work well for my size limit.  For
example, I'd have to set the limit to 4 mbufs to stay under 64KB.
This would be assuming the worst case of 16KB jumbo mbufs, so
that would limit me to ~8KB per TSO if 2KB mbufs were used.
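
To spell out that arithmetic, here is a small sketch in plain C with no
kernel headers.  max_chain_len() and chain_payload() are names invented
for the example, and the 64KB figure ignores the epsilon.

```c
#include <assert.h>
#include <stddef.h>

#define KB	1024u

/*
 * Longest chain that can never exceed limit_bytes, assuming every mbuf
 * in it could be as large as worst_mbuf_bytes.
 */
static size_t
max_chain_len(size_t limit_bytes, size_t worst_mbuf_bytes)
{
	assert(worst_mbuf_bytes > 0);
	return (limit_bytes / worst_mbuf_bytes);
}

/* Payload carried by a chain of nmbufs mbufs holding mbuf_bytes each. */
static size_t
chain_payload(size_t nmbufs, size_t mbuf_bytes)
{
	return (nmbufs * mbuf_bytes);
}
```

max_chain_len(64 * KB, 16 * KB) gives the 4-mbuf limit, and
chain_payload(4, 2 * KB) gives the ~8KB per TSO that results when the
chain is built from 2KB mbufs.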

I think a better approach would be to have a limit on the size of the
pre-segmented TCP payload sent to the driver.  I tend to think
that this would be more generically useful, and it is a better match
for the NDIS APIs, where a driver must specify the max TSO size.  I
think the changes to the TCP stack might be simpler (e.g., they
would seem to jibe better with the existing "maxmtu" approach).

I think this could work for you as well.  You could set the Xen max
tso size to be 32K (derived from 18 pages/skb, multiplied by a typical
2KB mbuf size, with some slack built in).  If the chain was too large,
you could m_defrag it down to size.
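
A rough sketch of how those numbers fit together, as a user-space model
only: 18 segments of 2KB would be 36KB, so a 32K cap leaves 4KB of
slack.  The constants and the helpers segs_needed() and needs_defrag()
are made up for illustration; only the 18, 2KB and 32K figures come from
the text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define XEN_MAX_SEGS	18u		/* pages/skb figure quoted above */
#define CLUSTER_BYTES	2048u		/* typical 2KB mbuf size */
#define TSO_MAX_BYTES	(32u * 1024u)	/* proposed Xen max TSO size */

/* Segments needed to hold len bytes when each segment holds seg_bytes. */
static size_t
segs_needed(size_t len, size_t seg_bytes)
{
	assert(seg_bytes > 0);
	return ((len + seg_bytes - 1) / seg_bytes);
}

/* True when a chain has more pieces than the back-end accepts. */
static bool
needs_defrag(size_t chain_segs)
{
	return (chain_segs > XEN_MAX_SEGS);
}
```

A full 32K payload in 2KB mbufs is 16 segments and passes as-is; the
same payload spread over much smaller mbufs would exceed 18 pieces and
is the case where compacting the chain would be needed.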



Drew



Re: [please review] TSO mbuf chain length limiting patch

2012-05-30 Thread Andrew Gallatin


On 05/30/12 18:35, Colin Percival wrote:

On 05/30/12 08:30, Andrew Gallatin wrote:

On 05/30/12 10:59, Colin Percival wrote:

The Xen virtual network interface has an issue (ok, really the issue is with
the linux back-end, but that's what most people are using) where it can't
handle scatter-gather writes with lots of pieces, aka. long mbuf chains.
This currently bites us hard with TSO enabled, since it produces said long
mbuf chains.


I've never been clear about what the max TSO size supported by FreeBSD
is.  The NIC I maintain (mxge) is limited to 64K - epsilon for both
IPv4 *AND* IPv6.  Up until now, this has been enforced by the 16-bit
ip length limit of IPv4 and we have not had IPv6 TSO until this week.
With IPv6, I'm worried that FreeBSD may now send packets down larger
than I could handle.  In my case, however, the problem is not s/g list
length, but rather it is internal limits in the NIC which limit us to
64K - epsilon for IPv6 as well.  I think there may be other NICs in
the same boat for IPv6 (and maybe even some which cannot handle the
full 64K for IPv4).

Your approach would not work well for my size limit.  For
example, I'd have to set the limit to 4 mbufs to stay under 64KB.
This would be assuming the worst case of 16KB jumbo mbufs, so
that would limit me to ~8KB per TSO if 2KB mbufs were used.


Right, the problem you describe isn't the one I was trying to solve. :-)


I think a better approach would be to have a limit on the size of the
pre-segmented TCP payload sent to the driver.  I tend to think
that this would be more generically useful, and it is a better match
for the NDIS APIs, where a driver must specify the max TSO size.  I
think the changes to the TCP stack might be simpler (e.g., they
would seem to jibe better with the existing "maxmtu" approach).

I think this could work for you as well.  You could set the Xen max
tso size to be 32K (derived from 18 pages/skb, multiplied by a typical
2KB mbuf size, with some slack built in).  If the chain was too large,
you could m_defrag it down to size.


Sounds good -- I don't want to m_defrag too often, but I imagine in most
cases when TSO is being invoked most of the mbufs will have 2 kB each.
This should also make the patch simpler by avoiding the need to modify
uipc_mbuf.c; if we just limit the TSO payload size then the TCP stack can
figure things out by itself.

Are you working on a patch, or should I put one together?



No, I'd like to, but I'm afraid that I just don't have the time
right now.  I would very much appreciate it if you could put it
together.  I'd be happy to review it.

Thanks,

Drew

Re: some questions on virtual machine bridging.

2012-05-30 Thread Andrew Gallatin


On 05/28/12 12:12, Luigi Rizzo wrote:

I am doing some experiments with implementing a software bridge
between virtual machines, using netmap as the communication API.

I have a first prototype up and running and it is quite fast (10 Mpps
with 60-byte frames, 4 Mpps with 1500 byte frames, compared to the
~500-800Kpps @60 bytes that you get with the tap interface used by
openvswitch or the native linux bridging).


That is awesome!


   - and of course, using PCI passthrough you get more or less hw speed
 (constrained by the OS), but need support from an external switch
 or the NIC itself to do forwarding between different ports.
   anything else ?


In terms of PCI passthrough / SR-IOV there are the emerging/competing
EVB and VEPA standards to allow VM<->VM communication to
go on the wire to a "real" switch, then back to the correct VM.


* any high-performance virtual switching solution around ?
   As mentioned, i have measured native linux bridging and in-kernel ovs
   and the numbers are above (not surprising; the tap involves a syscall
   on each packet if i am not mistaken, and internally you need a
   data copy)


You should probably compare to ESXi.  I've seen ~1Mpps going to or
from 1..N VMs and in or out a port on a 10GbE interface with ESX4
and newer.

Drew

Re: Major performance hit with ToS setting

2012-06-03 Thread Andrew Gallatin


On 06/03/12 01:18, Kevin Oberman wrote:


What can I say but that you are right. When I looked at the interface
stats I found that the link overflow drops were through the roof! This
confuses me a bit since the traffic is outbound and I would assume


Indeed, link overflow is incoming traffic that was dropped
due to lack of rx resources.  If you have flow control
disabled, the drops are simply due to lack of space in the
rx fifo.  If you have flow control enabled, link overflow
can include drops due to lack of host rx buffers as well.
For primarily WAN traffic, we suggest that flow control
be disabled (it is enabled by default).  With f/c disabled,
drops due to lack of rx buffers are counted as
dropped_no_[big|small]_buffer.

At any rate, it is surprising to see link overflow increase
on an outgoing unidirectional test.  Is there other
incoming traffic that you might not have been aware of?
The only really unlikely thing I can think of is if something
is buffering tens of thousands of acks and dumping them
all at once.

Drew

Re: [please review] TSO mbuf chain length limiting patch

2012-06-03 Thread Andrew Gallatin


On 06/03/12 12:51, Colin Percival wrote:

On 05/30/12 08:30, Andrew Gallatin wrote:

On 05/30/12 10:59, Colin Percival wrote:

The Xen virtual network interface has an issue (ok, really the issue is with
the linux back-end, but that's what most people are using) where it can't
handle scatter-gather writes with lots of pieces, aka. long mbuf chains.
This currently bites us hard with TSO enabled, since it produces said long
mbuf chains.


I think a better approach would be to have a limit on the size of the
pre-segmented TCP payload sent to the driver.  I tend to think
that this would be more generically useful, and it is a better match
for the NDIS APIs, where a driver must specify the max TSO size.  I
think the changes to the TCP stack might be simpler (e.g., they
would seem to jibe better with the existing "maxmtu" approach).

I think this could work for you as well.  You could set the Xen max
tso size to be 32K (derived from 18 pages/skb, multiplied by a typical
2KB mbuf size, with some slack built in).  If the chain was too large,
you could m_defrag it down to size.


I've attached a new patch which:
1. adds an IFCAP_TSO_MSS "capability" and an if_tx_tso_mss field to struct ifnet,
2. sets these in netfront when the IFCAP_TSO4 flag is set,
3. extends tcp_maxmtu to read this value,
4. adds a tx_tso_mss field to struct tcpcb,
5. makes tcp_mss_update set tx_tso_mss using tcp_maxmtu, and
6. limits TSO lengths to tx_tso_mss in tcp_output.



Looks good to me.  I don't pretend to understand the bowels of
the TCP stack, so I can't comment on the "sendalot" stuff to
force segmentation.  I assume it works as well as your
previous patch to solve the problems you were seeing with Xen?
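
As a rough user-space model of the last item in the list (capping each
TSO send at the per-interface limit), something like the sketch below is
what I have in mind.  It is not tcp_output code: next_burst() and
count_bursts() are invented names, and the loop merely stands in for
whatever mechanism the stack uses to come back for the remainder.

```c
#include <assert.h>
#include <stddef.h>

/* Length of the next TSO burst: whatever is pending, capped at tso_max. */
static size_t
next_burst(size_t pending, size_t tso_max)
{
	assert(tso_max > 0);
	return (pending < tso_max ? pending : tso_max);
}

/* Number of bursts needed to drain pending bytes under the cap. */
static size_t
count_bursts(size_t pending, size_t tso_max)
{
	size_t n = 0;

	while (pending > 0) {
		pending -= next_burst(pending, tso_max);
		n++;
	}
	return (n);
}
```

With a 32K cap, 100000 pending bytes would go down to the driver as
three full 32768-byte bursts followed by one short one.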

One minor nit is that I envision this limit to always be
64K or less, so you could probably get away with stealing
16 bits from ifcspare.

The other trivial nit is that I would have made these new
if_tx_tso_mss and t_tx_tso_mss fields unsigned.

Thanks so much for doing this!

Drew

Re: 'ifconfig tun0 destroy' gets stuck

2012-06-07 Thread Andrew Thompson

On 7 June 2012 19:08, Andriy Gapon  wrote:
>
> I experience a problem where vpnc can not exit cleanly and gets stuck.
> pstree shows this chain:
>  |-+= 31375 root vpnc
>  | \-+- 13412 root /bin/sh /usr/local/sbin/vpnc-script-custom
>  |   \--- 13446 root ifconfig tun0 destroy
>
> $ procstat -k 13446
>  PID    TID COMM             TDNAME           KSTACK
> 13446 102739 ifconfig         -                mi_switch sleepq_switch
> sleepq_wait _cv_wait_unlock tun_destroy tun_clone_destroy ifc_simple_destroy
> if_clone_destroyif if_clone_destroy ifioctl soo_ioctl kern_ioctl sys_ioctl
> amd64_syscall Xfast_syscall
>
> My system is FreeBSD 10.0-CURRENT amd64 r236503.
>
> I think that this started happening recently but I am not sure exactly when.
> Maybe after recent vpnc-scripts update or maybe after base system + kernel 
> update.

This means the tun device is still open; this behavior hasn't changed
in 3.5 years.

http://svnweb.freebsd.org/base?view=revision&revision=186391


Andrew

Re: Panic with if_bridge when removing components

2012-06-10 Thread Andrew Thompson

On 10 June 2012 02:27, Gustau Perez Querol  wrote:
>  Hi,
>
>  I'm seeing panics when removing an interface of a bridge. The system runs
> HEAD/AMD64 r236733. I see no changes to if_bridge.c in the last two days, so
> I would say the problem's still there. I also checked stable and the problem
> should be there too.
>
>  The problem is that I have a bridge composed of two ethernet interfaces, an
> ath interface and a tap. As soon as I remove any of them the system panics.
> Because the system runs openvpn with the tap connected to the bridge, when
> the system starts to reboot, the openvpn daemon removes the tap, which
> also causes the panic.
>
>  The panic comes because at sys/net/if_bridge.c:943 the struct
> *ifnet->if_bridge of the interface removed is set to NULL too early. Because
> of this, at sys/net/if_bridge.c:996 we call if_bridge.c:bridge_linkstate
> where the struct *ifnet->if_bridge is needed. This causes the panic.


I introduced this issue in r234487; please try this patch.
http://people.freebsd.org/~thompsa/bridge_link.diff

regards,
Andrew

Re: lagg speed trouble

2012-07-04 Thread Andrew Thompson

On 4 July 2012 23:30, Vyacheslav Kulikovskyy  wrote:
> I have a server with two 1G links (em) aggregated by lagg0.
>
> Above 1700 Mbit/s I get collisions/errors on the lagg0 port, but not on em0
> or em1.
>
> I'm using nginx in my own CDN, and the server is not limited by mbufs, irq, or
> anything else; only the lagg0 errors :(

> netstat -w 1 -I lagg0
>             input        (lagg0)           output
>    packets  errs idrops      bytes    packets  errs      bytes colls
>      87964     0      0    5474019      78172  1964      20549     0
>      88842     0      0    5533987      78852  1811  222578109     0
>      87687     0      0    5454717      77279  2416      86391     0
>      87995     0      0    5471653      78090  2040  223488046     0
>      88314     0      0    5493348      78495  1994  222548964     0
>      88411     0      0    5502818      78228  1949      14374     0
>
> How can I get full link speed on this server?

This probably means the packet could not be queued on the lagg
interface send queue.  Please try this patch.


Andrew


lagg_transmit.diff
Description: Binary data

Re: lagg speed trouble

2012-07-05 Thread Andrew Thompson

On 6 July 2012 04:43, Vyacheslav Kulikovskyy  wrote:
> 2012/7/4 Andrew Thompson 
>>
>> On 4 July 2012 23:30, Vyacheslav Kulikovskyy  wrote:
>> > I have a server with two 1G links (em) aggregated by lagg0.
>> >
>> > Above 1700 Mbit/s I get collisions/errors on the lagg0 port, but not on
>> > em0 or em1.
>> >
>> > I'm using nginx in my own CDN, and the server is not limited by mbufs, irq, or
>> > anything else; only the lagg0 errors :(
>>
>> > netstat -w 1 -I lagg0
>> >             input        (lagg0)           output
>> >    packets  errs idrops      bytes    packets  errs      bytes colls
>> >      87964     0      0    5474019      78172  1964      20549     0
>> >      88842     0      0    5533987      78852  1811  222578109     0
>> >      87687     0      0    5454717      77279  2416      86391     0
>> >      87995     0      0    5471653      78090  2040  223488046     0
>> >      88314     0      0    5493348      78495  1994  222548964     0
>> >      88411     0      0    5502818      78228  1949      14374     0
>> >
>> > How can I get full link speed on this server?
>>
>> This probably means the packet could not be queued on the lagg
>> interface send queue.  Please try this patch.
>>
> this patch don't help (, on switch errors not found.
>

Can you be more specific? Did the patch fail to work or was there no
change in the speed? I don't know what you mean by "on switch errors
not found".


Andrew

Re: lacp lagg port flags do not show correctly resulting in poor traffic distribution/performance

2012-07-10 Thread Andrew Boyer


On Jul 9, 2012, at 8:38 PM, Adarsh Joshi wrote:

> Hi,
> 
> I am trying to configure lacp lagg interfaces with 2 systems connected back 
> to back as follows:
> 
> ifconfig lagg0 create
> ifconfig lagg0 laggproto lacp laggport ql0 laggport ql1 192.168.100.1 netmask 255.255.255.0
> 
> Sometimes, the lag interface comes up correctly but sometimes the laggport 
> flags do not show properly. Instead of 1c, it 
> shows values of 18. I have seen similar issues reported on various forums 
> with no solution.
> Looking at the lagg driver code and reading the standard, I thought the 
> laggport flags ( defined in if_lagg.h) are based on the LACP_STATE_BITS in 
> file ieee8023ad_lacp.h. But the following ifconfig -v output does not make 
> any sense to me.
> 
> My concern is that when all the interfaces show flags as 1c, the traffic is 
> distributed across both the interfaces uniformly and I get aggregated 
> throughput. If not, the traffic flows only on 1 interface.
> 
> Is this a bug? How do I solve this? Or am I doing something wrong?
> 
> I am using FreeBSD 9.0 release.
> 
> System 1:
> # ifconfig -v lagg0
>lag id: [(8000,00-0E-1E-08-05-20,0213,,),
> (8000,00-0E-1E-04-2C-F0,0213,,)]
>laggport: ql1 flags=18 state=7D
>[(8000,00-0E-1E-08-05-20,0213,8000,000F),
> (,00-00-00-00-00-00,,,)]
>laggport: ql0 flags=1c state=3D
>[(8000,00-0E-1E-08-05-20,0213,8000,000E),
> (8000,00-0E-1E-04-2C-F0,0213,8000,000E)]
> 
> System 2:
> 
> # ifconfig -v lagg0
>lag id: [(8000,00-0E-1E-04-2C-F0,0213,,),
> (,00-00-00-00-00-00,,,)]
>laggport: ql1 flags=1c state=7D
>   [(8000,00-0E-1E-04-2C-F0,0213,8000,000F),
> (,00-00-00-00-00-00,,,)]
>laggport: ql0 flags=18 state=3D
>[(8000,00-0E-1E-04-2C-F0,0213,8000,000E),
> (8000,00-0E-1E-08-05-20,0213,8000,000E)]
> 
> 
> thanks
> Adarsh
> 


I don't think you have a port flags problem per se; the flags are correctly 
displaying the state of the lagg.  Your problem is that your systems aren't 
negotiating the correct lagg configuration.  Each tuple after the laggport 
represents the [(actor state),(partner state)].  Ports ql0 have been able to 
talk to their partners (each other).  Neither ql1 port has seen a response from 
a partner, though.
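
As a rough aid for reading the state= values in the ifconfig output,
here is a sketch that decodes the state byte, assuming it is the
standard LACP actor state with the bit layout from IEEE 802.3ad.  The
ST_* names and lacp_defaulted() are local to this example, not the
kernel's identifiers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Actor/partner state bits as laid out in IEEE 802.3ad. */
#define ST_ACTIVITY	0x01
#define ST_TIMEOUT	0x02
#define ST_AGGREGATION	0x04
#define ST_SYNC		0x08
#define ST_COLLECTING	0x10
#define ST_DISTRIBUTING	0x20
#define ST_DEFAULTED	0x40
#define ST_EXPIRED	0x80

/*
 * True when the port is running on default partner information, i.e.
 * it has not received a LACPDU from a partner.
 */
static bool
lacp_defaulted(uint8_t state)
{
	return ((state & ST_DEFAULTED) != 0);
}
```

By that reading, state=7D on ql1 differs from state=3D on ql0 only in
the Defaulted bit, which fits a port that has never heard from its
partner.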

You could try restarting the state machine on one box with 'ifconfig lagg0
laggproto lacp'.  To see the negotiation you'll need to rebuild your kernel
with '#define LACP_DEBUG 1' added to the top of sys/net/ieee8023ad_lacp.c.  Or
upgrade to a newer stable snapshot that has the net.lacp_debug sysctl and turn
it on.

Or just turn off LACP.  What does it get you in this configuration?

Hope this helps,
  Andrew

--
Andrew Boyer    abo...@averesystems.com





Re: Interface MTU question...

2012-07-12 Thread Andrew Boyer

On Jul 12, 2012, at 12:55 PM, Jason Hellenthal wrote:
> Something else to look into ... 
> 
> # ifconfig lagg0 mtu 1492
> ifconfig: ioctl (set mtu): Invalid argument
> 
> This is on stable/8 r238264 when the interface was up/up and down/down
> 
> Also attempted on the member interfaces dc0 and dc1


It's disabled by default, but I don't know why.  This seems to work for us.

-Andrew

Index: sys/net/if_lagg.c
===
--- sys/net/if_lagg.c   (revision 238402)
+++ sys/net/if_lagg.c   (working copy)
@@ -752,8 +752,18 @@
 		break;
 
 	case SIOCSIFMTU:
-		/* Do not allow the MTU to be changed once joined */
-		error = EINVAL;
+		LAGG_WLOCK(sc);
+		SLIST_FOREACH(lp, &sc->sc_ports, lp_entries) {
+			if (!error) {
+				/* Call the base ioctl for each port */
+				error = (*lp->lp_ioctl)(lp->lp_ifp, cmd, data);
+			}
+		}
+		if (!error) {
+			/* Update the aggregate MTU */
+			sc->sc_ifp->if_mtu = ifr->ifr_mtu;
+		}
+		LAGG_WUNLOCK(sc);
 		break;
 
 	default:

--
Andrew Boyer    abo...@averesystems.com





Re: lacp lagg port flags do not show correctly resulting in poor traffic distribution/performance

2012-07-27 Thread Andrew Boyer

Adarsh,
Sorry for the delay.

I'm not an LACP protocol expert, but looking at your logs I don't see ql1 on 
either node receiving a lacpdu response.  Are you certain that link is working?

-Andrew

On Jul 10, 2012, at 1:53 PM, Adarsh Joshi wrote:

> Andrew,
> 
> Here are the logs with LACP_DEBUG defined in ieee8023ad_lacp.c,
> 
> after typing
> 
> ifconfig lagg0 create
> ifconfig lagg0 laggproto lacp laggport ql0 laggport ql1 192.168.100.1 netmask 255.255.255.0
> 
> I compiled it as a standalone driver by the way.
> 
> System 1:
> 
> # ifconfig -v lagg0
> lagg0: flags=8843 metric 0 mtu 1500
>options=13b
>ether 00:0e:1e:08:05:20
>inet 192.168.100.1 netmask 0xff00 broadcast 192.168.100.255
>nd6 options=29
>media: Ethernet autoselect
>status: active
>groups: lagg
>laggproto lacp
>lag id: [(8000,00-0E-1E-08-05-20,01D3,,),
> (8000,00-0E-1E-04-2C-F0,0213,,)]
>laggport: ql1 flags=18 state=7D
>[(8000,00-0E-1E-08-05-20,01D3,8000,000D),
> (,00-00-00-00-00-00,,,)]
>laggport: ql0 flags=1c state=3D
>[(8000,00-0E-1E-08-05-20,01D3,8000,000C),
> (8000,00-0E-1E-04-2C-F0,0213,8000,000E)]
> 
> 
> System 2:
> 
> # ifconfig -v lagg0
> lagg0: flags=8843 metric 0 mtu 1500
>options=13b
>ether 00:0e:1e:04:2c:f0
>inet 192.168.100.2 netmask 0xff00 broadcast 192.168.100.255
>nd6 options=29
>media: Ethernet autoselect
>status: active
>groups: lagg
>laggproto lacp
>lag id: [(8000,00-0E-1E-04-2C-F0,0213,,),
> (,00-00-00-00-00-00,,,)]
>laggport: ql1 flags=1c state=7D
>[(8000,00-0E-1E-04-2C-F0,0213,8000,000F),
> (,00-00-00-00-00-00,,,)]
>laggport: ql0 flags=18 state=3D
>[(8000,00-0E-1E-04-2C-F0,0213,8000,000E),
> (8000,00-0E-1E-08-05-20,01D3,8000,000C)]
> 
> 
> System 1 logs :
> 
> Jul 10 10:38:49 bsd-14 kernel: lacp_attach[738] : lacp attached
> Jul 10 10:38:49 bsd-14 kernel: lacp_attach[740] : lacp_defined
> Jul 10 10:38:49 bsd-14 kernel: lagg0: link state changed to UP
> Jul 10 10:38:49 bsd-14 kernel: ql0: media changed 0x0 -> 0x100033, ether = 1, 
> fdx = 1, link = 1
> Jul 10 10:38:49 bsd-14 kernel: ql0: -> UNSELECTED
> Jul 10 10:38:49 bsd-14 kernel: ql1: media changed 0x0 -> 0x100033, ether = 1, 
> fdx = 1, link = 1
> Jul 10 10:38:49 bsd-14 kernel: ql1: -> UNSELECTED
> Jul 10 10:38:49 bsd-14 kernel: lacp_select_tx_port: no active aggregator
> Jul 10 10:38:50 bsd-14 kernel: ql1: port 
> lagid=[(8000,00-0E-1E-08-05-20,01D3,8000,000D),(,00-00-00-00-00-00,,,)]
> Jul 10 10:38:50 bsd-14 kernel: ql1: aggregator created
> Jul 10 10:38:50 bsd-14 kernel: ql1: aggregator 
> lagid=[(8000,00-0E-1E-08-05-20,01D3,,),(,00-00-00-00-00-00,,,)]
> Jul 10 10:38:50 bsd-14 kernel: ql1: mux_state 0 -> 1
> Jul 10 10:38:50 bsd-14 kernel: ql0: port 
> lagid=[(8000,00-0E-1E-08-05-20,01D3,8000,000C),(,00-00-00-00-00-00,,,)]
> Jul 10 10:38:50 bsd-14 kernel: ql0: aggregator created
> Jul 10 10:38:50 bsd-14 kernel: ql0: aggregator 
> lagid=[(8000,00-0E-1E-08-05-20,01D3,,),(,00-00-00-00-00-00,,,)]
> Jul 10 10:38:50 bsd-14 kernel: ql0: mux_state 0 -> 1
> Jul 10 10:38:51 bsd-14 kernel: ql1: lacpdu transmit
> Jul 10 10:38:51 bsd-14 kernel: actor=(8000,00-0E-1E-08-05-20,01D3,8000,000D)
> Jul 10 10:38:51 bsd-14 kernel: actor.state=85
> Jul 10 10:38:51 bsd-14 kernel: partner=(,00-00-00-00-00-00,,,)
> Jul 10 10:38:51 bsd-14 kernel: partner.state=2
> Jul 10 10:38:51 bsd-14 kernel: maxdelay=0
> Jul 10 10:38:51 bsd-14 kernel: ql0: lacpdu transmit
> Jul 10 10:38:51 bsd-14 kernel: actor=(8000,00-0E-1E-08-05-20,01D3,8000,000C)
> Jul 10 10:38:51 bsd-14 kernel: actor.state=85
> Jul 10 10:38:51 bsd-14 kernel: partner=(,00-00-00-00-00-00,,,)
> Jul 10 10:38:51 bsd-14 kernel: partner.state=2
> Jul 10 10:38:51 bsd-14 kernel: maxdelay=0
> Jul 10 10:38:51 bsd-14 kernel: ql0: lacpdu receive
> Jul 10 10:38:51 bsd-14 kernel: actor=(8000,00-0E-1E-04-2C-F0,0213,8000,000E)
> Jul 10 10:38:51 bsd-14 kernel: actor.state=5
> Jul 10 10:38:51 bsd-14 kernel: partner=(8000,00-0E-1E-08-05-20,01D3,8000,000C)
> Jul 10 10:38:51 bsd-14 kernel: partner.state=85
> Jul 10 10:38:51 bsd-14 kernel: maxdelay=0
> Jul 10 10:38:51 bsd-14 kernel: ql0: old pstate 2
> Jul 10 10:38:51 bsd-14 kernel: ql0: new pstate 5
> Jul 10 10:38:51 bsd-14 kernel: ql0: partner timeout c

[patch] ixgbe stats cleanup

2012-08-06 Thread Andrew Boyer

This patch fixes some nits in the ixgbe driver statistics:
 - Only read FCCRC and FCLAST on 82599+
 - Store total_missed_rx in stats.mpctotal, and display it in a sysctl
 - Don't increment if_opackets and if_ipackets every packet; they're
   overwritten by hw stats collection
 - Increment adapter->dropped_pkts instead of if_ierrors; if_ierrors is
   overwritten by hw stats collection
 - Include adapter->dropped_pkts in the calculation of if_ierrors
 - Increment rxr->packets so that AIM works

Comments welcome.

-Andrew



ixgbe_stats.diff
Description: Binary data


--
Andrew Boyer    abo...@averesystems.com





[patch] e1000 stats cleanup

2012-08-08 Thread Andrew Boyer

This patch fixes a nit in the em, lem, and igb driver statistics, similar to 
what I proposed for ixgbe a few days ago.  Increment adapter->dropped_pkts 
instead of if_ierrors because if_ierrors is overwritten by hw stats collection.

Comments welcome.

-Andrew



e1000_stats.diff
Description: Binary data


--
Andrew Boyer    abo...@averesystems.com





[patch] e1000 / lem handling of TSO defragmentation

2012-08-08 Thread Andrew Boyer

Similar to what was done for em in r220254, this patch improves the error 
handling in lem when a TSO packet has too many segments to DMA.  This improves 
the behavior when either call to bus_dmamap_load_mbuf_sg() returns ENOMEM.  
Although we don't need to go back and redo any offload calculation in lem, 
doing it this way reduces code duplication.

Comments welcome.

-Andrew



e1000_tso_defrag.diff
Description: Binary data


--
Andrew Boyer    abo...@averesystems.com





Re: [CFT] if_transmit method for lagg(4)

2012-09-20 Thread Andrew Thompson

On 20 September 2012 19:47, Gleb Smirnoff  wrote:
>   Hi,
>
>   Yet another patch to test. It was surprising to me that lagg(4), which
> aims at high performance, still utilizes if_start.
>
>   Attached is a patch that converts lagg(4) to use if_transmit. I'd
> appreciate it if someone who does use lagg(4) tests the patch. If anyone
> benchmarks lagg(4) with and w/o the patch, that will be most appreciated.

Sean Bruno has already tested this patch at Yahoo; I have just been
delayed in committing it. There are just a few small differences, so we
can commit one or merge.


Andrew


lagg_transmit.diff
Description: Binary data

Re: [CFT] if_transmit method for lagg(4)

2012-09-20 Thread Andrew Thompson

On 20 September 2012 20:48, Gleb Smirnoff  wrote:
>   Hi!
>
> On Thu, Sep 20, 2012 at 08:37:19PM +1200, Andrew Thompson wrote:
> A> >   Yet another patch to test. It was surprising to me that lagg(4), which
> A> > aims at high performance, still utilizes if_start.
> A> >
> A> >   Attached is a patch that converts lagg(4) to use if_transmit. I'd
> A> > appreciate it if someone who does use lagg(4) tests the patch. If anyone
> A> > benchmarks lagg(4) with and w/o the patch, that will be most appreciated.
> A>
> A> Sean Bruno has already tested this patch at Yahoo, I have just been
> A> delayed in committing it. There are just a few small differences so we
> A> can commit one or merge.
>
> Also fabient@ replied to me in private with this patch :)
>
> Hmm, I've missed the stats update. I have merged the statistics updates from
> your patch into mine, and here it is attached.

Looks good, please commit.

There is just a stray `i` here

+static int lagg_transmit(struct ifnet *i, struct mbuf *);


Andrew

Re: kern/172113: [panic] [e1000] [patch] 9.1-RC1/amd64 panices in igb(4): m_getjcl: invalid cluster type

2012-10-24 Thread Andrew Filonov

The following reply was made to PR kern/172113; it has been noted by GNATS.

From: Andrew Filonov 
To: bug-follo...@freebsd.org, egrosb...@rdtc.ru
Cc: j...@freebsd.org
Subject: Re: kern/172113: [panic] [e1000] [patch] 9.1-RC1/amd64 panices in
 igb(4): m_getjcl: invalid cluster type
Date: Wed, 24 Oct 2012 14:49:41 +0400

 --000e0ce03f94006e2c04cccbd81d
 Content-Type: multipart/alternative; boundary=000e0ce03f94006e2504cccbd81b
 
 --000e0ce03f94006e2504cccbd81b
 Content-Type: text/plain; charset=ISO-8859-1
 
 We have the same problem with HP DL160g6 servers with RELENG8.
 
 The simplest workaround for me is
 if_igb_load="YES" in /boot/loader.conf
 
 Patch, corrected for 8.3, included.
 
 --000e0ce03f94006e2504cccbd81b--
 --000e0ce03f94006e2c04cccbd81d
 Content-Type: text/plain; charset=US-ASCII; name="igb-path-8.txt"
 Content-Disposition: attachment; filename="igb-path-8.txt"
 Content-Transfer-Encoding: base64
 X-Attachment-Id: f_h8obei320
 
 LS0tIHN5cy9kZXYvZTEwMDAvaWZfaWdiLmMub3JpZwkyMDEyLTEwLTA5IDIyOjAyOjA1LjAwMDAw
 MDAwMCArMDQwMAorKysgc3lzL2Rldi9lMTAwMC9pZl9pZ2IuYwkyMDEyLTEwLTI0IDE0OjMzOjEz
 LjAwMDAwMDAwMCArMDQwMApAQCAtMTMwNSw5ICsxMzA1LDYgQEAKIAkvKiBEb24ndCBsb3NlIHBy
 b21pc2N1b3VzIHNldHRpbmdzICovCiAJaWdiX3NldF9wcm9taXNjKGFkYXB0ZXIpOwogCi0JaWZw
 LT5pZl9kcnZfZmxhZ3MgfD0gSUZGX0RSVl9SVU5OSU5HOwotCWlmcC0+aWZfZHJ2X2ZsYWdzICY9
 IH5JRkZfRFJWX09BQ1RJVkU7Ci0KIAljYWxsb3V0X3Jlc2V0KCZhZGFwdGVyLT50aW1lciwgaHos
 IGlnYl9sb2NhbF90aW1lciwgYWRhcHRlcik7CiAJZTEwMDBfY2xlYXJfaHdfY250cnNfYmFzZV9n
 ZW5lcmljKCZhZGFwdGVyLT5odyk7CiAKQEAgLTEzMzMsNiArMTMzMCw5IEBACiAJLyogU2V0IEVu
 ZXJneSBFZmZpY2llbnQgRXRoZXJuZXQgKi8KIAogCWUxMDAwX3NldF9lZWVfaTM1MCgmYWRhcHRl
 ci0+aHcpOworCisJaWZwLT5pZl9kcnZfZmxhZ3MgfD0gSUZGX0RSVl9SVU5OSU5HOworCWlmcC0+
 aWZfZHJ2X2ZsYWdzICY9IH5JRkZfRFJWX09BQ1RJVkU7CiB9CiAKIHN0YXRpYyB2b2lkCkBAIC0x
 NTQ3LDYgKzE1NDcsMTEgQEAKIAlFMTAwMF9XUklURV9SRUcoJmFkYXB0ZXItPmh3LCBFMTAwMF9F
 SU1DLCBxdWUtPmVpbXMpOwogCSsrcXVlLT5pcnFzOwogCisJaWYgKCEoYWRhcHRlci0+aWZwLT5p
 Zl9kcnZfZmxhZ3MgJiBJRkZfRFJWX1JVTk5JTkcpKSB7CisJCXJldHVybjsKKwl9CisJbW9yZV9y
 eCA9IGlnYl9yeGVvZihxdWUsIGFkYXB0ZXItPnJ4X3Byb2Nlc3NfbGltaXQsIE5VTEwpOworCiAJ
 SUdCX1RYX0xPQ0sodHhyKTsKIAlpZ2JfdHhlb2YodHhyKTsKICNpZiBfX0ZyZWVCU0RfdmVyc2lv
 biA+PSA4MDAwMDAKQEAgLTE1NjAsOCArMTU2NSw2IEBACiAjZW5kaWYKIAlJR0JfVFhfVU5MT0NL
 KHR4cik7CiAKLQltb3JlX3J4ID0gaWdiX3J4ZW9mKHF1ZSwgYWRhcHRlci0+cnhfcHJvY2Vzc19s
 aW1pdCwgTlVMTCk7Ci0KIAlpZiAoYWRhcHRlci0+ZW5hYmxlX2FpbSA9PSBGQUxTRSkKIAkJZ290
 byBub19jYWxjOwogCS8qCg==
 --000e0ce03f94006e2c04cccbd81d--
___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"

Re: kern/150247: [patch] [ixgbe] Version in -current won't build on 7.x systems

2011-01-07 Thread Andrew Boyer

The following reply was made to PR kern/150247; it has been noted by GNATS.

From: Andrew Boyer 
To: bug-follo...@freebsd.org,
 Andrew Boyer 
Cc:  
Subject: Re: kern/150247: [patch] [ixgbe] Version in -current won't build on 
7.x systems
Date: Fri, 7 Jan 2011 13:36:15 -0500

 The problem has spread to the new file ixv.h:
 
 --- ixv.h  2010-11-26 17:46:32.0 -0500
 +++ ixv.h  2011-01-07 13:08:45.0 -0500
 @@ -175,7 +175,11 @@
  #define VFTA_SIZE 128
  
  /* Offload bits in mbuf flag */
 +#if __FreeBSD_version >= 80
  #define CSUM_OFFLOAD  (CSUM_IP|CSUM_TCP|CSUM_UDP|CSUM_SCTP)
 +#else
 +#define CSUM_OFFLOAD  (CSUM_IP|CSUM_TCP|CSUM_UDP)
 +#endif
  
  /*
   *****
 @@ -400,7 +404,7 @@
  #define IXV_TX_LOCK_ASSERT(_sc) mtx_assert(&(_sc)->tx_mtx, MA_OWNED)
  
  /* Workaround to make 8.0 buildable */
 -#if __FreeBSD_version < 800504
 +#if __FreeBSD_version >= 80 && __FreeBSD_version < 800504
  static __inline int
  drbr_needs_enqueue(struct ifnet *ifp, struct buf_ring *br)
  {
 
 
 
 
 
 

Check in small patches for ixgbe?

2011-01-07 Thread Andrew Boyer

Would someone please check in the patches I submitted under these PRs?

kern/150247: [patch] [ixgbe] Version in -current won't build on 7.x systems
kern/153772: [ixgbe] [patch] sysctls reference wrong XON/XOFF variables

It should only take a minute and I think they're noncontroversial...

Thank you,
  Andrew

------
Andrew Boyerabo...@averesystems.com





Re: 8.2-PRERELEASE: if_bridge ARP and broadcasts issues

2011-01-25 Thread Andrew Thompson

On 26 January 2011 02:32, Alexander Zagrebin  wrote:
> Hi!
>
> I've found some issues with the if_bridge on 8.2-PRERELEASE.
>
> 1. An ARP issue
>
> Suppose we have a box with the 4 interfaces: nic0, nic1, nic2, nic3.
> The interfaces are linked pairwise using 2 bridge(4) interfaces: bridge0
> and bridge1. Only nic0 has an IP address assigned (for example,
> 192.168.0.1/24).
> So we have configuration like this:
>
>  192.168.0.1
> ---nic0---+       +---nic2---
>          |       |
>       bridge0 bridge1
>          |       |
> ---nic1---+       +---nic3---
>
> The problem: when ARP query about MAC address of 192.168.0.1 is received
> on the nic2 or nic3, then system responds with the MAC address of the nic0,
> though networks on the bridge0 and bridge1 are completely independent.
> IMHO, it isn't correct.
>
> The reason is in ARP handling code: it looks for an address of the interface
> belonging to a bridge, but there is not check that a bridge is the same.
>
> Attached patch (patch-if_ether.c) fixes the issue.

I have committed this, thanks.
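
For anyone following along, the idea behind the fix can be sketched in a few lines of userspace C. This is only an illustration of the "same bridge" check described above, not the committed change to if_ether.c: struct nic, its bridge field and arp_may_answer() are hypothetical names made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct ifnet; not the kernel layout. */
struct nic {
	const char *name;
	const void *bridge;	/* bridge this NIC is a member of, or NULL */
};

/*
 * Decide whether an ARP request received on `rcv` may be answered with the
 * address configured on `owner`.
 */
static bool
arp_may_answer(const struct nic *rcv, const struct nic *owner)
{
	assert(rcv != NULL && owner != NULL);
	if (rcv == owner)
		return true;
	/* Different interfaces: only acceptable when both sit on the same bridge. */
	return rcv->bridge != NULL && rcv->bridge == owner->bridge;
}
```

With nic0 and nic1 on bridge0 and nic2 on bridge1, a request arriving on nic1 for nic0's address is answered, while the same request arriving on nic2 is not.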

Re: kern/150247: [patch] [ixgbe] Version in -current won't build on 7.x systems

2011-02-23 Thread Andrew Boyer

While I understand that's generally the case, it's how Intel supports older 
releases.  The code in -current does support building within older releases (at 
least 7.X and 8.X), except for the flaws I pointed out.

Jack Vogel submitted r217129, r217131, and r217132 to fix this PR, but didn't 
mark it closed.

-Andrew

On Feb 21, 2011, at 5:10 AM, Bruce Cran wrote:

> The following reply was made to PR kern/150247; it has been noted by GNATS.
> 
> From: Bruce Cran 
> To: bug-follo...@freebsd.org,
> abo...@averesystems.com
> Cc:  
> Subject: Re: kern/150247: [patch] [ixgbe] Version in -current won't build on 
> 7.x systems
> Date: Mon, 21 Feb 2011 10:05:27 +
> 
> We don't support building drivers from -CURRENT within the environment for an 
> older release. I think the ixgbe driver would need to be backported to 7-
> STABLE.
> 
> -- 
> Bruce Cran
> ___
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"

--
Andrew Boyerabo...@averesystems.com





Re: panic: bufwrite: buffer is not busy???

2011-04-11 Thread Andrew Boyer

Thank you for the response.  This is slightly off-topic for the freebsd-net 
list, but I was hoping your experience might help me with 
http://www.freebsd.org/cgi/query-pr.cgi?pr=155421.

You can look up commits by number at 
http://svn.freebsd.org/viewvc/base?view=revision&revision=220257, etc.  220257 
is the first commit on 2011-04-02, and 220507 is the last commit on 2011-04-09. 
 I don't see any smoking guns in-between, though.

-Andrew

On Apr 11, 2011, at 1:07 PM, Flávio wrote:

> On Mon, Apr 11, 2011 at 11:18 AM, Andrew Boyer  
> wrote:
>> Would you please elaborate on this?  Is that Feb-04-2010 or Apr-02-2010?  Do 
>> you have the before and after commit numbers?
>> 
>> Thank you,
>>  Andrew
>> 
>> On Apr 11, 2011, at 12:37 AM, Flávio wrote:
>> 
>>> Some patch between 02/04 and 09/09 fixed core dumps so now I can provide 
>>> real info on those panics:
>> 
>> --
>> Andrew Boyerabo...@averesystems.com
>> 
>> 
> 
> Sorry, it was a typo.
> 
> In 2011-04-02 I synced some of my servers with HEAD (yes, I know I
> should not use HEAD in a production environment, but I have some 7.2
> servers doing fine, which allow me to test development versions) and
> rebuild world but then they started to panic every other day and would
> always hang while dumping core
> 
> In 2011-04-09 I resynced and updated my kernel. Yesterday one my
> servers with HEAD panicked and dumped core normally, so I assume there
> was some patch in that week which fixed the issue.
> 
> Where can I see those commit numbers?
> ___
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"

--
Andrew Boyerabo...@averesystems.com





Re: Intel ix (X520) disconnects when manipulating ips?

2011-04-22 Thread Andrew Boyer

Hello Steve and Jack,
You need to handle the SIOCSIFADDR ioctl or it gets passed up the stack to 
ether_ioctl().  When it goes up the interface gets reset.  See the comments in 
em_ioctl() and igb_ioctl().

We fixed this in ixgbe in our internal tree and it seems to work fine with 
82598 and 82599.  You also need to include opt_inet.h for the INET #define to 
be valid.
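
Roughly, the pattern looks like the following userspace model. It is a sketch of the idea only, not the ixgbe change from our tree: struct fake_if, the CMD_* codes and the fake_* functions are invented for illustration, and the real command values come from <sys/sockio.h>.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative command codes; stand-ins for the real ioctl numbers. */
enum cmd { CMD_SIOCSIFADDR, CMD_OTHER };

struct fake_if {
	bool running;	/* models IFF_DRV_RUNNING */
	int  resets;	/* how many times the hardware was re-initialised */
};

static void
fake_init(struct fake_if *ifp)
{
	ifp->resets++;
	ifp->running = true;
}

/* Models the generic path: setting an address always re-initialises. */
static int
fake_ether_ioctl(struct fake_if *ifp, enum cmd c)
{
	if (c == CMD_SIOCSIFADDR)
		fake_init(ifp);
	return 0;
}

/* Driver ioctl that handles the address change itself. */
static int
fake_drv_ioctl(struct fake_if *ifp, enum cmd c)
{
	assert(ifp != NULL);
	switch (c) {
	case CMD_SIOCSIFADDR:
		if (!ifp->running)
			fake_init(ifp);	/* first address: bring the hardware up once */
		return 0;		/* already running: no reset, no link flap */
	default:
		return fake_ether_ioctl(ifp, c);
	}
}
```

Adding an alias to an interface that is already running then goes through fake_drv_ioctl() without touching resets, whereas the generic path bumps it every time.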

-Andrew

On Apr 22, 2011, at 7:06 PM, Steven Hartland wrote:

> Just double checked on igb1 on the same machine, adding an alias causes
> no loss in network from the primary or existing ip aliases for the nic.
> 
> So this should be eliminating most variables except the driver?
> 
>   Regards
>   Steve
> 
> - Original Message - From: "Jack Vogel" 
> To: "Steven Hartland" 
> Cc: ; "Vogel, Jack" 
> Sent: Friday, April 22, 2011 11:35 PM
> Subject: Re: Intel ix (X520) disconnects when manipulating ips?
> 
> 
>> OK, did some testing, this re-init with link transition will happen on both
>> the 1G
>> drivers as well as ixgbe, its due to the stack/ioctl behavior when  you do
>> the
>> ifconfig.
>> So, what are you comparing this to that DOESN'T do this?? If this were to
>> be kept from happening I'm not sure where the responsible code would be
>> but I'm pretty sure its not in the driver :)
> 
> 
> 
> This e.mail is private and confidential between Multiplay (UK) Ltd. and the 
> person or entity to whom it is addressed. In the event of misdirection, the 
> recipient is prohibited from using, copying, printing or otherwise 
> disseminating it or any information contained in it. 
> In the event of misdirection, illegible or incomplete transmission please 
> telephone +44 845 868 1337
> or return the E.mail to postmas...@multiplay.co.uk.
> 
> ___
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"

--
Andrew Boyerabo...@averesystems.com





Re: Intel ix (X520) disconnects when manipulating ips?

2011-04-25 Thread Andrew Boyer

On Apr 23, 2011, at 12:50 AM, Julian Elischer wrote:

> On 4/22/11 5:08 PM, Andrew Boyer wrote:
>> Hello Steve and Jack,
>> You need to handle the SIOCSIFADDR ioctl or it gets passed up the stack to 
>> ether_ioctl().  When it goes up the interface gets reset.  See the comments 
>> in em_ioctl() and igb_ioctl().
>> 
>> We fixed this in ixgbe in our internal tree and it seems to work fine with 
>> 82598 and 82599.  You also need to include opt_inet.h for the INET #define 
>> to be valid.
> 
> so, what else have you fixed?   :-)
> 
>> -Andrew
>> 


Here's a list of suggested improvements that we could give back.  I try to only 
bother Jack if a problem results in a hang or other serious issue.

- Add tunables for LRO and HWRSC
- Add code in ixgbe_attach() to detect 'disabled' hint (call 
resource_disabled()) [also useful in e1000!]
- Add code in ixgbe_attach() to handle IXGBE_ERR_SFP_NOT_PRESENT returned by 
ixgbe_init_hw() (set sfp_probe to TRUE)
- Add VLAN_HWTSO support when handling SIOCSIFCAP
- Call ixgbe_disable_queue() at the beginning of ixgbe_msix_que()
- Add code to ixgbe_local_timer() to first call ixgbe_txeof() on every queue 
before checking the queue status
- Rework ixgbe_config_link(); if sfp is TRUE, I think it should just schedule 
mod_task and let it handle the msf case
- Add locking and PHY type detection to ixgbe_handle_link() (to support 
copper->optical and optical->copper transitions)
- Add locking to ixgbe_handle_mod(), detect the PHY type, and only schedule 
msf_task if multispeed_fiber is true
- Add locking to ixgbe_handle_msf()

I could provide patches for any item that people are interested in testing / 
incorporating.  Of course Jack still gets the final say on what goes into ixgbe.

-Andrew

--
Andrew Boyerabo...@averesystems.com





Re: ixgbe> vlan addition and removal brings the interfaces down and up

2011-05-19 Thread Andrew Boyer

I have a patch that will fix this.  Please give me a little while to clean it 
up, and I will send it out on the list.

-Andrew

On May 19, 2011, at 2:58 AM, Igor Anishchuk wrote:

> Hi All,
> 
> I've been using Intel E10G42AFDA 10Gbit/s AF DA Dual Port adapters
> with direct attach cables and there is one thing keeps bothering me.
> I've been searching the Internet for any information with no luck. I
> would also assume that the problem is widely known, and I found one
> related PR kern/141285 but that one was closed unsolved.
> 
> When a VLAN interface is added or removed to from the ix interfaces
> the parent interface is briefly brought down and up. This event is
> visible for all applications and the switches. With my use case I add
> and remove VLAN interfaces on the fly and the described behavior
> causes undesired effects, especially for BGP daemons that are
> configured to monitor one of permanent VLAN interfaces.
> 
> I use FreeBSD 7-STABLE and the behavior is the same with stock
> drivers, with 2.2.3 and with 2.3.8 drivers downloaded from Intel web
> site. I have attempted to disable -vlanhwtag, -vlanhwfilter and
> -vlanhwtso with no effect.
> 
> Could someone help me to stop the cards behaving this way? I do not
> mind some performance penalties, nor running in permanent promiscuous
> mode. I just want the card to stay up all the time regardless of the
> vlan interfaces attached to it.
> 
> Any help, links, patches are much appreciated.
> 
> Regards,
> 
> Igor Anishchuk
> ___
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"

--
Andrew Boyerabo...@averesystems.com





Re: ixgbe> vlan addition and removal brings the interfaces down and up

2011-06-28 Thread Andrew Boyer

Hello Igor,
Sorry for the delay.  I'm a little hesitant to share our ixgbe patch to change 
this behavior because Jack has checked in changes to igb that make me think 
that our change is not correct.  Or, at least, that he's probably working on 
fixing ixgbe the right way.  Jack, are you planning to copy the reorganization 
of igb_setup_vlan_hw_support() over to ixgbe_setup_vlan_hw_support?

-Andrew

On Jun 28, 2011, at 4:02 PM, Igor Anishchuk wrote:

> Hi Andrew,
> 
> could you please share the patch as I'm dying with this problem.
> 
> What makes it worse is that on a busy router the DOWN/UP of the
> interfaces causes the ixgbe card to lose all network access until the
> box is rebooted. I can reproduce it easily on a variety of hosts from
> both HP and Dell. Therefore a patch that would not cause the card to
> reset would help a lot.
> 
> -- Igor
> 
> On Thu, May 19, 2011 at 8:58 PM, Andrew Boyer  wrote:
>> I have a patch that will fix this.  Please give me a little while to clean 
>> it up, and I will send it out on the list.
>> 
>> -Andrew
>> 
>> On May 19, 2011, at 2:58 AM, Igor Anishchuk wrote:
>> 
>>> Hi All,
>>> 
>>> I've been using Intel E10G42AFDA 10Gbit/s AF DA Dual Port adapters
>>> with direct attach cables and there is one thing keeps bothering me.
>>> I've been searching the Internet for any information with no luck. I
>>> would also assume that the problem is widely known, and I found one
>>> related PR kern/141285 but that one was closed unsolved.
>>> 
>>> When a VLAN interface is added or removed to from the ix interfaces
>>> the parent interface is briefly brought down and up. This event is
>>> visible for all applications and the switches. With my use case I add
>>> and remove VLAN interfaces on the fly and the described behavior
>>> causes undesired effects, especially for BGP daemons that are
>>> configured to monitor one of permanent VLAN interfaces.
>>> 
>>> I use FreeBSD 7-STABLE and the behavior is the same with stock
>>> drivers, with 2.2.3 and with 2.3.8 drivers downloaded from Intel web
>>> site. I have attempted to disable -vlanhwtag, -vlanhwfilter and
>>> -vlanhwtso with no effect.
>>> 
>>> Could someone help me to stop the cards behaving this way? I do not
>>> mind some performance penalties, nor running in permanent promiscuous
>>> mode. I just want the card to stay up all the time regardless of the
>>> vlan interfaces attached to it.
>>> 
>>> Any help, links, patches are much appreciated.
>>> 
>>> Regards,
>>> 
>>> Igor Anishchuk
>>> _______
>>> freebsd-net@freebsd.org mailing list
>>> http://lists.freebsd.org/mailman/listinfo/freebsd-net
>>> To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"
>> 
>> --
>> Andrew Boyerabo...@averesystems.com
>> 
>> 
>> 
>> 
>> 

--
Andrew Boyerabo...@averesystems.com





Fwd: kern/156978: [lagg][patch] Take lagg rlock before checking flags

2011-07-06 Thread Andrew Boyer

Can someone please review this and check in the patch?  (And MFC it to 
stable/8?)

Thank you,
  Andrew

Begin forwarded message:

> From: lini...@freebsd.org
> Date: May 12, 2011 10:36:26 AM EDT
> To: lini...@freebsd.org, freebsd-b...@freebsd.org, freebsd-net@FreeBSD.org
> Subject: Re: kern/156978: [lagg][patch] Take lagg rlock before checking flags
> 
> Synopsis: [lagg][patch] Take lagg rlock before checking flags
> 
> Responsible-Changed-From-To: freebsd-bugs->freebsd-net
> Responsible-Changed-By: linimon
> Responsible-Changed-When: Thu May 12 14:36:20 UTC 2011
> Responsible-Changed-Why: 
> Over to maintainer(s).
> 
> http://www.freebsd.org/cgi/query-pr.cgi?pr=156978
> ___
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"

--
Andrew Boyerabo...@averesystems.com





MFC Re: soreceive_stream: issues with O_NONBLOCK

2011-07-11 Thread Andrew Boyer

On Jul 8, 2011, at 6:51 AM, Andre Oppermann wrote:

> On 07.07.2011 21:24, Mikolaj Golub wrote:
>> 
>> On Thu, 07 Jul 2011 12:47:15 +0200 Andre Oppermann wrote:
>> 
>>  AO>  Please try this patch:
>>  AO>   http://people.freebsd.org/~andre/soreceive_stream.diff-20110707
>> 
>> It works for me. No issues detected so far. Thanks.
> 
> Committed in r223863. Many thanks for testing!
> 
> -- 
> Andre

Hello Andre,
It appears that r197236 was never MFC'd, so soreceive_stream is still on by 
default in stable/8.  Would you be able to MFC it along with 223839 and 223863?

Thank you,
  Andrew

--
Andrew Boyerabo...@averesystems.com





MFC of 218627 (SO_SETFIB 0)

2011-07-11 Thread Andrew Boyer

Would someone please MFC r218627 back to stable/8 and stable/7?  They are both 
affected.

Thank you,
  Andrew

--
Andrew Boyerabo...@averesystems.com





Re: MFC Re: soreceive_stream: issues with O_NONBLOCK

2011-07-12 Thread Andrew Boyer


On Jul 12, 2011, at 3:10 AM, Andre Oppermann wrote:

> On 11.07.2011 17:15, Andrew Boyer wrote:
>> On Jul 8, 2011, at 6:51 AM, Andre Oppermann wrote:
>> 
>>> On 07.07.2011 21:24, Mikolaj Golub wrote:
>>>> 
>>>> On Thu, 07 Jul 2011 12:47:15 +0200 Andre Oppermann wrote:
>>>> 
>>>> AO>   Please try this patch: AO>
>>>> http://people.freebsd.org/~andre/soreceive_stream.diff-20110707
>>>> 
>>>> It works for me. No issues detected so far. Thanks.
>>> 
>>> Committed in r223863. Many thanks for testing!
>>> 
>>> -- Andre
>> 
>> Hello Andre, It appears that r197236 was never MFC'd, so soreceive_stream is 
>> still on by default
>> in stable/8.  Would you be able to MFC it along with 223839 and 223863?
> 
> soreceive_stream() was never on by default. In fact one had to compile it in
> and enable it with a tuneable.
> 
> I plan do the MFC's in a few days.
> 
> -- 
> Andre


My bad, I missed the #if 0's in tcp_usrreq.c.  Looking forward to the MFC's to 
try it out.

-Andrew

--
Andrew Boyerabo...@averesystems.com





Re: system locks up with vr driver on alix board

2011-08-17 Thread Andrew Stevenson


On 17 Aug 2011, at 02:39, Ask Bjørn Hansen wrote:

>> How many PPS or interrupts do you see from vr interface under high
>> network load?
> 
> Honestly I'm not sure.  I only know how to see the interrupt busy percentage 
> from top …Is there a cheap way to get those numbers?If so then I'll 
> log them every second or two and see if it catches anything.

"systat -vmstat" shows interrupts per second per device. Some use of awk or sed 
may be required.
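
If you end up logging cumulative counters yourself (for example the totals column that "vmstat -i" prints, which is my assumption about where the numbers would come from), the rate is just the difference between two readings divided by the interval. A tiny sketch; intr_rate() is a made-up helper name:

```c
#include <assert.h>
#include <stdint.h>

/* Interrupts per second between two readings of a cumulative counter. */
static double
intr_rate(uint64_t before, uint64_t after, double seconds)
{
	assert(seconds > 0.0);
	assert(after >= before);	/* counter resets are not handled here */
	return (double)(after - before) / seconds;
}
```

Sampling every second or two and feeding consecutive readings into this gives a per-device interrupt rate you can log next to the load.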

HTH,

Andrew


Re: nge(4), tl(4), wb(4) and rl(4) 8129 testers wanted [Re: Question about GPIO bitbang MII]

2011-10-18 Thread andrew bliznak

On Sun, 2011-10-16 at 03:08 +0200, Marius Strobl wrote:
> On Sun, Oct 16, 2011 at 02:46:23AM +0200, Damien Fleuriot wrote:
> > 
> > 
> > On 15 Oct 2011, at 22:56, Marius Strobl  wrote:
> > 
> > > 
> > > Could owners of nge(4), tl(4), wb(4) and rl(4) driven hardware (as for
> > > rl(4) only 8129 need testing, 8139 don't) please give the following
> > > patch a try in order to ensure it doesn't break anything?
> > > for 9/head:
> > > http://people.freebsd.org/~marius/mii_bitbang.diff
> > > for 8:
> > > http://people.freebsd.org/~marius/mii_bitbang.diff8
> > > 
> > > Thanks,
> > > Marius
> > > 
> > 
> > 
> > While I don't have any box with this hardware, I'm thinking you might want 
> > to get a bit more specific about what you want tested...
> > 
> > What do you think the patch might break ?
> > 
> 
> Basically, if there's something wrong with the patch the driver should
> fail to attach, if it still does and gets a link all should be fine.
> 
> Marius
> 

The patch runs OK on FreeBSD 10.0-CURRENT #22 r226300M:


nge0:  port 0xd800-0xd8ff mem
0xd000-0xdfff irq 22 at device 2.0 on pci1
miibus0:  on nge0
nsgphy0:  PHY 1 on miibus0
none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT,
1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto
nge0: Ethernet address: 00:30:4f:1e:e4:49

> ___
> freebsd-sta...@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"



Re: Too much interrupts on ixgbe

2011-10-24 Thread Andrew Boyer

You could try this patch.  It disables the interrupt while it's being handled.  
(The driver already re-enables it at the end of the handler and the task.)

-Andrew

Index: sys/dev/ixgbe/ixgbe.c
===
--- sys/dev/ixgbe/ixgbe.c   (revision 226698)
+++ sys/dev/ixgbe/ixgbe.c   (working copy)
@@ -1362,6 +1362,7 @@
         bool            more_tx, more_rx;
         u32             newitr = 0;
 
+        ixgbe_disable_queue(adapter, que->msix);
         ++que->irqs;
 
         more_rx = ixgbe_rxeof(que, adapter->rx_process_limit);


On Oct 24, 2011, at 5:41 AM, Sergey Saley wrote:

> There is my FreeBSD box:
> 
> kernel
> ---
> #
> # GENERIC -- Generic kernel configuration file for FreeBSD/i386
> #
> # For more information on this file, please read the config(5) manual page,
> # and/or the handbook section on Kernel Configuration Files:
> #
> #   
> http://www.FreeBSD.org/doc/en_US.ISO8859-1/books/handbook/kernelconfig-config.html
> #
> # The handbook is also available locally in /usr/share/doc/handbook
> # if you've installed the doc distribution, otherwise always see the
> # FreeBSD World Wide Web server (http://www.FreeBSD.org/) for the
> # latest information.
> #
> # An exhaustive list of options and more detailed explanations of the
> # device lines is also present in the ../../conf/NOTES and NOTES files.
> # If you are in doubt as to the purpose or necessity of a line, check first
> # in NOTES.
> #
> # $FreeBSD: head/sys/i386/conf/GENERIC 221743 2011-05-10 16:44:16Z jkim $
> 
> cpu   I686_CPU
> ident POINT07
> 
> 
> options   SCHED_ULE               # ULE scheduler
> options   PREEMPTION              # Enable kernel thread preemption
> options   INET                    # InterNETworking
> options   FFS                     # Berkeley Fast Filesystem
> options   SOFTUPDATES             # Enable FFS soft updates support
> options   UFS_DIRHASH             # Improve performance on big directories
> options   MD_ROOT                 # MD is a potential root device
> options   MSDOSFS                 # MSDOS Filesystem
> options   CD9660                  # ISO 9660 Filesystem
> options   PROCFS                  # Process filesystem (requires PSEUDOFS)
> options   PSEUDOFS                # Pseudo-filesystem framework
> options   GEOM_PART_GPT           # GUID Partition Tables.
> options   GEOM_LABEL              # Provides labelization
> options   KTRACE                  # ktrace(1) support
> options   STACK                   # stack(9) support
> options   SYSVSHM                 # SYSV-style shared memory
> options   SYSVMSG                 # SYSV-style message queues
> options   SYSVSEM                 # SYSV-style semaphores
> options   _KPOSIX_PRIORITY_SCHEDULING # POSIX P1003_1B real-time extensions
> options   PRINTF_BUFR_SIZE=128    # Prevent printf output being interspersed.
> options   KBD_INSTALL_CDEV        # install a CDEV entry in /dev
> options   HWPMC_HOOKS             # Necessary kernel hooks for hwpmc(4)
> #options  KDTRACE_HOOKS           # Kernel DTrace hooks
> options   INCLUDE_CONFIG_FILE     # Include this file in kernel
> 
> 
> # To make an SMP kernel, the next two lines are needed
> options   SMP                     # Symmetric MultiProcessor Kernel
> device    apic                    # I/O APIC
> 
> # CPU frequency control
> device    cpufreq
> 
> # Bus support.
> device    acpi
> device    pci
> 
> # Floppy drives
> device    fdc
> 
> # ATA controllers
> device    ahci                    # AHCI-compatible SATA controllers
> device    ata                     # Legacy ATA/SATA controllers
> options   ATA_CAM                 # Handle legacy controllers with CAM
> options   ATA_STATIC_ID           # Static device numbering
> device    mvs                     # Marvell 88SX50XX/88SX60XX/88SX70XX/SoC SATA
> device    siis                    # SiliconImage SiI3124/SiI3132/SiI3531 SATA
> 
> # ATA/SCSI peripherals
> device    scbus                   # SCSI bus (required for ATA/SCSI)
> device    ch                      # SCSI media changers
> device    da                      # Direct Access (disks)
> device    sa                      # Sequential Access (tape etc)
> device    cd                      # CD
> device    pass                    # Passthrough device (direct ATA/SCSI access)
> device    ses                     # SCSI Environmental Services (and SAF-TE)
> 
> 
> # atkbdc0 controls both

Intel Dual port pro/1000: watchdog timeouts and no packets received

2009-09-14 Thread Andrew Snow



This is a very new card which I haven't seen before on the market until 
recently.


Card: E1G42ET (Intel Gigabit PCIe ET Dual Port  Adapter 82576)
Server: Supermicro X7SLA-H
Operating system:  FreeBSD 7.2-RELEASE and 7.2-STABLE
IGB Drivers: 1.4.1 and updated 1.7.3 from intel website



igb0:  port 
0xcc00-0xcc1f mem 
0xfe9e-0xfe9f,0xfe40-0xfe7f,0xfe9dc000-0xfe9d irq 10 
at device 0.0 on pci1

igb0: Using MSIX interrupts with 0 vectors
igb0: [FILTER]
igb0: Ethernet address: 00:1b:21:43:2f:a0

igb1:  port 
0xc880-0xc89f mem 
0xfe9a-0xfe9b,0xfdc0-0xfdff,0xfe9d8000-0xfe9dbfff irq 11 
at device 0.1 on pci1

igb1: Using MSIX interrupts with 0 vectors
igb1: [FILTER]
igb1: Ethernet address: 00:1b:21:43:2f:a1

i...@pci0:1:0:0:class=0x02 card=0xa03c8086 chip=0x10c98086 
rev=0x01 hdr=0x00

vendor = 'Intel Corporation'
class  = network
subclass   = ethernet

i...@pci0:1:0:1:class=0x02 card=0xa03c8086 chip=0x10c98086 
rev=0x01 hdr=0x00

vendor = 'Intel Corporation'
class  = network
subclass   = ethernet

Card detects OK but when you assign an IP and try to ping, no packets 
are received, not even ARP replies.


This appears on the console:

igb0: watchdog timeout -- resetting
igb0: Queue(0) tdh = 9, tdt = 9
igb0: Queue(0) desc avail = 247, Next Desc to Clean = 0
igb0: link state changed to DOWN
igb0: link state changed to UP



Thanks,

- Andrew

Is this a race in mbuf's refcounting?

2009-09-21 Thread Andrew Brampton

I've been reading the FreeBSD source code to understand how mbufs are
reference counted. However, there are a few bits of code that I'm
wondering if they would fail under the exactly right timing. Take for
example in uipc_mbuf.c:

 286 static void
 287 mb_dupcl(struct mbuf *n, struct mbuf *m)
 288 {
...
 293         if (*(m->m_ext.ref_cnt) == 1)
 294                 *(m->m_ext.ref_cnt) += 1;
 295         else
 296                 atomic_add_int(m->m_ext.ref_cnt, 1);
...
 305 }

Now, the way I understand this code is, if ref_cnt is 1, then it is
not shared. In that case non-atomically increment ref_cnt. However, if
ref_cnt was something else, then it is shared so update the value in
an atomic way. This seems valid, however what happens if two threads
call mb_dupcl at the same time with a non-shared m. Could they both
evaluate the if on line 293 at the same time, and then both
non-atomically increment ref_cnt?

If this could happen then we have a lost update and our reference
counting is broken. I've also noticed that in other places similar
optimisations are made to avoid the atomic operation.

So is this a problem?

thanks
Andrew

Re: Is this a race in mbuf's refcounting?

2009-09-21 Thread Andrew Brampton

2009/9/21 Bruce Evans :
> On Mon, 21 Sep 2009, Andrew Brampton wrote:
>
>> I've been reading the FreeBSD source code to understand how mbufs are
>> reference counted. However, there are a few bits of code that I'm
>> wondering if they would fail under the exactly right timing. Take for
>> example in uipc_mbuf.c:
>>
>> 286 static void
>> 287 mb_dupcl(struct mbuf *n, struct mbuf *m)
>> 288 {
>> ...
>> 293        if (*(m->m_ext.ref_cnt) == 1)
>> 294                *(m->m_ext.ref_cnt) += 1;
>> 295        else
>> 296                atomic_add_int(m->m_ext.ref_cnt, 1);
>> ...
>> 305 }
>>
>> Now, the way I understand this code is, if ref_cnt is 1, then it is
>> not shared. In that case non-atomically increment ref_cnt. However, if
>> ref_cnt was something else, then it is shared so update the value in
>> an atomic way. This seems valid, however what happens if two threads
>> call mb_dupcl at the same time with a non-shared m. Could they both
>> evaluate the if on line 293 at the same time, and then both
>> non-atomically increment ref_cnt?
>>
>> If this could happen then we have a lost update and our reference
>> counting is broken. I've also noticed that in other places similar
>> optimisations are made to avoid the atomic operation.
>>
>> So is this a problem?
>
> I don't see how it can work.
>
> Also, if the count was 1, then it should become 2, but there is nothing to
> flush the store to memory.  This seems to mainly enlarge the race window
> for the previous problem.
>
> Bruce
>

Sorry, are you agreeing or disagreeing with my original post? If you
are disagreeing I would appreciate if you could explain the error in
my ways.

I see the following happening:
Thread 1: Reads *(m->m_ext.ref_cnt) and determines it is 1, and enters
the true branch of the if
Thread 1: Then reads *(m->m_ext.ref_cnt) again (since it is volatile)
Thread 2: Interrupts and reads *(m->m_ext.ref_cnt) and determines it
is 1, and enters the true branch of the if
Thread 2: Then reads *(m->m_ext.ref_cnt), adds one to it and stores
the result (ie 2)
Thread 1: Resumes with the value it had (ie 1) and adds one to it, and
stores the result (ie 2)

Due to this sequence we have lost an update, since the value of
*(m->m_ext.ref_cnt) should be 3. Now if this if wasn't there and
atomic_add_int is used the result will be 3.
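
To make the sequence concrete, here is a small userspace sketch that replays it step by step in a single thread, so the outcome is deterministic. It is not kernel code: replay_racy() and replay_atomic() are names I made up, and C11 atomic_fetch_add() stands in for atomic_add_int().

```c
#include <assert.h>
#include <stdatomic.h>

/* Replays the interleaving above using the non-atomic "+= 1" path. */
static int
replay_racy(void)
{
	volatile int ref_cnt = 1;
	int t1_seen, t2_seen;

	assert(ref_cnt == 1);	/* thread 1 passes the "== 1" check */
	t1_seen = ref_cnt;	/* thread 1 loads the value for its += 1 */
	t2_seen = ref_cnt;	/* thread 2 interrupts, sees 1 as well */
	ref_cnt = t2_seen + 1;	/* thread 2 stores 2 */
	ref_cnt = t1_seen + 1;	/* thread 1 stores 2: one update is lost */
	return ref_cnt;
}

/* Same two increments, but each one is an atomic read-modify-write. */
static int
replay_atomic(void)
{
	atomic_int ref_cnt = 1;

	atomic_fetch_add(&ref_cnt, 1);	/* thread 1 */
	atomic_fetch_add(&ref_cnt, 1);	/* thread 2 */
	return atomic_load(&ref_cnt);
}
```

replay_racy() ends with a count of 2 although three references exist, while replay_atomic() ends with 3.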

If you find a flaw in my logic please point it out.

thanks
Andrew
___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"

Re: Is this a race in mbuf's refcounting?

2009-09-21 Thread Andrew Brampton

2009/9/21 Ed Maste :
> On Mon, Sep 21, 2009 at 01:43:33PM +0100, Andrew Brampton wrote:
>
> Your analysis is correct; this issue also has a PR, kern/137145.
> http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/137145
>
> As you point out it requires that two threads have a reference to the
> same non-shared mbuf.  I had a quick look and didn't find any case of
> this in the vanilla FreeBSD tree; if I didn't miss anything it'll
> affect only 3rd party src.
>
> We'll need to have a look at this after 8.0 is done.
>
> -Ed
>

I didn't notice there was a PR for this.

Thanks
Andrew

Re: WLAN performance Windows/XP ./. FreeBSD 8-CURRENT

2009-09-28 Thread Andrew Kuriger


On Mon, 28 Sep 2009 16:01:53 +0200, "Paul B. Mahol" 
wrote:
> On 9/28/09, Matthias Apitz  wrote:
>>
>> Hello,
>>
>> I am wondering what could cause the following WLAN performance diff
>> between a XP and 8-CURRENT laptop, sitting side by side and connected to
>> the same AP:
>>
>> OS           XP              8-CURRENT
>> NIC          Intel 3945ABG   Atheros 5424/2424
>> Ping         6ms             116ms
>> downstream   9.05Mbit/s      6.58Mbit/s
>> upstream     6.58Mbit/s      4.55Mbit/s
>>
>> measured with http://www.speedtest.net/ against the same remote server
>> at the same time... Any ideas?
> 
> Emulated flash?

I have also found that speed tests aren't always accurate. If you really
want to compare the two, set up a local file server and measure upload
and download speeds against it rather than against an unknown remote
host.

~Andrew

Re: Choosing two 10GiGE cards

2009-10-07 Thread Andrew Snow



The only one worth getting IMO is Intel EXPX9502CX4
(INTEL 10 GIGABIT CX4 DUAL PORT SERVER ADAPTER)

It is low power and very fast, and works under FreeBSD.  Like all Intel
NICs, it supports interrupt modulation, so polling support isn't really
needed.



- Andrew


Re: Choosing two 10GiGE cards

2009-10-07 Thread Andrew Snow


rihad wrote:
>> The only one worth getting IMO is Intel EXPX9502CX4
>> (INTEL 10 GIGABIT CX4 DUAL PORT SERVER ADAPTER)
>
> Thanks. What does DUAL PORT mean? It has two jacks? I think one such
> adapter will be more than enough to replace our two 1000 mbps cards,
> whether two jacks or not?

Correct, it has two ports on the one card. It uses a PCIe x8 slot so
plenty of bandwidth to serve two ports.


- Andrew

Re: Can we turn off WPI_DEBUG

2009-10-27 Thread Andrew Thompson

On Tue, Oct 27, 2009 at 03:29:11PM -0700, Doug Barton wrote:
> I cc'ed those who seem to have put the most/recent effort into
> sys/dev/wpi.
> 
> Is there any objection to turning off WPI_DEBUG by default? it creates
> a lot of spam that the average user doesn't need. I use my 3945abg
> every day and haven't had any problems with it for ages so I think
> it's safe to say we're out of the period were debug by default is needed?

Go for it.


Andrew

Re: uath under FreeBSD 8.0-STABLE

2009-12-22 Thread Andrew Thompson

On Tue, Dec 22, 2009 at 05:54:25PM -0500, Steven Friedrich wrote:
> On Tuesday 22 December 2009 02:31:04 pm Weongyo Jeong wrote:
> > On Tue, Dec 15, 2009 at 04:03:31PM -0500, Steven Friedrich wrote:
> > > Ok, I am able to load firmware with:
> > > uathload -d /dev/ugen4.3
> > > but it also appears to do so when I plug it in...
> > >
> > > Now what? It doesn't show up in ifconfig...
> > 
> > Could you please show me dmesg output and the output of
> > 
> > # usbconfig dump_device_desc
> This message appears on the console when the device is plugged in:
> ugen4.3: < Atheros Communications Inc > at usbus4
> 
> usbconfig -u 4 -a 3 dump_info
> ugen4.3:  at usbus4, cfg=0 md=HOST 
> spd=HIGH 
> (480Mbps) pwr=ON

It looks like you are missing the step to load the firmware with
uathload(8).

# uathload -d /dev/ugen4.3


Maybe the documentation is missing this, or even better would be to have
devd do it automagically.


Andrew

Re: uath under FreeBSD 8.0-STABLE

2009-12-22 Thread Andrew Thompson

On Wed, Dec 23, 2009 at 12:01:22PM +1300, Andrew Thompson wrote:
> On Tue, Dec 22, 2009 at 05:54:25PM -0500, Steven Friedrich wrote:
> > On Tuesday 22 December 2009 02:31:04 pm Weongyo Jeong wrote:
> > > On Tue, Dec 15, 2009 at 04:03:31PM -0500, Steven Friedrich wrote:
> > > > Ok, I am able to load firmware with:
> > > > uathload -d /dev/ugen4.3


Oops, disregard my suggestion :)


> > > > but it also appears to do so when I plug it in...
> > > >
> > > > Now what? It doesn't show up in ifconfig...
> > > 
> > > Could you please show me dmesg output and the output of
> > > 
> > >   # usbconfig dump_device_desc
> > This message appears on the console when the device is plugged in:
> > ugen4.3: < Atheros Communications Inc > at usbus4
> > 
> > usbconfig -u 4 -a 3 dump_info
> > ugen4.3:  at usbus4, cfg=0 md=HOST 
> > spd=HIGH 
> > (480Mbps) pwr=ON
> 
> It looks like you are missing the step to load the firmware with
> uathload(8).
> 
> # uathload -d /dev/ugen4.3
> 
> 
> Maybe the documentation is missing this, or even better would be to have
> devd do it automagically.
> 
> 
> Andrew

Question about MFC for 194760,194813, etc. (ifaddr races)

2010-02-02 Thread Andrew Boyer

http://svn.freebsd.org/viewvc/base?view=revision&revision=194760

Hello all,
We are currently working with the FreeBSD 7.1 release code as the foundation 
for our product.  The other day I experienced a kernel panic when an ifaddr 
race condition caused a use-after-free error.  It looks like SVN commits 
194760, 194813, 194819, etc. address this issue.  The original commit message 
for 194760 says "MFC after: 6 weeks (portions)", but I don't see anything to 
indicate that the fixes were ever merged back.

Is anyone planning / willing to merge this back into the 7.X branches?  We 
would very much appreciate it.  It looks like there are about two dozen related 
commits and the diffs don't apply cleanly for me.  If it has diverged too much 
we'll just have to wait until we sync up with 8.0 later this year.

Thanks,
 Andrew

------
Andrew Boyer    abo...@averesystems.com





OSPF Neighbors inactivity on 8.0-STABLE

2010-03-01 Thread Andrew Rikhlivsky

I have a few NASes based on FreeBSD 7.2-RELEASE and quagga 0.99.14, and
they all have the same configuration with minor changes.

When I add a test server based on FreeBSD 8.0-STABLE with quagga 0.99.15
to the network, the other servers don't receive HELLO packets from it.

On nas9# tcpdump -i vr0 proto ospf

13:24:04.591907 IP 193.62.62.14 > OSPF-ALL.MCAST.NET: OSPFv2, Hello, length 44
13:24:09.594136 IP 193.62.62.14 > OSPF-ALL.MCAST.NET: OSPFv2, Hello, length 44
13:24:14.596375 IP 193.62.62.14 > OSPF-ALL.MCAST.NET: OSPFv2, Hello, length 44
13:24:19.598678 IP 193.62.62.14 > OSPF-ALL.MCAST.NET: OSPFv2, Hello, length 44
13:24:24.600867 IP 193.62.62.14 > OSPF-ALL.MCAST.NET: OSPFv2, Hello, length 44
13:24:29.603050 IP 193.62.62.14 > OSPF-ALL.MCAST.NET: OSPFv2, Hello, length 44
13:24:34.605322 IP 193.62.62.14 > OSPF-ALL.MCAST.NET: OSPFv2, Hello, length 44
13:24:39.607564 IP 193.62.62.14 > OSPF-ALL.MCAST.NET: OSPFv2, Hello, length 44
13:24:44.609799 IP 193.62.62.14 > OSPF-ALL.MCAST.NET: OSPFv2, Hello, length 44

# uname -a
FreeBSD localhost 8.0-STABLE FreeBSD 8.0-STABLE #0: Fri Feb 19 19:33:17 
UTC 2010 r...@localhost:/usr/obj/usr/src/sys/nas9  i386

# ospfd -v
ospfd version 0.99.15


The multicast packet counter on the switch port is constantly increasing.
What could be the reason the other servers are unreachable over multicast?

Re: Why lagg(4) wants ~IFF_DRV_OACTIVE?

2010-03-08 Thread Andrew Thompson

On Mon, Mar 08, 2010 at 11:12:25AM -0800, Xin LI wrote:
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA1
> 
> Hi,
> 
> Maybe this is a stupid question but I really don't understand why a
> interface with IFF_DRV_OACTIVE can't be added to a lagg(4) interface.
> Looking at OpenBSD code, they do this since the day 0.
> 
> Could anyone shed some light, why we need to enforce this check? :)

I think it was just carried over; I don't see any reason to keep it.


Andrew

Re: kern/143046: [mxge] [panic] panics since mxge(4) update

2010-03-14 Thread Andrew Gallatin


lini...@freebsd.org wrote:

Synopsis: [mxge] [panic] panics since mxge(4) update

Responsible-Changed-From-To: freebsd-net->gallatin
Responsible-Changed-By: linimon
Responsible-Changed-When: Sat Mar 13 19:56:17 UTC 2010
Responsible-Changed-Why: 
Drew wants these PRs.


http://www.freebsd.org/cgi/query-pr.cgi?pr=143046


Thanks.  I had no idea; I don't regularly read freebsd-net.
Please send any/all mxge issues directly to me, or to h...@myri.com
if possible.

Thanks,

Drew


Re: Choosing CPU for router

2010-03-16 Thread Andrew Snow


Matthias Gamsjager wrote:
> Way over the top for simple fw and dhcpd. But how much traffic will
> be involved?
> Investing in good NICs will return more than a pricey CPU and
> motherboard (ECC mem is a good idea for 24/7 though).



Agreed.

The Supermicro Atom miniserver is more than enough CPU grunt for this 
sort of routing/ipfw task.  The main reason to go Xeon is if you need 
ECC RAM, and even then you can get away with just using the cheapest CPU 
available.



- Andrew

Re: Choosing CPU for router

2010-03-17 Thread Andrew Snow



Jon Otterholm wrote:
> This machine is going to act as access-router serving ~500
> FTTH-customers. About 500Mbit/s and 200kpps. The big issue is Dummynet,
> around 1000 pipes (2 pipes/customer).


That doesn't sound right.  200kpps @ 500Mbps works out to an average
packet size of about 312 bytes?  Am I missing something?
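
For anyone checking the numbers, the average packet size is just the bit rate divided by the packet rate, converted to bytes. A minimal sketch, assuming decimal units (500 Mbit/s = 500e6 bits/s, 200 kpps = 200e3 packets/s); the function name is only illustrative:

```c
#include <assert.h>

/*
 * Average packet size in bytes for a given bit rate and packet rate.
 * Link-level framing overhead is ignored; this is a rough sanity check.
 */
static double
avg_packet_bytes(double bits_per_sec, double packets_per_sec)
{
	assert(packets_per_sec > 0.0);
	return (bits_per_sec / packets_per_sec / 8.0);
}
```

avg_packet_bytes(500e6, 200e3) gives 312.5, i.e. 2500 bits per packet.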



- Andrew

Re: Running rtadvd or DHCPv6 server via if_bridge interface

2010-03-18 Thread Andrew Thompson

On Thu, Mar 18, 2010 at 11:27:43PM +0100, Stefan Bethke wrote:
> Am 11.12.2009 um 07:51 schrieb Chris Cowart:
> 
> > Bruce Cran wrote:
> >> I have a router configured using if_bridge with a 4-port NIC that's
> >> serving addresses over DHCP. I'd like to add in either rtadvd or
> >> DHCPv6, but neither work because the bridge interface doesn't have an
> >> IPv6 link-local address. Is there a way around this, or is it not
> >> possible to serve IPv6 addresses over if_bridge interfaces?
> > 
> > It's totally doable; you just have to assigned a link-local address to
> > the bridge. There are some reasons why one isn't defined by default,
> > which somebody more knowledgeable about the challenges in the
> > implementation can highlight.
> > 
> > Here's my configuration from rc.conf:
> > 
> > ipv6_ifconfig_bridge0="2001:470:8337:10::1/64"
> > ipv6_ifconfig_bridge0_alias0="fe80::2%bridge0 prefixlen 64"
> > 
> > Once you're doing that, rtadvd will start doing the right thing.
> 
> I've just stumbled over this the first time.
> 
> I thought that best practice nowadays was to use the bridge interface for 
> host communications, and leaving the physical interfaces unconfigured, so I'm 
> a bit confused why if_bridge would not allow the auto-assignment of a 
> link-local address.
> 
> If you have two or more bridged interfaces now, and you enable automatic 
> assignment of link-local addresses, you already have multiple link-locals 
> this way; having the bridge have one as well wouldn't make things worse (I 
> think).
> 

http://svn.freebsd.org/viewvc/base?view=revision&revision=149829

"IPv6 auto-configuration is disabled. An IPv6 link-local address has a
link-local scope within one link, the spec is unclear for the bridge
case and it may cause scope violation."

That is the reason. I don't know if it's still true, but you would need to
find someone more familiar with IPv6 to comment on it.


cheers,
Andrew

Re: Intel 10Gb

2010-05-11 Thread Andrew Gallatin

Murat Balaban [mu...@enderunix.org] wrote:
> 
> Much of the FreeBSD networking stack has been made parallel in order to
> cope with high packet rates at 10 Gig/sec operation. 
> 
> I've seen good numbers (near 10 Gig) in my tests involving TCP/UDP
> send/receive. (latest Intel driver).
> 
> As far as BPF is concerned, above statement does not hold true,
> since there is some work that needs to be done here in terms
> of BPF locking and parallelism. My tests show that there
> is a high lock contention around "bpf interface lock", resulting
> in input errors at high packet rates and with many bpf devices. 

If you're interested in 10GbE packet sniffing at line rate on the
cheap, have a look at the Myri10GE "sniffer" interface.  This is a
special software package that takes a normal mxge(4) NIC, and replaces
the driver/firmware with a "myri_snf" driver/firmware which is
optimized for packet sniffing.

Using this driver/firmware combo, we can receive minimal packets at
line rate (14.8Mpps) to userspace.  You can even access this using a
libpcap interface.  The trick is that the fast paths are OS-bypass,
and don't suffer from OS overheads, like lock contention.  See
http://www.myri.com/scs/SNF/doc/index.html for details.
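
For reference, the 14.8Mpps figure is what 10GbE line rate comes to for minimum-size frames. A small sketch of the arithmetic, assuming the usual Ethernet per-frame wire overhead of 20 bytes (7 preamble + 1 start-of-frame delimiter + 12 inter-frame gap) and a 64-byte minimum frame; the function name is illustrative only:

```c
/*
 * Packets per second at line rate for frames of frame_bytes each.
 * Every frame occupies frame_bytes + 20 bytes of wire time
 * (preamble, SFD and inter-frame gap are the assumed 20 bytes).
 */
static double
line_rate_pps(double link_bps, unsigned frame_bytes)
{
	const unsigned wire_overhead = 20;

	return (link_bps / ((frame_bytes + wire_overhead) * 8.0));
}
```

line_rate_pps(10e9, 64) comes to roughly 14.88 million packets per second.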

Best Regards,

Drew

Re: FreeBSD.org IPv6 issue - AAAA records disabled

2010-05-11 Thread Andrew Gallatin


David Malone wrote:
> On Mon, May 10, 2010 at 11:02:41AM -0400, Andrew Gallatin wrote:
>> I think something may be holding onto an mbuf after free,
>> then re-freeing it.  But only after somebody else allocated
>> it.   I was hoping that the mbuf double free referenced
>> above was the smoking gun, but it turns out that there isn't
>> even a bge interface in my pr (just bce and mxge).
>
> Weren't there some bugs fixed recently that allowed the arp/ndp code
> to free packets that weren't previously being freed? They'd be good
> candidates for something that holds onto an mbuf for a while and
> then frees it.


Unfortunately,  I think at least the PR I'm looking into pre-dates
those fixes  -- these problems started in r202120 (early Jan).
I need to ask what he upgraded from.

When did IPv6 become unstable for others?

Drew

ixgbe 2.1.7 can't disable LRO on 82599?

2010-05-12 Thread Andrew Boyer

Hello all,
I'm using the 2.1.7 version of ixgbe from -CURRENT, backported to FreeBSD 7.1.  
With some fiddling it seems to work on both 82598 and 82599 controllers.

On 82598, 'ifconfig ix0 -lro' causes dev.ix.0.counters.rxr0.lro_queued and 
...lro_flushed to stop incrementing, as expected.  There's also a significant 
throughput hit which would seem to indicate that it took effect.

However, it appears that LRO is always enabled on 82599.  'ifconfig ix0 -lro' 
removes the LRO flag from the port in ifconfig but the ...hw_lro_merge counter 
continues to increase.  The throughput reported by the iperf port is the same 
with or without LRO on.

Any advice?  Am I misinterpreting something?

Thanks,
  Andrew

P.S.  We need to disable LRO because we don't have Appropriate Byte Counting 
support and LRO causes TCP ACK havoc without it.

--
Andrew Boyer    abo...@averesystems.com





Re: ixgbe 2.1.7 can't disable LRO on 82599?

2010-05-13 Thread Andrew Boyer

All, 
The solution was simple.  Check to make sure the IFCAP_LRO bit is set before 
calling ixgbe_setup_hw_rsc().

-Andrew

--- a/src/sys/dev/ixgbe/ixgbe.c
+++ b/src/sys/dev/ixgbe/ixgbe.c
@@ -3728,6 +3728,9 @@ ixgbe_setup_receive_ring(struct rx_ring *rxr)
** Disable RSC when RXCSUM is off
*/
if ((adapter->hw.mac.type == ixgbe_mac_82599EB) &&
+   (ifp->if_capenable & IFCAP_LRO) &&
(ifp->if_capenable & IFCAP_RXCSUM))
ixgbe_setup_hw_rsc(rxr);
else if (ifp->if_capenable & IFCAP_LRO) {
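
The effect of the extra check can be seen in a small standalone model of the decision. This is a sketch only: the enum and function names are invented here, the three booleans stand for the mac type test and the two if_capenable bits, and it assumes the else-if branch is where software LRO gets set up.

```c
#include <stdbool.h>

enum rx_coalesce { RX_NONE, RX_HW_RSC, RX_SOFT_LRO };

/* Before the patch: an 82599 with RXCSUM on always gets hardware RSC. */
static enum rx_coalesce
pick_before(bool is_82599, bool lro, bool rxcsum)
{
	if (is_82599 && rxcsum)
		return (RX_HW_RSC);
	else if (lro)
		return (RX_SOFT_LRO);
	return (RX_NONE);
}

/* After the patch: hardware RSC additionally requires the LRO capability. */
static enum rx_coalesce
pick_after(bool is_82599, bool lro, bool rxcsum)
{
	if (is_82599 && lro && rxcsum)
		return (RX_HW_RSC);
	else if (lro)
		return (RX_SOFT_LRO);
	return (RX_NONE);
}
```

With LRO cleared and RXCSUM set on an 82599, pick_before() still selects hardware RSC while pick_after() selects nothing, which matches the hw_lro_merge counter behaviour described in the original report.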


On May 12, 2010, at 4:29 PM, Jack Vogel wrote:

> Correction, the 82599 is doing HW RSC, I'm sluggish after a good Indian lunch 
> :)
> 
> 
> On Wed, May 12, 2010 at 1:28 PM, Jack Vogel  wrote:
> Oh, this is because the 82598 is doing HW RSC which is a different code path 
> from the LRO that the 598
> does, and that may be the problem, I will need to look into that. Thanks for 
> the report.
> 
> And, yes, LRO is a major improvement in 10G performance, as is TSO. Are you 
> sure you have no
> alternative to disabling?
> 
> Cheers,
> 
> Jack
> 
> 
> On Wed, May 12, 2010 at 12:03 PM, Andrew Boyer  
> wrote:
> Hello all,
> I'm using the 2.1.7 version of ixgbe from -CURRENT, backported to FreeBSD 
> 7.1.  With some fiddling it seems to work on both 82598 and 82599 controllers.
> 
> On 82598, 'ifconfig ix0 -lro' causes dev.ix.0.counters.rxr0.lro_queued and 
> ...lro_flushed to stop incrementing, as expected.  There's also a significant 
> throughput hit which would seem to indicate that it took effect.
> 
> However, it appears that LRO is always enabled on 82599.  'ifconfig ix0 -lro' 
> removes the LRO flag from the port in ifconfig but the ...hw_lro_merge 
> counter continues to increase.  The throughput reported by the iperf port is 
> the same with or without LRO on.
> 
> Any advice?  Am I misinterpreting something?
> 
> Thanks,
>  Andrew
> 
> P.S.  We need to disable LRO because we don't have Appropriate Byte Counting 
> support and LRO causes TCP ACK havoc without it.

--
Andrew Boyer    abo...@averesystems.com





Re: Intel 10Gb

2010-05-14 Thread Andrew Gallatin

Alexander Sack wrote:
<...>
>> Using this driver/firmware combo, we can receive minimal packets at
>> line rate (14.8Mpps) to userspace.  You can even access this using a
>> libpcap interface.  The trick is that the fast paths are OS-bypass,
>> and don't suffer from OS overheads, like lock contention.  See
>> http://www.myri.com/scs/SNF/doc/index.html for details.
>
> But your timestamps will be atrocious at 10G speeds.  Myricom doesn't
> timestamp packets AFAIK.  If you want reliable timestamps you need to
> look at companies like Endace, Napatech, etc.

I see your old help ticket in our system.  Yes, our timestamping
is not as good as a dedicated capture card with a GPS reference,
but it is good enough for most people.

> PS I am not sure but Intel also supports writing packets directly in
> cache (yet I thought the 82599 driver actually does a prefetch anyway
> which had me confused on why that helps)

You're talking about DCA.  We support DCA as well (and I suspect some
other 10G NICs do too).  There are a few barriers to using DCA on
FreeBSD, not least of which is that FreeBSD doesn't currently have the
infrastructure to support it (no IOATDMA or DCA drivers).

DCA is also problematic because support from system/motherboard
vendors is very spotty.  The vendor must provide the correct tag table
in BIOS such that the tags match the CPU/core numbering in the system.
Many motherboard vendors don't bother with this, and you cannot enable
DCA on a lot of systems, even though the underlying chipset supports
DCA.  I've done hacks to force-enable it in the past, with mixed
results. The problem is that DCA depends on having the correct tag
table, so that packets can be prefetched into the correct CPU's cache.
If the tag table is incorrect, DCA is a big pessimization, because it
blows the cache in other CPUs.

That said, I would *love* it if FreeBSD grew ioatdma/dca support.
Jack, does Intel have any interest in porting DCA support to FreeBSD?

Drew

Re: Intel 10Gb

2010-05-14 Thread Andrew Gallatin


Alexander Sack wrote:
> On Fri, May 14, 2010 at 10:07 AM, Andrew Gallatin wrote:

>> Alexander Sack wrote:
>> <...>
>>>> Using this driver/firmware combo, we can receive minimal packets at
>>>> line rate (14.8Mpps) to userspace.  You can even access this using a
>>>> libpcap interface.  The trick is that the fast paths are OS-bypass,
>>>> and don't suffer from OS overheads, like lock contention.  See
>>>> http://www.myri.com/scs/SNF/doc/index.html for details.
>>> But your timestamps will be atrocious at 10G speeds.  Myricom doesn't
>>> timestamp packets AFAIK.  If you want reliable timestamps you need to
>>> look at companies like Endace, Napatech, etc.
>> I see your old help ticket in our system.  Yes, our timestamping
>> is not as good as a dedicated capture card with a GPS reference,
>> but it is good enough for most people.
>
> I was told btw that it doesn't timestamp at ALL.  I am assuming NOW
> that is incorrect.

I think you might have misunderstood how we do timestamping.
I definitely don't understand it, and I work there ;)
I do know that there is a NIC component to it (eg, it is not 100%
done in the host).  I also realize that it is not as good as
something that is 1PPS GPS based.

> Define *most* people.

I may have a skewed view of the market, but it seems like
some people care deeply about accurate timestamps, and
others (mostly doing deep packet inspection) care only
within a few milliseconds, or even seconds.

> I am not knocking the Myricom card.  In fact I so wish you guys would
> just add the ability to latch to a 1PPS for timestamping and it would
> be perfect.
>
> We use I think an older version of the card internally for replay.
> Its a great multi-purpose card.
>
> However with IPG at 10G in the nanoseconds, anyone trying to do OWDs
> or RTT will find it difficult compared to an Endace or Napatech card.
>
> Btw, I was referring to bpf(4) specifically, so please don't take my
> comments as a knock against it.
>
>>> PS I am not sure but Intel also supports writing packets directly in
>>> cache (yet I thought the 82599 driver actually does a prefetch anyway
>>> which had me confused on why that helps)
>> You're talking about DCA.  We support DCA as well (and I suspect some
>> other 10G NICs do too).  There are a few barriers to using DCA on
>> FreeBSD, not least of which is that FreeBSD doesn't currently have the
>> infrastructure to support it (no IOATDMA or DCA drivers).
>
> Right.
>
>> DCA is also problematic because support from system/motherboard
>> vendors is very spotty.  The vendor must provide the correct tag table
>> in BIOS such that the tags match the CPU/core numbering in the system.
>> Many motherboard vendors don't bother with this, and you cannot enable
>> DCA on a lot of systems, even though the underlying chipset supports
>> DCA.  I've done hacks to force-enable it in the past, with mixed
>> results. The problem is that DCA depends on having the correct tag
>> table, so that packets can be prefetched into the correct CPU's cache.
>> If the tag table is incorrect, DCA is a big pessimization, because it
>> blows the cache in other CPUs.
>
> Right.
>
>> That said, I would *love* it if FreeBSD grew ioatdma/dca support.
>> Jack, does Intel have any interest in porting DCA support to FreeBSD?
>
> Question for Jack or Drew, what DOES FreeBSD have to do to support
> DCA?  I thought DCA was something you just enable on the NIC chipset
> and if the system is IOATDMA aware, it just works.  Is that not right
> (assuming cache tags are correct and accessible)?  i.e. I thought this
> was hardware black magic than anything specific the OS has to do.

IOATDMA and DCA are sort of unfairly joined for two reasons: The DCA
control stuff is implemented as part of the IOATDMA PCIe device, and
IOATDMA is a great usage model for DCA, since you'd want the DMAs
that it does to be prefetched.

To use DCA you need:

- A DCA driver to talk to the IOATDMA/DCA pcie device, and obtain the tag
table
- An interface that a client device (eg, NIC driver) can use to obtain
either the tag table, or at least the correct tag for the CPU
that the interrupt handler is bound to.  The basic support in
a NIC driver boils down to something like:

nic_interrupt_handler()
{
    if (sc->dca.enabled && (curcpu != sc->dca.last_cpu)) {
        sc->dca.last_cpu = curcpu;
        tag = dca_get_tag(curcpu);
        WRITE_REG(sc, DCA_TAG, tag);
    }
}

Drew

Re: Intel 10Gb

2010-05-14 Thread Andrew Gallatin


Alexander Sack wrote:
>> To use DCA you need:
>>
>> - A DCA driver to talk to the IOATDMA/DCA pcie device, and obtain the tag
>>   table
>> - An interface that a client device (eg, NIC driver) can use to obtain
>>   either the tag table, or at least the correct tag for the CPU
>>   that the interrupt handler is bound to.  The basic support in
>>   a NIC driver boils down to something like:
>>
>> nic_interrupt_handler()
>> {
>>     if (sc->dca.enabled && (curcpu != sc->dca.last_cpu)) {
>>         sc->dca.last_cpu = curcpu;
>>         tag = dca_get_tag(curcpu);
>>         WRITE_REG(sc, DCA_TAG, tag);
>>     }
>> }
>
> Drew, at least in the Intel documentation, it seems the NIC uses the
> LAPIC id to tell the PCIe TLPs where to put inbound NIC I/O (in the
> TLP the DCA info is stored) to the appropriate core's cache.  i.e. the
> heuristic you gave above is more granular than what I think Intel

The pseudo-code above was intended to be the MSI-X interrupt handler
for a single queue, not some dispatcher for multiple queues.
Sorry that wasn't clear.  So yes, the DCA tag value may be different
per queue.


> does.  I could be wrong, maybe Jack can chime in and correct me.  But
> it seems with Intel chipsets it is a per queue parameter which allows
> you to bind a core's cache to a queue via DCA.  The added piece to
> this for at least bpf(4) consumers is to have bpf(4) subscribe to
> these queues AND to allow an interface for libpcap applications to
> know what queue is on what core and THEN bind to it.


Yes, everything associated with a queue must be bound to the same
core (or at least to cores which share a cache).
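
To spell out the per-queue version, here is a self-contained model of what each queue's MSI-X handler would do. Everything in it is hypothetical: struct rx_queue, the tag values returned by dca_get_tag(), and the dca_tag_reg field standing in for the hardware register are made up for illustration, since FreeBSD has no DCA interface to call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NCPU 4	/* arbitrary CPU count for the model */

struct rx_queue {
	bool     dca_enabled;
	int      last_cpu;	/* CPU the tag was last programmed for, -1 if never */
	uint32_t dca_tag_reg;	/* stand-in for the queue's DCA tag register */
	unsigned tag_writes;	/* how often the "register" was rewritten */
};

/* Hypothetical lookup; a real tag table would come from the DCA device. */
static uint32_t
dca_get_tag(int cpu)
{
	assert(cpu >= 0 && cpu < NCPU);
	return (0x10u + (uint32_t)cpu);
}

/* Per-queue handler: reprogram the tag only when the CPU has changed. */
static void
rx_queue_intr(struct rx_queue *q, int curcpu)
{
	if (q->dca_enabled && curcpu != q->last_cpu) {
		q->last_cpu = curcpu;
		q->dca_tag_reg = dca_get_tag(curcpu);
		q->tag_writes++;
	}
	/* normal receive processing for this queue would follow here */
}
```

Because the state lives in the queue structure, two queues bound to different cores each keep their own tag, and a queue whose interrupt stays on one core writes the register only once.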

Drew

Re: Bringing VLANs created with rc.conf vlans_ 'up' on boot?

2010-05-17 Thread Andrew Thompson

On Mon, May 17, 2010 at 10:08:36AM -0700, Peter Kieser wrote:
> Hello,
> 
> I am experimenting with FreeBSD vlan's using the vlans option in 
> rc.conf, my configuration is as follows:
> 
> ifconfig_em1="up"
> vlans_em1="100 101 102 103 104 105 106 107 108 109 110"
> autobridge_interfaces="bridge0"
> autobridge_bridge0="em0 em1.*"
> ifconfig_bridge0="up"
> 
> rc script create em1.100 - em1.110 but doesn't bring the interfaces up on 
> boot, I have to issue 'ifconfig em1.100 up', etc. to bring them online. I 
> also cannot use 'ifconfig_em1.100="up"' because the rc scripts don't 
> support periods in the variable names. Is there a way to accomplish this? 

Use an underscore where the period should be, the rc.d scripts support
this.

ifconfig_em1.100="up" --> ifconfig_em1_100="up"
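
The name mapping itself is a plain character substitution. As an illustration only (the rc.d scripts do this in shell, and this helper is not part of them), the same transformation in C, handling just the period discussed here:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Copy ifname into out with every '.' replaced by '_', so that the
 * result can be appended to "ifconfig_" to form an rc.conf variable.
 * Returns 0 on success, -1 if out is too small for the name and its NUL.
 */
static int
rc_var_suffix(const char *ifname, char *out, size_t outlen)
{
	size_t i;

	assert(ifname != NULL && out != NULL);
	if (outlen == 0)
		return (-1);
	for (i = 0; ifname[i] != '\0'; i++) {
		if (i + 1 >= outlen)
			return (-1);
		out[i] = (ifname[i] == '.') ? '_' : ifname[i];
	}
	out[i] = '\0';
	return (0);
}
```

Given "em1.100" this produces "em1_100", matching the variable name shown above.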


cheers,
Andrew

Link state changes

2010-06-01 Thread Andrew Thompson

> Revision 205024 - (annotate)
> Thu Mar 11 17:56:46 2010 UTC (2 months, 3 weeks ago) by qingli
>
> The if_tap interface is of IFT_ETHERNET type, but it
> does not set or update the if_link_state variable.
> As such RT_LINK_IS_UP() fails for the if_tap interface.
>
> Also, the RT_LINK_IS_UP() needs to bypass all loopback
> interfaces because loopback interfaces are considered
> up logically as long as the system is running.
>
> This patch fixes the above issues by setting and updating
> the if_link_state variable when the tap interface is
> opened or closed respectively. Similary approach is
> already done in the if_tun device.

This is also a problem for bridge(4) and possibly ef(4), edsc(4) and
epair(4). Should the same change be applied to them?


Andrew

Re: Dual-rate transceivers with ixgbe?

2010-06-10 Thread Andrew Boyer


On Jun 10, 2010, at 3:59 PM, Alexander Sack wrote:
> 
>> One thing that the base driver probably ought to do is not fail in
>> attach if there's an unrecognized SFP+ module.  Since we get
>> interrupts on module change (although this doesn't seem to always work
>> *entirely* right in the stock sources, mostly wrt stored values of
>> AUTOC and the like) it should be possible to bring the interface up
>> with the unsupported (and disabled) SFP+ module and do the SFP+ module
>> probing we already do on hot-swap.
> 
> Alright, let me see if I can test that.  Let me rephrase so I validate
> what you are saying:
> 
> The driver can come up with an unsupported module but disable the
> interface (ifconfig shows the interface, etc.).
> 
> If you then hot-swap a supported SFP, it should come up then with a
> ifconfig down/up cycle.  Right?
> 
> As it stand now, if you load the driver with an unsupported module, it
> will not attach at all causing you to reload the entire driver OR
> reboot the box to have it reattach to the other SFP.
> 

We use this patch to allow the driver to attach when no module is installed.  
This might be a starting point for you.  I haven't tested it without all of our 
other changes in place so my apologies if it doesn't quite work.  We only have 
Intel modules around for testing.

-Andrew

--- ixgbe.c 2010-06-10 16:53:08.0 -0400
+++ ixgbe.c 2010-06-10 16:55:26.0 -0400
@@ -566,7 +566,7 @@
 	} else if (error == IXGBE_ERR_SFP_NOT_SUPPORTED)
 		device_printf(dev,"Unsupported SFP+ Module\n");
 
-	if (error) {
+	if (error && error != IXGBE_ERR_SFP_NOT_PRESENT) {
 		error = EIO;
 		device_printf(dev,"Hardware Initialization Failure\n");
 		goto err_late;

--- ixgbe_82598.c   2010-06-10 16:53:24.0 -0400
+++ ixgbe_82598.c   2010-06-10 16:56:31.0 -0400
@@ -257,10 +257,6 @@
 		ret_val = ixgbe_get_sfp_init_sequence_offsets(hw,
 		                                              &list_offset,
 		                                              &data_offset);
-		if (ret_val != IXGBE_SUCCESS) {
-			ret_val = IXGBE_ERR_SFP_NOT_SUPPORTED;
-			goto out;
-		}
 		break;
 	default:
 		break;


--
Andrew Boyer    abo...@averesystems.com

Re: ndis: fix ugly code

2010-10-05 Thread Andrew Thompson

On 6 October 2010 09:19, Paul B Mahol wrote:
> Hi,
>
> If clang did not complain, I would probably never spot it.
>
> Patch attached.

Committed.

Re: Use lagg(4) or Use Layer-4 Load Balancing?

2008-06-18 Thread Andrew Thompson

On Tue, Jun 17, 2008 at 04:32:03AM -0400, Martes G Wigglesworth wrote:
> Greetings all.
> 
> I have been attempting to research what  I have been informed is
> actually accomplished with layer-4 load balancing.  I have seen many
> articles and reviews that indicate that lagg(4) will accomplish the
> teaming of multiple internet access sources into a single logical pipe,
> however, I have tried this using a dumb switch and two NIC interfaces and
> this simply is not the case.
> 
> I am new and may not have enough cool equipment around, however, aside
> from using the fail-over mode for redundancy, and lacp on a supported
> switch, then if lagg(4) could really combine multiple sources into one
> for use as a larger overall backbone, then I should be able to get
> doubled bandwidth using two separate ports on an unmanaged switch using
> some option on the lagg(4) driver, which is not the case. (If this is
> wrong I would be happy to get the correct information, however I have a
> few network engineer references that say that you cannot do anything
> more than layer-2 lacp with appropriate equipment to create an
> isp-supported trunk)  Even in the on-lamp interview the 7.0 developer
> implies that you can do what I am attempting to research however, it is
> not possible at layer 2 without an end-point.

How are you testing this? You need to have multiple IP flows in order to
fully utilise the multiple links. See this snippet from the handbook
(I'll put it in the man page too).

"Since frame ordering is mandatory on Ethernet links then any traffic
between two stations always flows over the same physical link limiting
the maximum speed to that of one interface. The transmit algorithm
attempts to use as much information as it can to distinguish different
traffic flows and balance across the available interfaces."


Does that answer your question? You will not get more speed on a single
download.
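
To make the handbook text concrete, here is a rough sketch of per-flow
port selection. It is an illustration only: struct flow_key, flow_hash()
and pick_port() are hypothetical names, and the FNV-1a hash is a stand-in,
not the hash lagg(4) actually computes. The point is that the output port
depends only on the flow's addresses and ports, so every packet of one
flow leaves on the same link.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flow key; the real driver may look at other header fields. */
struct flow_key {
	uint32_t src_ip;
	uint32_t dst_ip;
	uint16_t src_port;
	uint16_t dst_port;
};

/* FNV-1a over the key fields (illustrative stand-in hash). */
static uint32_t
flow_hash(const struct flow_key *k)
{
	uint32_t words[3] = {
		k->src_ip,
		k->dst_ip,
		((uint32_t)k->src_port << 16) | k->dst_port
	};
	uint32_t h = 2166136261u;

	for (int i = 0; i < 3; i++) {
		for (int b = 0; b < 4; b++) {
			h ^= (words[i] >> (8 * b)) & 0xffu;
			h *= 16777619u;
		}
	}
	return h;
}

/* Map a flow onto one of nports member links. */
static unsigned
pick_port(const struct flow_key *k, unsigned nports)
{
	assert(nports > 0);
	return flow_hash(k) % nports;
}
```

Calling pick_port() twice with the same key always returns the same
index, which is why a single download is limited to one interface's
speed. Only different flows can land on different links.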


Andrew