Hi,
On 11-10-12 01:03 PM, YongHyeon PYUN wrote:
On Wed, Oct 12, 2011 at 10:07:02AM -0400, Karim wrote:
On 11-10-11 01:31 PM, Kevin Oberman wrote:
On Tue, Oct 11, 2011 at 10:10 AM, YongHyeon PYUN<pyu...@gmail.com> wrote:
On Tue, Oct 11, 2011 at 11:40:42AM -0400, Karim wrote:
Hi List,
Using a Marvell NIC plugged into a CISCO switch I see the
auto-negotiation failing and even when forcing the device to full-duplex
we sometimes see packet drops.
Here is the device description from dmesg:
mskc0:<Marvell Yukon 88E8053 Gigabit Ethernet> port 0xbe00-0xbeff mem
0xfdefc000-0xfdefffff irq 16 at device 0.0 on pci1
msk0:<Marvell Technology Group Ltd. Yukon EC Id 0xb6 Rev 0x02> on mskc0
msk0: Ethernet address: 00:03:2d:09:94:52
miibus0:<MII bus> on msk0
e1000phy0:<Marvell 88E1111 Gigabit PHY> PHY 0 on miibus0
e1000phy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT,
1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
mskc0: [ITHREAD]
The switch it's plugged into (Cisco) is configured for 100baseTX
full-duplex.
ifconfig reports:
msk0:
flags=608843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST,SATELLITE,LAN_NET>
metric 0 mtu 1500
options=40018<VLAN_MTU,VLAN_HWTAGGING>
The flags and options show that you're using a very customized
driver, right?
ether 00:03:2d:09:94:52
inet 192.168.122.7 netmask 0xffffff00 broadcast 192.168.122.255
media: Ethernet autoselect (100baseTX<half-duplex>)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Resolved duplex is half so I guess it would be normal to see
dropped frames which may be triggered by collision.
You have a duplex mis-match. If you are hard setting the remote end to
full, the local end must also be configured to full. Auto-configuration
of duplex requires that both ends run auto-config. When one end is set
to not do auto-config, the other end SHOULD always set to half-duplex.
This is part of 802.3 that is a carry-over from the days when hubs and
coax dominated, so the default was declared to be half. Since so much
hardware now exists with that default, changing it will never happen.
Either set your computer to full duplex or turn on auto-configuration
on the Cisco.
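
The fallback rule described above can be sketched in a few lines of C.
This is only an illustration: `struct port_cfg`, `resolve_duplex()` and
`duplex_mismatch()` are hypothetical names, not part of msk(4) or any
FreeBSD API, and the sketch assumes that two auto-negotiating ends both
advertise full duplex.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum duplex { DUPLEX_HALF, DUPLEX_FULL };

/* Hypothetical description of one end of a link; not a FreeBSD structure. */
struct port_cfg {
	bool        autoneg;  /* true if this end runs auto-negotiation */
	enum duplex forced;   /* duplex used when autoneg is false */
};

/*
 * Duplex that `self` ends up with when connected to `peer`.
 * Assumption: when both ends auto-negotiate, both advertise full duplex.
 */
enum duplex
resolve_duplex(const struct port_cfg *self, const struct port_cfg *peer)
{
	assert(self != NULL && peer != NULL);
	if (!self->autoneg)
		return (self->forced);   /* hard-set: use the configured value */
	if (!peer->autoneg)
		return (DUPLEX_HALF);    /* peer is silent: fall back to half */
	return (DUPLEX_FULL);            /* both negotiate: agree on full */
}

/* True when the two ends of the link disagree about duplex. */
bool
duplex_mismatch(const struct port_cfg *a, const struct port_cfg *b)
{
	return (resolve_duplex(a, b) != resolve_duplex(b, a));
}
```

With the switch hard-set to full and the NIC left on autoselect, the NIC
side resolves to half and `duplex_mismatch()` returns true, which matches
the "100baseTX <half-duplex>" shown by ifconfig. Forcing both ends to
full, or letting both auto-negotiate, removes the mismatch.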
Very little hardware now in service fails to auto-config correctly,
but the practice of locking down the duplex setting became common in
the early days of full-duplex when it was not yet a standard and many
Ethernet chip-sets didn't play nice with others. Things would be much
better if people would just stop hard-setting the duplex, but old, old
habits and memes die hard.
Also, contrary to common belief, collisions are NOT errors. They are a
normal part of half-duplex Ethernet operation and do NOT result in
packets being dropped. Only "excessive collisions" do and they ARE a
real error and a clear indication that something is wrong.
Hi,
Thanks for the feedback and detailed information.
I have to clarify here; I get the issue even with forcing full duplex.
The driver modifications are minor and shouldn't affect link negotiation
or link state. It's also interesting that this problem only shows up with
Cisco switches; with other types of switches the issue does
not come up.
Right. It would also be interesting to see the MAC statistics of the Cisco
switch and check that both parties agree on the resolved
speed/duplex/flow-control.
Also, nothing like collisions or missed/dropped packets can be found in
msk_stats (see below) that somewhat relate to the issue I'm seeing. One
thing interesting is the ifm_status value from msk_mediastatus. It often
changes from (IFM_AVALID | IFM_ACTIVE) which is 0x3 to 0x1 (IFM_AVALID)
for no reason I can tell.
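
For anyone logging these values while debugging, here is a small sketch
of how the two bits can be decoded and how link drops can be counted in a
series of samples. The `EX_` constants are local copies of the 0x1 and
0x2 values quoted above; `decode_status()` and `count_link_drops()` are
hypothetical helpers, not driver functions.

```c
#include <assert.h>
#include <stddef.h>

/* Local copies of the two bits quoted above. */
#define EX_IFM_AVALID	0x1	/* status field is valid */
#define EX_IFM_ACTIVE	0x2	/* link is up */

enum link_state { LINK_UNKNOWN, LINK_DOWN, LINK_UP };

/* Map an ifm_status value to a link state. */
enum link_state
decode_status(int ifm_status)
{
	if ((ifm_status & EX_IFM_AVALID) == 0)
		return (LINK_UNKNOWN);
	return ((ifm_status & EX_IFM_ACTIVE) ? LINK_UP : LINK_DOWN);
}

/* Count transitions from LINK_UP to anything else in a sample series. */
size_t
count_link_drops(const int *samples, size_t n)
{
	size_t drops = 0;

	assert(samples != NULL || n == 0);
	for (size_t i = 1; i < n; i++) {
		if (decode_status(samples[i - 1]) == LINK_UP &&
		    decode_status(samples[i]) != LINK_UP)
			drops++;
	}
	return (drops);
}
```

A value of 0x3 decodes to LINK_UP and 0x1 to LINK_DOWN, so a logged
series that alternates between the two shows one drop per 0x3 to 0x1
transition.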
Hmm, that indicates the driver lost the established link. msk(4) will
detect this condition and stop the RX/TX MACs until it knows the PHY has
re-established a link. This may be the reason why you see occasional
packet drops. However, I don't know why the PHY loses an established link
in the middle of operation.
Yes, I am convinced this loss of link is related to the packet drops as
well. At this point we can safely discard cabling issues or router
issues (physical ones that is) since the same happens on a different
network with different cables.
From the code in e1000phy_status:
static void
e1000phy_status(struct mii_softc *sc)
{
	struct mii_data *mii = sc->mii_pdata;
	int bmcr, bmsr, ssr;

	mii->mii_media_status = IFM_AVALID;
	mii->mii_media_active = IFM_ETHER;

	bmsr = PHY_READ(sc, E1000_SR) | PHY_READ(sc, E1000_SR);
	bmcr = PHY_READ(sc, E1000_CR);
	ssr = PHY_READ(sc, E1000_SSR);

	if (bmsr & E1000_SR_LINK_STATUS)
		mii->mii_media_status |= IFM_ACTIVE;
I can see the bmsr & E1000_SR_LINK_STATUS check failing when the problem
occurs. As a side note, why are we ORing the same call twice? Isn't that
the same thing as calling it once:
bmsr = PHY_READ(sc, E1000_SR) | PHY_READ(sc, E1000_SR);
The E1000_SR_LINK_STATUS bit is latched low so it should be read
twice. If you want to read once use E1000_SSR_LINK bit of
E1000_SSR register but I remember that bit was not reliable on some
PHY models.
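
To illustrate why a latched-low bit needs two reads, here is a toy model
of such a status register. Everything in it is hypothetical
(`struct sim_phy`, `sim_link_set()`, `sim_read_sr()`); it does not talk to
real hardware and only assumes the usual MII behaviour where the link
bit (bit 2, 0x0004) stays low after a link drop until the register has
been read once.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SIM_SR_LINK_STATUS	0x0004	/* link status bit of the status register */

/* Hypothetical PHY model, for illustration only. */
struct sim_phy {
	bool link_now;		/* current physical link state */
	bool latched_down;	/* a link drop happened since the last read */
};

/* Change the physical link state; a drop sets the latch. */
void
sim_link_set(struct sim_phy *phy, bool up)
{
	assert(phy != NULL);
	phy->link_now = up;
	if (!up)
		phy->latched_down = true;
}

/* Read the status register; reading clears the latch. */
uint16_t
sim_read_sr(struct sim_phy *phy)
{
	uint16_t val = 0;

	assert(phy != NULL);
	if (phy->link_now && !phy->latched_down)
		val |= SIM_SR_LINK_STATUS;
	phy->latched_down = false;
	return (val);
}

/* Same pattern as the driver: OR two back-to-back reads. */
uint16_t
sim_read_sr_twice(struct sim_phy *phy)
{
	uint16_t first = sim_read_sr(phy);

	return (first | sim_read_sr(phy));
}
```

If the link drops and comes back between two polls, a single
`sim_read_sr()` still reports link down because of the latch, while the
second read (and therefore the ORed value) reflects the current state.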
Thanks for the explanation and the alternative. The ssr register seems
to give me the right bit (E1000_SSR_LINK) but it also gives me an extra
bit 0x0100 that is not defined in e1000phyreg.h. Any idea what that bit
would be/means?
By chance, does your back-ported driver include r222219?
If yes, did you cold boot after applying the change?
Warm boot does have effect.
I do have this patch in the back-ported driver and due to several
reasons I didn't cold boot the appliance. We will give that a try and see.
To be more precise, I have included msk patches up to r222516.
Thanks!
Karim.
As requested here is my msk0 stats output right after the problem showed up:
dev.msk.0.stats.rx.ucast_frames: 58103886
dev.msk.0.stats.rx.bcast_frames: 0
dev.msk.0.stats.rx.pause_frames: 0
dev.msk.0.stats.rx.mcast_frames: 0
dev.msk.0.stats.rx.crc_errs: 0
dev.msk.0.stats.rx.good_octets: 5927739395
dev.msk.0.stats.rx.bad_octets: 0
dev.msk.0.stats.rx.frames_64: 53
dev.msk.0.stats.rx.frames_65_127: 58091128
dev.msk.0.stats.rx.frames_128_255: 12545
dev.msk.0.stats.rx.frames_256_511: 40
dev.msk.0.stats.rx.frames_512_1023: 89
dev.msk.0.stats.rx.frames_1024_1518: 31
dev.msk.0.stats.rx.frames_1519_max: 0
dev.msk.0.stats.rx.frames_too_long: 0
dev.msk.0.stats.rx.jabbers: 0
dev.msk.0.stats.rx.overflows: 0
dev.msk.0.stats.tx.ucast_frames: 58104799
dev.msk.0.stats.tx.bcast_frames: 53
dev.msk.0.stats.tx.pause_frames: 0
dev.msk.0.stats.tx.mcast_frames: 0
dev.msk.0.stats.tx.octets: 5927439760
dev.msk.0.stats.tx.frames_64: 547
dev.msk.0.stats.tx.frames_65_127: 58091680
dev.msk.0.stats.tx.frames_128_255: 12576
dev.msk.0.stats.tx.frames_256_511: 17
dev.msk.0.stats.tx.frames_512_1023: 1
dev.msk.0.stats.tx.frames_1024_1518: 32
dev.msk.0.stats.tx.frames_1519_max: 0
dev.msk.0.stats.tx.colls: 2
dev.msk.0.stats.tx.late_colls: 2
dev.msk.0.stats.tx.excess_colls: 0
dev.msk.0.stats.tx.multi_colls: 0
dev.msk.0.stats.tx.single_colls: 2
dev.msk.0.stats.tx.underflows: 0
The 2 collisions occurred before I forced the interface to full-duplex.
Karim.
_______________________________________________
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"