On 24/11/2011 10:07, Kees Jan Koster wrote:
> This seems to be local to my machine. Here is another reason why I
> say that: I can reliably transmit data when I bind to the aliased IP
> address: If I use mtr to measure packet loss from saffron (the stricken
> machine) to cumin (another machine in a different data center) I see the
> following:
>
> saffron (ip address a) -> cumin: packet loss
> saffron (ip address b) -> cumin: no packet loss
>
> cumin -> saffron (ip address a): packet loss
> cumin -> saffron (ip address b): no packet loss
>
> This is consistent from running mtr for 5 minutes straight. This to
> me shows that the hardware is fine. Using the alias IP address I can
> run with no packet loss for as long as I like.
>
> Sooo.... Now what? I am completely at a loss. :-/

Hmm... I wouldn't dismiss hardware problems just yet. Earlier you
showed the ifconfig output for your problem machine:

> [kjkoster@saffron ~]$ ifconfig bge0
> bge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>       options=8009b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LINKSTATE>
>       ether 00:e0:81:32:ed:b4
>       inet 91.196.169.165 netmask 0xfffffff8 broadcast 91.196.169.167
>       inet 91.196.169.166 netmask 0xffffffff broadcast 91.196.169.166
>       media: Ethernet autoselect (100baseTX <full-duplex,flowcontrol,rxpause,txpause>)
>       status: active

The two addresses differ only in their two lowest bits: .165 is odd
and .166 is even. Can you try temporarily using two even-numbered
addresses and then two odd-numbered addresses and repeat your mtr
tests? If the packet loss problem correlates with whether the address
is even or odd, then I think that's pretty good evidence for a dud
network interface: a one-bit problem in a memory register somewhere,
occasionally flipping the least significant bit in the address to 0.

Another test would be to swap the configuration order (i.e. make .166
the primary address and .165 the alias) -- if it's always the first
configured address that has problems, again that indicates memory
trouble in the hardware.

Are these NICs built into your motherboard? If so, they will almost
certainly share a PHY, which is where the problem would be, and why
swapping the cables between interfaces made no difference.
Unfortunately, in that case, to fix the problem you'll either have to
swap out the motherboard or add a separate NIC to your system.
Hopefully the system is still under warranty.

	Cheers,

	Matthew

-- 
Dr Matthew J Seaman MA, D.Phil.                   7 Priory Courtyard
                                                  Flat 3
PGP: http://www.infracaninophile.co.uk/pgpkey     Ramsgate
JID: matt...@infracaninophile.co.uk               Kent, CT11 9PW
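
When picking addresses for the even/odd test, it may help to check exactly which bits differ between any two candidates and what parity each has. The following is only a minimal sketch, assuming Python 3 and its standard ipaddress module; the helper names differing_bits and parity are hypothetical, and the addresses are the two from the ifconfig output above.

```python
import ipaddress

def differing_bits(addr_a, addr_b):
    """Return the bit positions (0 = least significant) where two IPv4 addresses differ."""
    a = int(ipaddress.IPv4Address(addr_a))
    b = int(ipaddress.IPv4Address(addr_b))
    diff = a ^ b  # XOR leaves a 1 in every position where the addresses disagree
    return [pos for pos in range(32) if (diff >> pos) & 1]

def parity(addr):
    """Return 'even' or 'odd' according to the least significant bit of the address."""
    return "even" if int(ipaddress.IPv4Address(addr)) % 2 == 0 else "odd"

if __name__ == "__main__":
    primary, alias = "91.196.169.165", "91.196.169.166"
    for addr in (primary, alias):
        print(addr, "is", parity(addr))
    print("differing bit positions:", differing_bits(primary, alias))
```

Running the same check on each pair of test addresses (two even, then two odd) shows in advance whether the pair actually isolates the least significant bit.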