Re: [ceph-users] Ceph OSD daemon causes network card issues

2019-07-18 Thread Konstantin Shalygin
On 7/18/19 7:43 PM, Geoffrey Rhodes wrote: Sure, also attached. Try to disable flow control via `ethtool -A <dev> rx off tx off`. k
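For reference, ethtool sets flow control (pause frames) with `-A`/`--pause`, while `-K` toggles offload features. A minimal dry-run sketch of the suggestion follows. The interface name is taken from the `ethtool -S` output elsewhere in this thread. The `run` and `disable_pause` helpers and the `DRY_RUN` switch are hypothetical names added for illustration, not anything posted by the participants.

```shell
#!/bin/sh
# Sketch only: interface name assumed from the thread; override with IFACE=...
IFACE="${IFACE:-enp3s0f0}"
# DRY_RUN=1 (default) only prints the command; set DRY_RUN=0 to execute it as root.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Disable RX and TX pause frames on the given interface.
# Current settings can be inspected with: ethtool -a <dev>
disable_pause() {
    run ethtool -A "$1" rx off tx off
}

disable_pause "$IFACE"
```

Settings made this way do not survive a reboot or driver reload, so they would need to be reapplied by whatever network configuration mechanism the node uses.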

Re: [ceph-users] Ceph OSD daemon causes network card issues

2019-07-18 Thread Paul Emmerich
Hi, Intel 82576 is bad. I've seen quite a few problems with these older igb family NICs, but losing the PCIe link is a new one. I usually see them getting stuck with a message like "tx queue X hung, resetting device..." Try to disable offloading features using ethtool, that sometimes helps w
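As a rough illustration of the offloading suggestion: the message does not say which features to turn off, so the list below (TSO, GSO, GRO) is an assumption, as are the `run` and `disable_offloads` helper names and the `DRY_RUN` switch. The interface name comes from the `ethtool -S` output in this thread.

```shell
#!/bin/sh
# Sketch only: interface name assumed from the thread; override with IFACE=...
IFACE="${IFACE:-enp3s0f0}"
# DRY_RUN=1 (default) only prints the commands; set DRY_RUN=0 to execute them as root.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Turn off a few common offload features, one ethtool call per feature so that
# a feature the driver cannot change does not block the others.
# The feature list is an assumption; check `ethtool -k <dev>` for what the NIC reports.
disable_offloads() {
    for feat in tso gso gro; do
        run ethtool -K "$1" "$feat" off
    done
}

disable_offloads "$IFACE"
```

Running `ethtool -k enp3s0f0` before and after shows which features actually changed; entries marked `[fixed]` cannot be toggled.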

Re: [ceph-users] Ceph OSD daemon causes network card issues

2019-07-18 Thread Geoffrey Rhodes
Sure, also attached.

    cephuser@cephnode6:~$ ethtool -S enp3s0f0
    NIC statistics:
         rx_packets: 3103528
         tx_packets: 20954382
         rx_bytes: 1385006975
         tx_bytes: 30063866207
         rx_broadcast: 8
         tx_broadcast: 2
         rx_multicast: 14098
         tx_multicast: 476
         multicast: 14098

Re: [ceph-users] Ceph OSD daemon causes network card issues

2019-07-18 Thread Konstantin Shalygin
I've been having an issue since upgrading my cluster to Mimic 6 months ago (previously installed with Luminous 12.2.1). All nodes that have the same PCIe network card seem to lose network connectivity randomly (frequency ranges from a few days to weeks per host node). The affected nodes only have

[ceph-users] Ceph OSD daemon causes network card issues

2019-07-18 Thread Geoffrey Rhodes
Hi Cephers, I've been having an issue since upgrading my cluster to Mimic 6 months ago (previously installed with Luminous 12.2.1). All nodes that have the same PCIe network card seem to lose network connectivity randomly (frequency ranges from a few days to weeks per host node). The affected nod