Hello, after struggling with this issue for almost 2 months, I decided to ask here for help. We run a Kubernetes cluster on CoreOS nodes. Only 3 nodes in the cluster (the masters) have iSCSI attached (from IBM SVC). Problem: the 3 nodes that have iSCSI attached reboot once every 2-3 days, and this happens regularly. Usually there is nothing in the logs before the crash, but once I was able to catch this:
Dec 06 04:43:02 0000b0112d294c69 kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000078
Dec 06 04:43:02 0000b0112d294c69 kernel: IP: iscsi_requeue_task+0x169/0x950 [libiscsi]

We don't observe anything specific before the crashes: no performance peaks and no unusual operations in the logs. The issue has been observed only since we started using iSCSI.

In case it is related, we run kubelet (the service that runs containers in Kubernetes and also manages iSCSI) in a container with the following configuration [1].

Output from iscsiadm -m session -P3 [2]
Output from multipath -ll [3]
Output from ethtool [4]
Similar bug in CoreOS [5]
May also be related [6]

CoreOS 1576.4.0 (reproduced on 1409.7.0 too)
Kernel 4.13.6
Server Lenovo x3650 M5
NIC Intel 82599ES

1 - https://gist.github.com/r7vme/1637b78b3f28da9e551243c5bd1b0613
2 - https://gist.github.com/r7vme/f84b75a392b4c9c6e3f0f8a52e501611
3 - https://gist.github.com/r7vme/5214a070e4c4964067b55c12fd06aeef
4 - https://gist.github.com/r7vme/9a696e5e90931db9515f24a52647db12
5 - https://github.com/coreos/bugs/issues/2167
6 - https://groups.google.com/forum/#!topic/open-iscsi/-HOsGZ_GT9I
