Hello,

After struggling with this issue for almost 2 months, I decided to ask here
for help.
We run a Kubernetes cluster on CoreOS nodes. Only 3 nodes in the cluster
(the masters) have iSCSI attached (from IBM SVC). Problem: the 3 nodes that
have iSCSI attached reboot once every 2-3 days, and this happens regularly.
Usually there is nothing in the logs before the crash, but once I was able
to catch this:

Dec 06 04:43:02 0000b0112d294c69 kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000078
Dec 06 04:43:02 0000b0112d294c69 kernel: IP: iscsi_requeue_task+0x169/0x950 [libiscsi]
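
In case it helps anyone triaging this, here is a minimal sketch of how the
faulting location can be pulled out of the "IP:" line above. The helper name
parse_oops_ip and the regular expression are my own illustration, not part of
open-iscsi or any kernel tooling, and the pattern is only assumed to match
the log format shown in this message.

```python
import re

# Hypothetical helper for illustration only; the pattern is an assumption
# based on the single "IP:" line quoted in this message.
OOPS_IP = re.compile(
    r"kernel: IP: (?P<symbol>\w+)\+(?P<offset>0x[0-9a-f]+)/(?P<size>0x[0-9a-f]+)"
    r"(?: \[(?P<module>\w+)\])?"
)


def parse_oops_ip(line):
    """Return symbol, offset, size and module from an oops 'IP:' line, or None."""
    m = OOPS_IP.search(line)
    if m is None:
        return None
    return {
        "symbol": m.group("symbol"),
        "offset": int(m.group("offset"), 16),  # byte offset into the function
        "size": int(m.group("size"), 16),      # total size of the function
        "module": m.group("module"),           # None if the symbol is built in
    }


line = (
    "Dec 06 04:43:02 0000b0112d294c69 kernel: IP: "
    "iscsi_requeue_task+0x169/0x950 [libiscsi]"
)
info = parse_oops_ip(line)
print(info)
```

The symbol and offset could then be resolved to a source line with a tool
such as scripts/faddr2line from the kernel tree, assuming a libiscsi.ko
with debug info matching the running kernel is available.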

We don't observe anything specific before the crashes: no performance peaks
and no particular operations in the logs. The issue appeared only after we
started to use iSCSI.
In case it is related, we run kubelet (the service that runs containers in
Kubernetes and also manages iSCSI) in a container with the following
configuration [1].

Output from iscsiadm -m session -P3 [2]
Output from multipath -ll [3]
Output from ethtool [4]
Similar bug in CoreOS [5]
Possibly related [6]

CoreOS 1576.4.0 (reproduced on 1409.7.0 too)
Kernel 4.13.6
Server Lenovo x3650 M5
NIC Intel 82599ES

1 - https://gist.github.com/r7vme/1637b78b3f28da9e551243c5bd1b0613
2 - https://gist.github.com/r7vme/f84b75a392b4c9c6e3f0f8a52e501611
3 - https://gist.github.com/r7vme/5214a070e4c4964067b55c12fd06aeef
4 - https://gist.github.com/r7vme/9a696e5e90931db9515f24a52647db12
5 - https://github.com/coreos/bugs/issues/2167
6 - https://groups.google.com/forum/#!topic/open-iscsi/-HOsGZ_GT9I
