Hi all!

A few minutes ago I experienced a serious crash on one of my
computers running Debian Jessie:

-------------------------------------------------------------------------------
Oct 31 17:17:32 ss01 kernel: [694807.832919] general protection fault: 0000 
[#1] SMP
Oct 31 17:17:32 ss01 kernel: [694807.832946] Modules linked in: vhost_net vhost 
macvtap macvlan tun binfmt_misc nfsd auth_rpcgss oid_registry nfs_acl nfs lockd 
fscache sunrpc bridge stp llc snd_hda_codec_analog snd_hda_codec_generic arc4 
rtl8187 nouveau kvm_amd ppdev eeprom_93cx6 kvm mac80211 pcspkr cfg80211 
snd_hda_intel serio_raw mxm_wmi wmi video ttm snd_hda_controller edac_mce_amd 
edac_core drm_kms_helper rfkill snd_hda_codec k8temp drm joydev nv_tco 
i2c_algo_bit evdev snd_hwdep snd_pcm shpchp parport_pc snd_timer i2c_nforce2 
snd parport soundcore processor asus_atk0110 button adt7475 hwmon_vid i2c_core 
firewire_sbp2 loop autofs4 ext4 crc16 mbcache jbd2 dm_mod raid456 
async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq raid1 
md_mod sr_mod cdrom hid_generic usbhid hid sg sd_mod crc_t10dif 
crct10dif_generic crct10dif_common psmouse ohci_pci sata_sil24 forcedeth 
firewire_ohci firewire_core ata_generic crc_itu_t floppy ohci_hcd ehci_pci 
ehci_hcd pata_amd sata_nv 8139too 8139cp mii libata fan scsi_mod thermal 
thermal_sys usbcore usb_common
Oct 31 17:17:32 ss01 kernel: [694807.833197] CPU: 1 PID: 22 Comm: khugepaged 
Not tainted 3.16.0-4-amd64 #1 Debian 3.16.36-1+deb8u2
Oct 31 17:17:32 ss01 kernel: [694807.833209] Hardware name: System manufacturer 
System Product Name/M2N32-SLI DELUXE, BIOS ASUS M2N32-SLI DELUXE ACPI BIOS 
Revision 2205 03/02/2009
Oct 31 17:17:32 ss01 kernel: [694807.833223] task: ffff8801b8b38b60 ti: 
ffff8801b8b44000 task.ti: ffff8801b8b44000
Oct 31 17:17:32 ss01 kernel: [694807.833234] RIP: 0010:[<ffffffff81516ee1>]  
[<ffffffff81516ee1>] down_read+0x11/0x20
Oct 31 17:17:32 ss01 kernel: [694807.833248] RSP: 0018:ffff8801b8b478e0  
EFLAGS: 00010246
Oct 31 17:17:32 ss01 kernel: [694807.833256] RAX: 3553423839545239 RBX: 
3553423839545239 RCX: 0000000000000000
Oct 31 17:17:32 ss01 kernel: [694807.833264] RDX: 0000000000000001 RSI: 
0000000000000000 RDI: 3553423839545239
Oct 31 17:17:32 ss01 kernel: [694807.833273] RBP: ffff8801b8b47950 R08: 
ffff8800ba69c200 R09: 0000000000000000
Oct 31 17:17:32 ss01 kernel: [694807.833281] R10: ffff8800ba69c000 R11: 
0000000000000001 R12: 0000000000000000
Oct 31 17:17:32 ss01 kernel: [694807.833289] R13: 0000000000000000 R14: 
ffff880100000000 R15: 0000000000000001
Oct 31 17:17:32 ss01 kernel: [694807.833299] FS:  00007f92ef5f5700(0000) 
GS:ffff8801bfc80000(0000) knlGS:0000000000000000
Oct 31 17:17:32 ss01 kernel: [694807.833308] CS:  0010 DS: 0000 ES: 0000 CR0: 
000000008005003b
Oct 31 17:17:32 ss01 kernel: [694807.833315] CR2: 00007f51b75d4000 CR3: 
0000000001813000 CR4: 00000000000007e0
Oct 31 17:17:32 ss01 kernel: [694807.833324] Stack:
Oct 31 17:17:32 ss01 kernel: [694807.833328]  ffff88005c5e8b80 ffffffff8118ec20 
ffffea0001ce4088 ffff880199f9f538
Oct 31 17:17:32 ss01 kernel: [694807.833341]  ffffea0001ce4088 ffff8801b8b479e0 
0000000000000001 0000000000000000
Oct 31 17:17:32 ss01 kernel: [694807.833354]  ffffea0001ce40a8 ffffea0001ce4088 
ffffffff811773ef 0000000100000001
Oct 31 17:17:32 ss01 kernel: [694807.833366] Call Trace:
Oct 31 17:17:32 ss01 kernel: [694807.833375]  [<ffffffff8118ec20>] ? 
rmap_walk_ksm+0x80/0x180
Oct 31 17:17:32 ss01 kernel: [694807.833385]  [<ffffffff811773ef>] ? 
page_referenced+0x9f/0x110
Oct 31 17:17:32 ss01 kernel: [694807.833394]  [<ffffffff811755b0>] ? 
__page_check_address+0x1c0/0x1c0
Oct 31 17:17:32 ss01 kernel: [694807.833403]  [<ffffffff81176a30>] ? 
page_get_anon_vma+0x70/0x70
Oct 31 17:17:32 ss01 kernel: [694807.833414]  [<ffffffff81152bad>] ? 
shrink_active_list+0x1dd/0x380
Oct 31 17:17:32 ss01 kernel: [694807.833423]  [<ffffffff81153369>] ? 
shrink_lruvec+0x619/0x6a0
Oct 31 17:17:32 ss01 kernel: [694807.833432]  [<ffffffff8109818b>] ? 
try_to_wake_up+0x1cb/0x2f0
Oct 31 17:17:32 ss01 kernel: [694807.833441]  [<ffffffff8109818b>] ? 
try_to_wake_up+0x1cb/0x2f0
Oct 31 17:17:32 ss01 kernel: [694807.833450]  [<ffffffff81153464>] ? 
shrink_zone+0x74/0x1b0
Oct 31 17:17:32 ss01 kernel: [694807.833459]  [<ffffffff8115399d>] ? 
do_try_to_free_pages+0x12d/0x520
Oct 31 17:17:32 ss01 kernel: [694807.833469]  [<ffffffff81153e55>] ? 
try_to_free_pages+0xc5/0x190
Oct 31 17:17:32 ss01 kernel: [694807.833478]  [<ffffffff81148b46>] ? 
__alloc_pages_nodemask+0x726/0xb50
Oct 31 17:17:32 ss01 kernel: [694807.833488]  [<ffffffff8119901f>] ? 
khugepaged+0x59f/0x11d0
Oct 31 17:17:32 ss01 kernel: [694807.833497]  [<ffffffff810a9590>] ? 
prepare_to_wait_event+0xf0/0xf0
Oct 31 17:17:32 ss01 kernel: [694807.833507]  [<ffffffff81198a80>] ? 
maybe_pmd_mkwrite+0x20/0x20
Oct 31 17:17:32 ss01 kernel: [694807.833516]  [<ffffffff810894bd>] ? 
kthread+0xbd/0xe0
Oct 31 17:17:32 ss01 kernel: [694807.833524]  [<ffffffff81089400>] ? 
kthread_create_on_node+0x180/0x180
Oct 31 17:17:32 ss01 kernel: [694807.833534]  [<ffffffff815184d8>] ? 
ret_from_fork+0x58/0x90
Oct 31 17:17:32 ss01 kernel: [694807.833543]  [<ffffffff81089400>] ? 
kthread_create_on_node+0x180/0x180
Oct 31 17:17:32 ss01 kernel: [694807.833550] Code: de 48 89 07 66 b8 00 02 c6 
47 18 01 48 89 47 08 48 8b 7f 10 e9 e1 13 b8 ff 90 66 66 66 66 90 53 48 89 fb 
e8 32 e2 ff ff 48 89 d8 <f0> 48 ff 00 79 05 e8 24 3a da ff 5b c3 66 90 66 66 66 
66 90 53 
Oct 31 17:17:32 ss01 kernel: [694807.833646] RIP  [<ffffffff81516ee1>] 
down_read+0x11/0x20
Oct 31 17:17:32 ss01 kernel: [694807.833656]  RSP <ffff8801b8b478e0>
Oct 31 17:17:32 ss01 kernel: [694807.859234] ---[ end trace 9255f27c78cbeb8f 
]---
-------------------------------------------------------------------------------

At that point I lost SSH access to that computer and to the virtual machines
it hosts.
The only way to get access was directly at the machine, with a keyboard.
Although I managed to log in, when I ran "ps" the cursor just blinked
and the command never returned any output.


I searched the Internet for this failure but found nothing conclusive.
I did find reports of the same type of error (general protection fault), but
each one had a different cause, so I cannot see any pattern.

In this case it seems that the fault occurred inside the down_read function
(RIP is at down_read+0x11, in the khugepaged thread).
But I'm not sure whether this is a kernel bug or something related to hardware.
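
One detail in the register dump may be worth checking: RAX, RBX and RDI all
hold 3553423839545239, which does not look like a kernel pointer (compare the
ffff88... values elsewhere in the dump). On x86-64, dereferencing a
non-canonical address raises a general protection fault rather than a page
fault. The sketch below is only a rough sanity check, not a diagnosis. It
assumes 48-bit virtual addresses, and the helper names are my own. It tests
whether the value is canonical and prints its bytes in memory order as text.

```python
# Rough sketch: inspect the value seen in RAX/RBX/RDI in the oops above.
# Assumption: 48-bit virtual addresses (4-level paging) on this machine.
# The helper names below are hypothetical, chosen only for this example.

def is_canonical(addr: int, va_bits: int = 48) -> bool:
    """Return True if bits 63..(va_bits - 1) are all 0 or all 1."""
    top = addr >> (va_bits - 1)
    all_ones = (1 << (64 - va_bits + 1)) - 1
    return top == 0 or top == all_ones

def as_ascii(value: int) -> str:
    """Render the 8 bytes of a 64-bit value in little-endian (memory) order."""
    raw = value.to_bytes(8, "little")
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)

RAX = 0x3553423839545239    # value from the RAX/RBX/RDI lines of the dump
TASK = 0xffff8801b8b38b60   # "task:" pointer from the same dump, for comparison

print(is_canonical(RAX))    # False
print(is_canonical(TASK))   # True
print(as_ascii(RAX))        # 9RT98BS5
```

All eight bytes are printable ASCII, so the pointer may have been overwritten
with string data. That alone does not tell me whether the cause is a software
bug or bad memory.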

Any ideas that could shed more light on this issue would be appreciated.

Thanks in advance.


Kind regards,
Daniel
