** Description changed:

  Binary package hint: lvm2
  
  We have a script that automates the creation and removal of LVM
  snapshots on our VMware servers. Three times now we have had machines go
  down when the snapshot was removed. I have logs showing the machine
- going down immediately after the snapshot removal script (triggered and
- logged by our backup software).
+ going down immediately after the snapshot removal script has fired
+ (triggered and logged by our backup software).
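
  For reference, the automation is simply a create / back up / remove cycle
  around lvcreate -s and lvremove. The actual script is not attached to this
  report; the following is only a minimal sketch of that cycle (volume and
  snapshot names match the lvscan output further down; the mount point and
  options are assumptions):

  #!/bin/sh
  # sketch only -- not the script referred to above
  set -e
  VG=vg_sys
  LV=lv_vmware
  SNAP=${LV}_snap
  MNT=/mnt/$SNAP            # assumed staging mount point for the backup

  # take a 5 GB copy-on-write snapshot of the origin volume
  lvcreate -s -L 5G -n "$SNAP" "/dev/$VG/$LV"
  mkdir -p "$MNT"
  mount -o ro "/dev/$VG/$SNAP" "$MNT"

  # ... backup software reads from $MNT here ...

  umount "$MNT"
  # snapshot removal -- the point at which the machines have crashed
  lvremove -f "/dev/$VG/$SNAP"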
  
  This has occurred on both an HP Pavilion Desktop (uniprocessor, single
  disk) and a Sun Sunfire V60X (SMP, md raid1).
  
  The crash leaves the LV in an inconsistent state with device nodes and
  snapshot names completely out of sync. On all occasions I have been able
  to recover the volume by following the steps below:
  
  Ubuntu 6.06 LTS, LVM Hard Crash repair
  --------------------------------------
  
  observe kernel oops.
  perform hard reset.
  machine comes back up with md2 array dirty, starting background reconstruction.
  fails on mounting partitions, boots to single.
  login at console.
  mount /usr
  vi /etc/fstab
  comment out snapshotted lvm partition (/vmware), see the example line after these steps
  exit. System boots to multi user.
  open ssh shell to system
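
  For reference, the fstab change above just comments out the entry for the
  snapshotted volume. The exact line is not reproduced here; it would look
  roughly like this (filesystem type and options are assumptions):

  # /dev/vg_sys/lv_vmware  /vmware  ext3  defaults  0  2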
  
  **** some info before we begin
  
  [EMAIL PROTECTED]:~# lvscan
    ACTIVE            '/dev/vg_sys/lv_tmp' [4.00 GB] inherit
    ACTIVE            '/dev/vg_sys/lv_swap' [4.00 GB] inherit
    ACTIVE            '/dev/vg_sys/lv_var' [1.00 GB] inherit
    ACTIVE            '/dev/vg_sys/lv_usr' [1.00 GB] inherit
    inactive Original '/dev/vg_sys/lv_vmware' [52.00 GB] contiguous
    inactive Snapshot '/dev/vg_sys/lv_vmware_snap' [5.00 GB] inherit
  
  [EMAIL PROTECTED]:~# pvscan
    PV /dev/md2   VG vg_sys   lvm2 [67.33 GB / 340.00 MB free]
    Total: 1 [67.33 GB] / in use: 1 [67.33 GB] / in no VG: 0 [0   ]
  
  [EMAIL PROTECTED]:~# cat /proc/mdstat
  Personalities : [raid1]
  md2 : active raid1 sda3[0] sdb3[1]
        70605568 blocks [2/2] [UU]
        [=====>...............]  resync = 26.8% (18967232/70605568) finish=18.0min speed=47688K/sec
  
  md1 : active raid1 sda2[0] sdb2[1]
        979840 blocks [2/2] [UU]
  
  md0 : active raid1 sda1[0] sdb1[1]
        96256 blocks [2/2] [UU]
  
  unused devices: <none>
  
  [EMAIL PROTECTED]:~# uname -a
  Linux anvil 2.6.15-27-server #1 SMP Sat Sep 16 02:57:21 UTC 2006 i686 GNU/Linux
  
  [EMAIL PROTECTED]:~# ls /dev/mapper/
  control  vg_sys-lv_swap  vg_sys-lv_tmp  vg_sys-lv_usr  vg_sys-lv_var  vg_sys-lv_vmware  vg_sys-lv_vmware-real
  
  [EMAIL PROTECTED]:~# ls /dev/vg_sys/
  lv_swap  lv_tmp  lv_usr  lv_var
  
  ** let's repair the system
  
  * create some missing device nodes
  [EMAIL PROTECTED]:~# vgmknodes
  
  * fix up the device mapper mess
  [EMAIL PROTECTED]:~# mv /dev/mapper/vg_sys-lv_vmware /dev/mapper/vg_sys-lv_vmware_snap
  [EMAIL PROTECTED]:~# mv /dev/mapper/vg_sys-lv_vmware-real /dev/mapper/vg_sys-lv_vmware
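
  (An alternative, not tried here, would be to inspect and rename the stale
  mappings at the device-mapper level instead of moving the nodes by hand,
  along these lines; moving the nodes as above is what actually worked:)

  dmsetup ls
  dmsetup rename vg_sys-lv_vmware vg_sys-lv_vmware_snap
  dmsetup rename vg_sys-lv_vmware-real vg_sys-lv_vmware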
  
  * check that our fs still exists
  [EMAIL PROTECTED]:~# fsck /dev/mapper/vg_sys-lv_vmware
  fsck 1.38 (30-Jun-2005)
  e2fsck 1.38 (30-Jun-2005)
  /dev/mapper/vg_sys-lv_vmware: recovering journal
  /dev/mapper/vg_sys-lv_vmware: clean, 75/6815744 files, 10826547/13631488 blocks
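
  (If in doubt at this point, a read-only pass can be run first; e2fsck -n
  answers "no" to every prompt so nothing on the volume is modified:)

  e2fsck -n /dev/mapper/vg_sys-lv_vmware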
  
  * remove the snapshot
  [EMAIL PROTECTED]:~# lvremove /dev/vg_sys/lv_vmware_snap
    Logical volume "lv_vmware_snap" successfully removed
  
  * re-enable vmware lvm partition
  [EMAIL PROTECTED]:~# vi /etc/fstab
  [EMAIL PROTECTED]:~# touch /forcefsck
  [EMAIL PROTECTED]:~# reboot
  ** system fscks and boots normally.

-- 
LVM Snapshot removal causes intermittent kernel panic
https://launchpad.net/bugs/71567
