I am a bit late, but nonetheless, some comments inline.
On 2/15/23 11:53, Fiona Ebner wrote:
Am 14.02.23 um 19:21 schrieb John Snow:
On Thu, Feb 2, 2023 at 7:08 AM Fiona Ebner <f.eb...@proxmox.com> wrote:
Hi,
over the years we've got 1-2 dozen reports[0] about suddenly
missing/corrupted MBR/partition tables. The issue seems to be very rare
and there was no success in trying to reproduce it yet. I'm asking here
in the hope that somebody has seen something similar.
The only commonality seems to be the use of an ide-hd drive on an ahci bus.
It does seem to happen with both Linux and Windows guests (one of the
reports even mentions FreeBSD), and the backing storage for the VMs includes
ZFS, RBD and LVM-Thin as well as file-based storages.
Relevant part of an example configuration:
-device 'ahci,id=ahci0,multifunction=on,bus=pci.0,addr=0x7' \
-drive 'file=/dev/zvol/myzpool/vm-168-disk-0,if=none,id=drive-sata0,format=raw,cache=none,aio=io_uring,detect-zeroes=on' \
-device 'ide-hd,bus=ahci0.0,drive=drive-sata0,id=sata0' \
The first reports are from before io_uring was used and there are also
reports with writeback cache mode and discard=on,detect-zeroes=unmap.
Some reports say that the issue occurred under high IO load.
Many reports suspect backups of causing the issue. Our backup mechanism
uses backup_job_create() for each drive and runs the jobs sequentially.
It uses a custom block driver as the backup target, which just forwards
the writes to the actual target, which can be a file or our backup server.
(If you really want to see the details, apply the patches in [1] and see
pve-backup.c and block/backup-dump.c).
Of course, the backup job will read sector 0 of the source disk, but I
really can't see where a stray write would happen, why the issue would
trigger so rarely or why seemingly only ide-hd+ahci would be affected.
So again, just asking if somebody has seen something similar or has a
hunch of what the cause might be.
Hi Fiona,
I'm sorry to say that I haven't worked on the block devices (or
backup) for a little while now, so I am not immediately sure what
might be causing this problem. In general, I advise against using AHCI
in production as better performance (and dev support) can be achieved
through virtio.
Yes, we also recommend using virtio-{scsi,blk}-pci to our users and most
do. Still, some use AHCI, I'd guess mostly for Windows, but not only.
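(For comparison, a minimal sketch of how the same drive from the example configuration above could be attached via virtio-blk-pci instead of ahci/ide-hd; the id names and the PCI address are arbitrary here:)
-drive 'file=/dev/zvol/myzpool/vm-168-disk-0,if=none,id=drive-virtio0,format=raw,cache=none,aio=io_uring,detect-zeroes=on' \
-device 'virtio-blk-pci,drive=drive-virtio0,id=virtio0,bus=pci.0,addr=0x8' \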
Still, I am not sure why the combination of AHCI with
backup_job_create() would be corrupting the early sectors of the disk.
It's not clear that backup itself is causing the issue. Some of the
reports do correlate it with backups, but there are no precise timestamps
for when the corruption happened. It might be that the additional IO during
backup is somehow triggering the issue.
Do you have any analysis on how much data gets corrupted? Is it the
first sector only, the first few? Has anyone taken a peek at the
backing storage to see if there are any interesting patterns that can
be observed? (Zeroes, garbage, old data?)
It does seem to be the first sector only, but it's not entirely clear.
Many of the affected users said that after fixing the partition table
with TestDisk, the VMs booted/worked normally again. We only have dumps
for the first MiB of three images, in this case all Windows VMs with Ceph
RBD images.
See below[0] for the dumps. One was a valid MBR and matched the latest
good backup, so that VM didn't boot for some other reason; I'm not sure if
that is even related to this bug. I did not include that one. One was
completely empty, and one contained other data in the first 512 bytes,
then zeroes again, but those zeroes are nothing special AFAIK.
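(For reference, a first-MiB dump like the ones below can be captured with plain dd when the disk is a local block device, or with qemu-img dd for other backends like RBD; the paths and image names here are only illustrative:)
# local zvol from the example configuration above
dd if=/dev/zvol/myzpool/vm-168-disk-0 of=dump-vm-168.raw bs=1M count=1
# Ceph RBD image, assuming qemu-img was built with rbd support
qemu-img dd -f raw -O raw bs=1M count=1 if=rbd:mypool/vm-120-disk-0 of=dump-vm-120.raw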
Unfortunately, we only had direct access to those 3 disks mentioned. I took a
look at them, and for the first MiB they match what @Fiona explained. Right
after the first MiB, all 3 disk images looked normal when compared to a similar
test Windows installation: the start of the NTFS file system. The VMs were
installed in BIOS mode, so there is no ESP.
Cloning the VMs and replacing the first 512 bytes of the disk image from a
known-good earlier backup to restore the partition table seems to be all that was
necessary. Afterward, those VMs were able to boot all the way to the Windows
login screen. That matches the reports we have from the community.
We were not able to confirm the integrity of the rest of the disk, though.
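(A minimal sketch of that repair, assuming a raw clone and a raw known-good backup image; the file names are made up, and conv=notrunc keeps dd from truncating the clone:)
# copy only the first 512 bytes (the MBR) from the known-good backup into the clone
dd if=backup-vm-good.raw of=clone-vm.raw bs=512 count=1 conv=notrunc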
Have any errors or warnings been observed in either the guest or the
host that might offer some clues?
There is a single user who seemed to have hardware issues, and I'd be
inclined to blame those in that case. But none of the other users
reported any errors or warnings, though I can't say if any checked
inside the guests.
Is there any commonality in the storage format being used? Is it
qcow2? Is it network-backed?
There are reports with local ZFS volumes, local LVM-Thin volumes, RBD
images, qcow2 on NFS. So no pattern to be seen.
Apologies for the "tier 1" questions.
Thank you for your time!
Best Regards,
Fiona
@Aaron (who had access to the broken images): please correct me or add anything
relevant I missed. Are the broken VMs/backups still present? If yes, can
we ask the user to check the logs inside?
I can ask. I guess the plan would be to clone the failed VM, restore the boot
sector and then check the logs (Event Viewer) to see if there is anything of interest?
It is possible that we won't be able to get access to the VM itself if the
customer doesn't want that for data privacy reasons.
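(If we do get access, one way to pull recent storage-related entries from the System log inside the guest is wevtutil; this is just a sketch and the exact provider names to filter on can differ between Windows versions:)
wevtutil qe System /c:50 /rd:true /f:text /q:"*[System[Provider[@Name='disk' or @Name='Ntfs' or @Name='volmgr']]]"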
[0]:
febner@enia ~/Downloads % hexdump -C dump-vm-120.raw
00000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00100000
febner@enia ~/Downloads % hexdump -C dump-vm-130.raw
00000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
000000c0 00 00 19 03 46 4d 66 6e 00 00 00 00 00 00 00 00 |....FMfn........|
000000d0 04 f2 7a 01 00 00 00 00 00 00 00 00 00 00 00 00 |..z.............|
000000e0 f0 a4 01 00 00 00 00 00 c8 4d 5b 99 0c 81 ff ff |.........M[.....|
000000f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000100 00 42 e1 38 0d da ff ff 00 bc b4 3b 0d da ff ff |.B.8.......;....|
00000110 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000120 78 00 00 00 01 00 00 00 a8 00 aa 00 00 00 00 00 |x...............|
00000130 a0 71 ba b0 0c 81 ff ff 2e 00 2e 00 00 00 00 00 |.q..............|
00000140 a0 71 ba b0 0c 81 ff ff 00 00 00 00 00 00 00 00 |.q..............|
00000150 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
000001a0 5c 00 44 00 65 00 76 00 69 00 63 00 65 00 5c 00 |\.D.e.v.i.c.e.\.|
000001b0 48 00 61 00 72 00 64 00 64 00 69 00 73 00 6b 00 |H.a.r.d.d.i.s.k.|
000001c0 56 00 6f 00 6c 00 75 00 6d 00 65 00 32 00 5c 00 |V.o.l.u.m.e.2.\.|
000001d0 57 00 69 00 6e 00 64 00 6f 00 77 00 73 00 5c 00 |W.i.n.d.o.w.s.\.|
000001e0 4d 00 69 00 63 00 72 00 6f 00 73 00 6f 00 66 00 |M.i.c.r.o.s.o.f.|
000001f0 74 00 2e 00 4e 00 45 00 54 00 5c 00 46 00 72 00 |t...N.E.T.\.F.r.|
00000200 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00100000