Jan Johansson writes:

> Mike Larkin <mlar...@nested.page> wrote:
>> On Tue, Mar 09, 2021 at 09:38:57AM -0500, Ian Darwin wrote:
>> > On Tue, Mar 09, 2021 at 09:52:03AM +0100, Jan Johansson wrote:
>> > > If I try to cp or dd the disk image on the host it fails
>> > >
>> > > dd if=disk.raw.old of=disk.raw.bak bs=1m
>> > > dd: disk.raw.old: Input/output error
>> > > 8858+0 records in
>> > > 8858+0 records out
>> > > 9288286208 bytes transferred in 102.048 secs (91018010 bytes/sec)
>> > >
>> > > The host shows no other signs of failing hardware.
>> > >
>> > > Is this a software or a hardware error?
>> >
>> > Given that it gives an error outside the VM, it's likely hardware.
>> >
>>
>> Agreed. Sorta hard to fault vmd(8) if it's not even running.
>
> Since these are sparse files, could the vioblk(4) somehow write
> incorrect data that later will make it unreadable such as a
> pointer pointing into nothingness?
>
> The messages
>
> vmd[39543]: vioblk write error: Input/output error
> vmd[39543]: wr vioblk: disk write error
>
> were produced at 01:30, when all 4 guests and the host run
> the daily script (which makes backups and does other maintenance
> tasks), if that could have any impact.
>
> Should there not be anything on the host logging errors to
> dmesg/syslog such as sd(4) or ahci(4)?
>
> (If it is not obvious, my understanding of how the virtio/vioblk
> stuff hooks into the disk stack is very limited)
>

vmd(8) reads from and writes to the disk image files (both raw and
qcow2) using pread(2)/pwrite(2) calls. The qcow2 handling is a bit more
complex, but as far as I'm aware it still just calls pread/pwrite.
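
Since it is all plain pread(2) on the host, the same kind of read can be
reproduced outside vmd. Below is a rough sketch, not anything vmd itself
does: it walks a file in 1 MiB chunks with os.pread and, unlike dd,
keeps going after a failure so every failing offset gets listed. The
function name find_read_errors and the chunk size are my own choices for
illustration. In the dd output quoted above, dd stopped after 8858 full
1 MiB blocks, so the first bad read should lie at or after byte offset
9288286208.

```python
import errno
import os
import sys

CHUNK = 1024 * 1024  # 1 MiB, same as the bs=1m used with dd


def find_read_errors(path, chunk=CHUNK):
    """Return a list of (offset, errno) for every chunk that fails to read."""
    bad = []
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        offset = 0
        while offset < size:
            want = min(chunk, size - offset)
            try:
                data = os.pread(fd, want, offset)
            except OSError as e:
                # Record the failure and skip ahead instead of stopping.
                bad.append((offset, e.errno))
                offset += want
                continue
            if not data:
                # Unexpected end of file (the file shrank while reading).
                break
            offset += len(data)
    finally:
        os.close(fd)
    return bad


if __name__ == "__main__" and len(sys.argv) > 1:
    for off, err in find_read_errors(sys.argv[1]):
        name = errno.errorcode.get(err, str(err))
        print(f"read failed at offset {off}: {name}")
```

Run it as, for example, "python3 scan.py disk.raw.old" (scan.py being
whatever name the script is saved under). If the reported offsets are
stable across runs, that points at specific bad sectors rather than an
intermittent cable or controller problem.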

Have you run fsck(8) on your host?

> This drive was installed in August 2020 and if I recall correctly
> it was because of this issue. So I am thinking cable or
> motherboard.
>
> If I decide to replace it, would it make sense to set up a
> softraid mirror (RAID1) to avoid, or get better indication of, this
> kind of problem in the future, or would that only add more parts
> that can break?
>
> I am currently trying to provoke the drive from the host with
>
> dd if=/dev/random of=test.raw bs=1m count=17000
>
> then cp/dd and cmp to see if I can make it break for real.

I'd say maybe make sure you have backups of anything important first if
you're purposely going to break things. :-)
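
For what it's worth, the write/copy/compare test quoted above could be
scripted roughly as follows. This is only a sketch under my own
assumptions: the function names and the use of SHA-256 are illustrative,
and it stands in for the dd, cp and cmp steps rather than replacing
them. Note that reads may be served from the buffer cache, so a clean
pass on a small file proves little; the file needs to be large, as in
the 17000 MiB dd example.

```python
import hashlib
import os
import shutil

MIB = 1024 * 1024


def sha256_file(path, chunk=MIB):
    """Hash a file by reading it back in fixed-size chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while True:
            block = f.read(chunk)
            if not block:
                break
            h.update(block)
    return h.hexdigest()


def write_copy_verify(src, dst, mib):
    """Write `mib` MiB of random data to src, copy it to dst, and
    return True only if both files read back identical to what was written."""
    written = hashlib.sha256()
    with open(src, "wb") as f:
        for _ in range(mib):
            block = os.urandom(MIB)
            written.update(block)
            f.write(block)
        f.flush()
        os.fsync(f.fileno())  # push the data to the disk before reading back
    shutil.copyfile(src, dst)
    expected = written.hexdigest()
    return sha256_file(src) == expected and sha256_file(dst) == expected
```

Something like write_copy_verify("test.raw", "test.copy", 17000) would
then mirror the quoted dd invocation. An I/O error raises OSError
instead of returning False, which is arguably the more interesting
outcome here.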

--
-Dave Voutila
