On Wed, Jan 20, 2021 at 1:08 AM Erik Jensen <[email protected]> wrote:
>
> On Wed, Jan 20, 2021 at 12:31 AM Qu Wenruo <[email protected]> wrote:
> > On 2021/1/20 4:21 PM, Qu Wenruo wrote:
> > > On 2021/1/19 5:28 PM, Erik Jensen wrote:
> > >> On Mon, Jan 18, 2021 at 9:22 PM Erik Jensen <[email protected]>
> > >> wrote:
> > >>>
> > >>> On Mon, Jan 18, 2021 at 4:12 AM Erik Jensen <[email protected]>
> > >>> wrote:
> > >>>>
> > >>>> The offending system is indeed ARMv7 (specifically a Marvell ARMADA®
> > >>>> 388), but I believe the Broadcom BCM2835 in my Raspberry Pi is
> > >>>> actually ARMv6 (with hardware float support).
> > >>>
> > >>> Using NBD, I have verified that I receive the same error when
> > >>> attempting to mount the filesystem on my ARMv6 Raspberry Pi:
> > >>> [ 3491.339572] BTRFS info (device dm-4): disk space caching is enabled
> > >>> [ 3491.394584] BTRFS info (device dm-4): has skinny extents
> > >>> [ 3492.385095] BTRFS error (device dm-4): bad tree block start, want
> > >>> 26207780683776 have 3395945502747707095
> > >>> [ 3492.514071] BTRFS error (device dm-4): bad tree block start, want
> > >>> 26207780683776 have 3395945502747707095
> > >>> [ 3492.553599] BTRFS warning (device dm-4): failed to read tree root
> > >>> [ 3492.865368] BTRFS error (device dm-4): open_ctree failed
> > >>>
> > >>> The Raspberry Pi is running Linux 5.4.83.
> > >>>
> > >>
> > >> Okay, after some more testing, ARM seems to be irrelevant, and 32-bit
> > >> is the key factor. On a whim, I booted up an i686, 5.8.14 kernel in a
> > >> VM, attached the drives via NBD, ran cryptsetup, tried to mount, and…
> > >> I got the exact same error message.
> > >>
> > > My educated guess is that on 32-bit platforms we pass an incorrect sector
> > > into the bio, and so read back garbage.
> >
> > To test that, you can use the bcc tools.
> > biosnoop can do it:
> > https://github.com/iovisor/bcc/blob/master/tools/biosnoop_example.txt
> >
> > Just try mounting the fs with biosnoop running.
> > With "btrfs ins dump-tree -t chunk <dev>", we can manually calculate the
> > expected offset of each read and check whether it matches what biosnoop
> > reports.
> > If they do not match, that would prove my assumption and give us a pretty
> > good clue for a fix.
> >
> > Thanks,
> > Qu
> >
> > >
> > > Is this bug happening only on this fs, or can other btrfs filesystems
> > > also trigger similar problems on 32-bit platforms?
> > >
> > > Thanks,
> > > Qu
>
> I have only observed this error on this file system. Additionally, the
> error mounting with the NAS only started after I did a `btrfs replace`
> on all five 8TB drives using an x86_64 system. (Ironically, I did this
> with the goal of making it faster to use the filesystem on the NAS by
> re-encrypting the drives to use a cipher supported by my NAS's crypto
> accelerator.)
>
> Maybe this process of shuffling 40TB around caused some value in the
> filesystem to increment to the point that a calculation using it
> overflows on 32-bit systems?
>
> I should be able to try biosnoop later this week, and I'll report back
> with the results.
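
As a rough sanity check on the overflow guess above, here is a minimal
sketch that tests whether the logical address from the log still fits once
it is turned into a page index held in a 32-bit integer. The 4 KiB page
size and the 32-bit index width are assumptions for illustration, and
nothing here shows that btrfs actually performs this particular
calculation. It only shows that the address is large enough for such a
truncation to be possible.

```python
# Logical address the kernel wanted, copied from the "bad tree block start" log line.
WANT_BYTENR = 26207780683776

PAGE_SHIFT = 12   # assumption: 4 KiB pages
WORD_BITS = 32    # assumption: the index is held in a 32-bit unsigned integer


def truncate(value: int, bits: int = WORD_BITS) -> int:
    """Keep only the low `bits` bits, as a fixed-width unsigned integer would."""
    return value & ((1 << bits) - 1)


def overflows(value: int, bits: int = WORD_BITS) -> bool:
    """Return True if `value` does not fit in `bits` bits."""
    return truncate(value, bits) != value


page_index = WANT_BYTENR >> PAGE_SHIFT
# First byte offset whose page index no longer fits in WORD_BITS bits.
limit_bytes = 1 << (WORD_BITS + PAGE_SHIFT)

print("page index:", page_index, "overflows:", overflows(page_index))
print("wrapped page index:", truncate(page_index))
print("wrap threshold (TiB):", limit_bytes / 2**40)
```

Under these assumptions the threshold is 16 TiB, and the wanted address
sits at roughly 23.8 TiB, so a wrapped index would point at an unrelated
block.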

Okay, I tried running biosnoop, but I seem to be running into this
bug: https://github.com/iovisor/bcc/issues/3241 (That bug was reported
for cpudist, but I'm seeing the same error when I try to run
biosnoop.)
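
For whenever a trace of the reads can be captured some other way, the
comparison Qu describes could be sketched roughly as below. It assumes the
chunk's stripes are plain linear copies (SINGLE, DUP or RAID1); striped
profiles need a more involved mapping. The function name and the numbers
in the example are hypothetical and do not come from this filesystem.

```python
SECTOR_SIZE = 512  # biosnoop reports sectors in 512-byte units


def logical_to_sector(logical: int, chunk_start: int,
                      chunk_length: int, stripe_offset: int) -> int:
    """Map a btrfs logical address to a 512-byte sector on one device.

    chunk_start is the offset in the CHUNK_ITEM key, chunk_length its
    "length" field, and stripe_offset the "offset" of the chosen stripe,
    all as printed by "btrfs ins dump-tree -t chunk".
    Assumes a linear (non-striped) chunk profile.
    """
    if not chunk_start <= logical < chunk_start + chunk_length:
        raise ValueError("logical address is outside this chunk")
    physical = stripe_offset + (logical - chunk_start)
    return physical // SECTOR_SIZE


# Hypothetical chunk values, for illustration only.
example_sector = logical_to_sector(
    logical=22036480,
    chunk_start=22020096,
    chunk_length=8388608,
    stripe_offset=30408704,
)
print("expected sector:", example_sector)
```

If the sector actually submitted on the 32-bit kernel differs from the
value computed this way, that would point at the address being mangled
before it reaches the block layer.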

Anything else I can try?
