Hi, I've hit a bug with u-boot on my BTRFS filesystem, and I'm fairly certain it's a bug and not a corruption issue.
A bit of history on the filesystem. It is a fairly new filesystem as it was being used to give me access to test a wayland application on a Raspberry Pi. The filesystem was about 3 days old when I hit the bug, and I'm fairly certain it never had an unclean shutdown. I have checked the filesystem with "btrfs check" which has found no errors. The filesystem mounts on Linux and is functional. > # btrfs check --check-data-csum /dev/sda2 > Opening filesystem to check... > Checking filesystem on /dev/sda2 > UUID: 18db6211-ac36-42c1-a22f-5e15e1486e0d > [1/7] checking root items > [2/7] checking extents > [3/7] checking free space tree > [4/7] checking fs roots > [5/7] checking csums against data > [6/7] checking root refs > [7/7] checking quota groups skipped (not enabled on this FS) > found 5070573568 bytes used, no error found > total csum bytes: 4451620 > total tree bytes: 370458624 > total fs tree bytes: 353124352 > total extent tree bytes: 10010624 > btree space waste bytes: 62303284 > file data blocks allocated: 6786519040 > referenced 6328619008 I've made an image of the filesystem so I could reproduce the bug in an environment that doesn't require the physical SBC, and have reproduced the issue using the head of the master branch with "qemu-x86_64_defconfig". My testing qemu was produced with the following: > # make qemu-x86_64_defconfig > # cat << EOF >> .config > CONFIG_AUTOBOOT=y > CONFIG_BOOTDELAY=1 > CONFIG_USE_BOOTCOMMAND=y > CONFIG_BOOTSTD_DEFAULTS=y > CONFIG_BOOTSTD_FULL=y > CONFIG_CMD_BOOTFLOW_FULL=y > CONFIG_BOOTCOMMAND="bootflow scan -lb" > CONFIG_ENV_IS_NOWHERE=y > CONFIG_LZ4=y > CONFIG_BZIP2=y > CONFIG_ZSTD=y > CONFIG_FS_BTRFS=y > CONFIG_CMD_BTRFS=y > CONFIG_GZIP=y > CONFIG_DEVICE_TREE_INCLUDES="bootstd.dtsi" > EOF > # make -j24 bootstd.dtsi is placed at "arch/x86/dts/bootstd.dtsi" and contains: > / { > bootstd { > compatible = "u-boot,boot-std"; > filename-prefixes = "/@boot/", "/boot/", "/"; > bootdev-order = "scsi"; > extlinux { > compatible = "u-boot,extlinux"; > }; > }; > }; The VM was run with > qemu-system-x86_64 -bios u-boot.rom -nographic -M q35 -action reboot=shutdown > -drive file=/mnt/dbg/u-boot-debug.img The error message I recive on boot is > BUG at fs/btrfs/extent-io.c:629/free_extent_buffer_internal()! > BUG! > resetting ... I added a print statement to free_extent_buffer_internal that prints the start address of the extent_buffer as I'm not sure what to be looking for here. This print statement is before the decrement. > printf("free_extent_buffer_internal: eb->start[%llx] eb->refs[%i]\n", > eb->start, eb->refs); The last message before the crash reported eb->start to be "0", with 0 refs. > free_extent_buffer_internal: eb->start[0] eb->refs[0] The extent at 0 struck me as odd, so I tried commenting out the freeing, by removing the call to free_extent_buffer_final, and this resulted in bootflow succeeding and showing me the boot menu, which suprised me. I expected to see the bug reproduce itself, with refs being zero, but eb->start pointing somewhere valid, but I instead got a valid address with refs at 2. I'm assuming that the order free_extent_buffer_internal is called is deterministic, so by counting the print outputs the line that prior held the extent_buffer with a 0 start was replaced with: > free_extent_buffer_internal: eb->start[249c000] eb->refs[2] Interestingly, as can be seen in the full logs with my included print messages, 249c000 is being used just before this, with a ref count of 2. 249c000 does appear to reach a point where it should have been freed in the past, before it gets used again as seen in both logs. The failing boot log: > U-Boot SPL 2024.04-00949-g1dd659fd62-dirty (Apr 22 2024 - 11:32:37 +1000) > Trying to boot from SPI > Jumping to 64-bit U-Boot: Note many features are missing > > > U-Boot 2024.04-00949-g1dd659fd62-dirty (Apr 22 2024 - 11:32:37 +1000) > > CPU: QEMU Virtual CPU version 2.5+ > DRAM: 128 MiB > Core: 13 devices, 13 uclasses, devicetree: separate > Loading Environment from nowhere... OK > Model: QEMU x86 (I440FX) > Net: e1000: 00:00:00:00:00:00 > > Error: e1000#0 No valid MAC address found. > eth_initialize() No ethernet found. > > > Hit any key to stop autoboot: 0 > Scanning for bootflows in all bootdevs > Seq Method State Uclass Part Name Filename > --- ----------- ------ -------- ---- ------------------------ > ---------------- > scanning bus for devices... > Target spinup took 0 ms. > SATA link 1 timeout. > Target spinup took 0 ms. > SATA link 3 timeout. > SATA link 4 timeout. > SATA link 5 timeout. > AHCI 0001.0000 32 slots 6 ports 1.5 Gbps 0x3f impl SATA mode > flags: 64bit ncq only > free_extent_buffer_internal: eb->start[10000] eb->refs[1] > free_extent_buffer_internal: eb->start[1548000] eb->refs[1] > free_extent_buffer_internal: eb->start[150c000] eb->refs[1] > free_extent_buffer_internal: eb->start[154c000] eb->refs[1] > free_extent_buffer_internal: eb->start[1544000] eb->refs[2] > free_extent_buffer_internal: eb->start[28f4000] eb->refs[2] > free_extent_buffer_internal: eb->start[28f4000] eb->refs[2] > free_extent_buffer_internal: eb->start[28f4000] eb->refs[2] > free_extent_buffer_internal: eb->start[2490000] eb->refs[1] > free_extent_buffer_internal: eb->start[6de0000] eb->refs[1] > free_extent_buffer_internal: eb->start[28f4000] eb->refs[1] > free_extent_buffer_internal: eb->start[1544000] eb->refs[1] > Device 0: (0:0) Vendor: ATA Prod.: QEMU HARDDISK Rev: 2.5+ > Type: Hard Disk > Capacity: 58680.0 MB = 57.3 GB (120176640 x 512) > timeout exit! > Scanning bootdev 'ahci_scsi.id0lun0.bootdev': > free_extent_buffer_internal: eb->start[10000] eb->refs[1] > free_extent_buffer_internal: eb->start[1548000] eb->refs[1] > free_extent_buffer_internal: eb->start[150c000] eb->refs[1] > free_extent_buffer_internal: eb->start[154c000] eb->refs[1] > free_extent_buffer_internal: eb->start[1544000] eb->refs[2] > free_extent_buffer_internal: eb->start[28f4000] eb->refs[2] > free_extent_buffer_internal: eb->start[28f4000] eb->refs[2] > free_extent_buffer_internal: eb->start[28f4000] eb->refs[2] > free_extent_buffer_internal: eb->start[2490000] eb->refs[2] > free_extent_buffer_internal: eb->start[28f4000] eb->refs[2] > free_extent_buffer_internal: eb->start[24a0000] eb->refs[1] > free_extent_buffer_internal: eb->start[249c000] eb->refs[2] > free_extent_buffer_internal: eb->start[24a0000] eb->refs[1] > free_extent_buffer_internal: eb->start[249c000] eb->refs[2] > free_extent_buffer_internal: eb->start[24a4000] eb->refs[1] > free_extent_buffer_internal: eb->start[249c000] eb->refs[2] > free_extent_buffer_internal: eb->start[2490000] eb->refs[1] > free_extent_buffer_internal: eb->start[249c000] eb->refs[1] > free_extent_buffer_internal: eb->start[6de0000] eb->refs[1] > free_extent_buffer_internal: eb->start[28f4000] eb->refs[1] > free_extent_buffer_internal: eb->start[1544000] eb->refs[1] > free_extent_buffer_internal: eb->start[10000] eb->refs[1] > free_extent_buffer_internal: eb->start[1548000] eb->refs[1] > free_extent_buffer_internal: eb->start[150c000] eb->refs[1] > free_extent_buffer_internal: eb->start[154c000] eb->refs[1] > free_extent_buffer_internal: eb->start[1544000] eb->refs[2] > free_extent_buffer_internal: eb->start[28f4000] eb->refs[2] > free_extent_buffer_internal: eb->start[28f4000] eb->refs[2] > free_extent_buffer_internal: eb->start[28f4000] eb->refs[2] > free_extent_buffer_internal: eb->start[2490000] eb->refs[2] > free_extent_buffer_internal: eb->start[28f4000] eb->refs[2] > free_extent_buffer_internal: eb->start[24a0000] eb->refs[1] > free_extent_buffer_internal: eb->start[249c000] eb->refs[2] > free_extent_buffer_internal: eb->start[24a0000] eb->refs[1] > free_extent_buffer_internal: eb->start[249c000] eb->refs[2] > free_extent_buffer_internal: eb->start[2490000] eb->refs[2] > free_extent_buffer_internal: eb->start[24a0000] eb->refs[1] > free_extent_buffer_internal: eb->start[249c000] eb->refs[2] > free_extent_buffer_internal: eb->start[24a0000] eb->refs[1] > free_extent_buffer_internal: eb->start[249c000] eb->refs[2] > free_extent_buffer_internal: eb->start[24a4000] eb->refs[1] > free_extent_buffer_internal: eb->start[249c000] eb->refs[2] > free_extent_buffer_internal: eb->start[24a4000] eb->refs[1] > free_extent_buffer_internal: eb->start[249c000] eb->refs[2] > free_extent_buffer_internal: eb->start[24a4000] eb->refs[1] > free_extent_buffer_internal: eb->start[0] eb->refs[0] > BUG at fs/btrfs/extent-io.c:626/free_extent_buffer_internal()! > BUG! > resetting ... The succeeding log with the removed freeing: > U-Boot SPL 2024.04-00949-g1dd659fd62-dirty (Apr 22 2024 - 11:38:25 +1000) > Trying to boot from SPI > Jumping to 64-bit U-Boot: Note many features are missing > > > U-Boot 2024.04-00949-g1dd659fd62-dirty (Apr 22 2024 - 11:38:25 +1000) > > CPU: QEMU Virtual CPU version 2.5+ > DRAM: 128 MiB > Core: 13 devices, 13 uclasses, devicetree: separate > Loading Environment from nowhere... OK > Model: QEMU x86 (I440FX) > Net: e1000: 00:00:00:00:00:00 > > Error: e1000#0 No valid MAC address found. > eth_initialize() No ethernet found. > > > Hit any key to stop autoboot: 0 > Scanning for bootflows in all bootdevs > Seq Method State Uclass Part Name Filename > --- ----------- ------ -------- ---- ------------------------ > ---------------- > scanning bus for devices... > Target spinup took 0 ms. > SATA link 1 timeout. > Target spinup took 0 ms. > SATA link 3 timeout. > SATA link 4 timeout. > SATA link 5 timeout. > AHCI 0001.0000 32 slots 6 ports 1.5 Gbps 0x3f impl SATA mode > flags: 64bit ncq only > free_extent_buffer_internal: eb->start[10000] eb->refs[1] > free_extent_buffer_internal: eb->start[1548000] eb->refs[1] > free_extent_buffer_internal: eb->start[150c000] eb->refs[1] > free_extent_buffer_internal: eb->start[154c000] eb->refs[1] > free_extent_buffer_internal: eb->start[1544000] eb->refs[2] > free_extent_buffer_internal: eb->start[28f4000] eb->refs[2] > free_extent_buffer_internal: eb->start[28f4000] eb->refs[2] > free_extent_buffer_internal: eb->start[28f4000] eb->refs[2] > free_extent_buffer_internal: eb->start[2490000] eb->refs[1] > free_extent_buffer_internal: eb->start[6de0000] eb->refs[1] > free_extent_buffer_internal: eb->start[28f4000] eb->refs[1] > free_extent_buffer_internal: eb->start[1544000] eb->refs[1] > Device 0: (0:0) Vendor: ATA Prod.: QEMU HARDDISK Rev: 2.5+ > Type: Hard Disk > Capacity: 58680.0 MB = 57.3 GB (120176640 x 512) > timeout exit! > Scanning bootdev 'ahci_scsi.id0lun0.bootdev': > free_extent_buffer_internal: eb->start[10000] eb->refs[1] > free_extent_buffer_internal: eb->start[1548000] eb->refs[1] > free_extent_buffer_internal: eb->start[150c000] eb->refs[1] > free_extent_buffer_internal: eb->start[154c000] eb->refs[1] > free_extent_buffer_internal: eb->start[1544000] eb->refs[2] > free_extent_buffer_internal: eb->start[28f4000] eb->refs[2] > free_extent_buffer_internal: eb->start[28f4000] eb->refs[2] > free_extent_buffer_internal: eb->start[28f4000] eb->refs[2] > free_extent_buffer_internal: eb->start[2490000] eb->refs[2] > free_extent_buffer_internal: eb->start[28f4000] eb->refs[2] > free_extent_buffer_internal: eb->start[24a0000] eb->refs[1] > free_extent_buffer_internal: eb->start[249c000] eb->refs[2] > free_extent_buffer_internal: eb->start[24a0000] eb->refs[1] > free_extent_buffer_internal: eb->start[249c000] eb->refs[2] > free_extent_buffer_internal: eb->start[24a4000] eb->refs[1] > free_extent_buffer_internal: eb->start[249c000] eb->refs[2] > free_extent_buffer_internal: eb->start[2490000] eb->refs[1] > free_extent_buffer_internal: eb->start[249c000] eb->refs[1] > free_extent_buffer_internal: eb->start[6de0000] eb->refs[1] > free_extent_buffer_internal: eb->start[28f4000] eb->refs[1] > free_extent_buffer_internal: eb->start[1544000] eb->refs[1] > free_extent_buffer_internal: eb->start[10000] eb->refs[1] > free_extent_buffer_internal: eb->start[1548000] eb->refs[1] > free_extent_buffer_internal: eb->start[150c000] eb->refs[1] > free_extent_buffer_internal: eb->start[154c000] eb->refs[1] > free_extent_buffer_internal: eb->start[1544000] eb->refs[2] > free_extent_buffer_internal: eb->start[28f4000] eb->refs[2] > free_extent_buffer_internal: eb->start[28f4000] eb->refs[2] > free_extent_buffer_internal: eb->start[28f4000] eb->refs[2] > free_extent_buffer_internal: eb->start[2490000] eb->refs[2] > free_extent_buffer_internal: eb->start[28f4000] eb->refs[2] > free_extent_buffer_internal: eb->start[24a0000] eb->refs[1] > free_extent_buffer_internal: eb->start[249c000] eb->refs[2] > free_extent_buffer_internal: eb->start[24a0000] eb->refs[1] > free_extent_buffer_internal: eb->start[249c000] eb->refs[2] > free_extent_buffer_internal: eb->start[2490000] eb->refs[2] > free_extent_buffer_internal: eb->start[24a0000] eb->refs[1] > free_extent_buffer_internal: eb->start[249c000] eb->refs[2] > free_extent_buffer_internal: eb->start[24a0000] eb->refs[1] > free_extent_buffer_internal: eb->start[249c000] eb->refs[2] > free_extent_buffer_internal: eb->start[24a4000] eb->refs[1] > free_extent_buffer_internal: eb->start[249c000] eb->refs[2] > free_extent_buffer_internal: eb->start[24a4000] eb->refs[1] > free_extent_buffer_internal: eb->start[249c000] eb->refs[2] > free_extent_buffer_internal: eb->start[24a4000] eb->refs[1] > free_extent_buffer_internal: eb->start[249c000] eb->refs[2] > free_extent_buffer_internal: eb->start[2490000] eb->refs[1] > free_extent_buffer_internal: eb->start[6de0000] eb->refs[1] > free_extent_buffer_internal: eb->start[28f4000] eb->refs[1] > free_extent_buffer_internal: eb->start[1544000] eb->refs[1] > 0 extlinux ready scsi 2 ahci_scsi.id0lun0.bootdev > /@boot/extlinux/extlinux.conf > ** Booting bootflow 'ahci_scsi.id0lun0.bootdev.part_2' with extlinux > ------------------------------------------------------------ > 1: NixOS - Default > 2: NixOS - Configuration 5 (2024-03-04 11:00 - 24.05.20240205.faf912b) > 3: NixOS - Configuration 4 (2024-03-02 15:05 - 24.05.20240205.faf912b) > 4: NixOS - Configuration 3 (2024-03-02 14:26 - 24.05.20240205.faf912b) > 5: NixOS - Configuration 2 (2024-03-02 14:06 - 24.05.20240205.faf912b) > 6: NixOS - Configuration 1 (1970-01-01 10:00 - 24.05.20240205.faf912b) > Enter choice: 1: NixOS - Default > Retrieving file: > /@boot/extlinux/../nixos/gq8jsgxnhq2wvsjrni3cjw1wb5540wjw-linux-6.1.63-stable_20231123-Image > free_extent_buffer_internal: eb->start[10000] eb->refs[1] I'm not sure where to go from here with the bug, so I'm hoping for some help in tracking it down so it can be fixed. Thanks, Sachi