If it is only giving btrfs errors on 6.9.x, and not on the rescue kernel or
6.8.x, that looks like a potential kernel bug.  Running 6.8.x and waiting
for, say, 6.10 would be best.

On Fri, Jul 26, 2024, 12:31 PM Richard Shaw <hobbes1...@gmail.com> wrote:

> On Fri, Jul 26, 2024 at 8:59 AM John Mellor <john.mel...@gmail.com> wrote:
>
>> On 2024-07-26 8:25 a.m., Richard Shaw wrote:
>>
>> On Thu, Jul 25, 2024 at 6:29 PM Jeffrey Walton <noloa...@gmail.com>
>> wrote:
>>
>>> On Thu, Jul 25, 2024 at 2:15 PM Richard Shaw <hobbes1...@gmail.com>
>>> wrote:
>>> >
>>> > I recently had the Fedora install on my laptop go sideways (Ryzen 5
>>> 4500U w/ nvme disk).
>>> >
>>> > The filesystem was going readonly so I installed System Rescue CD to a
>>> thumb drive to investigate. Sure enough I had 4 unrecoverable errors.
>>> >
>>> > I don't keep anything critical on it so I decided to just reinstall
>>> with Fedora 40. Installation went fine but I did notice weird dnf output on
>>> my first update, but everything SEEMED fine...
>>> >
>>> > I rebooted after the update and tried to log in when after a minute or
>>> two the system froze. Rebooted and sure enough a `dmesg | grep BTRFS`
>>> showed an error.
>>> >
>>> > Back to booting with System Rescue CD, neither a `btrfs check
>>> --check-data-csum` nor, after mounting, a `btrfs scrub` shows any errors.
>>> >
>>> > So who's right? And if there is an error, what's causing it? I've
>>> checked the drive with smartctl and even let the factory HP firmware diag
>>> tools run in a loop overnight checking everything without error.
>>>
>>> The (1) irrecoverable disk errors from the original install, and (2)
>>> the errors from the current install, and (3) the errors from dnf
>>> indicate (to me) you have a failed NVMe drive. I used to see the
>>> symptoms all the time when using SDcards in ARM dev boards. I would
>>> put a swap file on the dev board (due to lack of resources), and the
>>> drives would fail within about 6 months with the symptoms you
>>> describe.
>>>
>>> Now the interesting part (to me) is, (4) lack of errors reported by
>>> some tools. That indicates to me a Chinese drive that misreports drive
>>> size and statistics. They usually show up on thumb drives, but I
>>> experienced one on an SSD years ago. Also see
>>> <https://www.google.com/search?q=counterfeit+drive+misreport+size>.
>>>
>>> All in all, I would replace the NVMe drive with a new one from a
>>> trusted source. Not Amazon or eBay.
>>>
>>
>> It's the drive that came with the laptop so unlikely to be a cheap/phony
>> drive but the mystery does get deeper...
>>
>> 1. I was able to see the same results even if I booted to a F40 Live USB.
>> I'm thinking that the system caught the problem quick enough the error
>> didn't actually get written to the disk.
>>
>> 2. I consistently see the problem at about 30 seconds (from dmesg) if I
>> boot the 6.9.9 or 6.9.10 kernels that have been installed via updates. If I
>> boot 6.8.5, the kernel that shipped with F40 I can't reproduce the problem.
>>
>> Of course that's strange because if this was a widespread issue there
>> would be tons of people complaining.
>>
>> Odds are that you have bad ram or are running the processor clock higher
>> than what it can handle.  I also had this kind of issue when I had a bad
>> video card, but the system generally froze or crashed and left the drive in
>> an unrecoverable state.  The tools for fixing a btrfs partition are
>> generally lacking in Fedora, and the tools that come with btrfs are also
>> useless when the failing partition is your active root partition.  I don't
>> know if Suse has better tools, but it's a huge problem with Fedora
>> recoverability.
>>
>
> It's an HP Envy Laptop, no ability to overclock. I did upgrade the memory
> when I first got it over 3 years ago from 8GB to 16GB but it's plain
> DDR4-3200. As I previously mentioned, I let the HP diag tools run overnight;
> they completed 14 cycles without any errors, and now I just finished letting
> Memtest86+ run for 5 complete cycles without any errors.
>
> The only common denominator I have found so far is the two 6.9 kernels I
> have installed.
>
> Thanks,
> Richard
> --
> _______________________________________________
> users mailing list -- users@lists.fedoraproject.org
> To unsubscribe send an email to users-le...@lists.fedoraproject.org
> Fedora Code of Conduct:
> https://docs.fedoraproject.org/en-US/project/code-of-conduct/
> List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
> List Archives:
> https://lists.fedoraproject.org/archives/list/users@lists.fedoraproject.org
> Do not reply to spam, report it:
> https://pagure.io/fedora-infrastructure/new_issue
>
