If it only gives btrfs errors on 6.9.x, and not on the rescue kernel or 6.8.x, that looks like a potential kernel bug. The best option would be to keep running 6.8.x and wait for, say, 6.10.
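To make the comparison between kernels easier to repeat, here is a minimal sketch that prints the running kernel version next to any BTRFS lines in dmesg that look like problems. The keyword list and the names `btrfs_problem_lines` and `report` are my own assumptions for illustration, not anything btrfs or Fedora ships, and dmesg may need root on systems that restrict the kernel log.

```python
import platform
import re
import subprocess

# Keywords that usually mark a problem in a kernel log line. This list is an
# assumption for illustration, not an exhaustive set of btrfs messages.
PROBLEM = re.compile(r"error|warning|corrupt|csum failed", re.IGNORECASE)


def btrfs_problem_lines(log_text: str) -> list[str]:
    """Return log lines that mention BTRFS and match a problem keyword."""
    return [
        line
        for line in log_text.splitlines()
        if "BTRFS" in line and PROBLEM.search(line)
    ]


def report() -> str:
    """Summarize BTRFS problem lines for the currently running kernel."""
    # Reads the kernel ring buffer; may require root if access is restricted.
    log_text = subprocess.run(
        ["dmesg"], capture_output=True, text=True, check=True
    ).stdout
    lines = btrfs_problem_lines(log_text)
    header = f"kernel {platform.release()}: {len(lines)} BTRFS problem line(s)"
    return "\n".join([header, *lines])


if __name__ == "__main__":
    print(report())
```

Running it once after booting 6.8.x and once after booting 6.9.x, and saving both outputs, would give something concrete to attach to a kernel bug report.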
On Fri, Jul 26, 2024, 12:31 PM Richard Shaw <hobbes1...@gmail.com> wrote:

> On Fri, Jul 26, 2024 at 8:59 AM John Mellor <john.mel...@gmail.com> wrote:
>
>> On 2024-07-26 8:25 a.m., Richard Shaw wrote:
>>
>> On Thu, Jul 25, 2024 at 6:29 PM Jeffrey Walton <noloa...@gmail.com>
>> wrote:
>>
>>> On Thu, Jul 25, 2024 at 2:15 PM Richard Shaw <hobbes1...@gmail.com>
>>> wrote:
>>> >
>>> > I recently had the Fedora install on my laptop go sideways (Ryzen 5
>>> > 4500U w/ nvme disk).
>>> >
>>> > The filesystem was going readonly so I installed System Rescue CD to a
>>> > thumb drive to investigate. Sure enough I had 4 unrecoverable errors.
>>> >
>>> > I don't keep anything critical on it so I decided to just reinstall
>>> > with Fedora 40. Installation went fine but I did notice weird dnf
>>> > output on my first update but everything SEEMED fine...
>>> >
>>> > I rebooted after the update and tried to log in when after a minute or
>>> > two the system froze. Rebooted and sure enough a `dmesg | grep BTRFS`
>>> > showed an error.
>>> >
>>> > Back to booting with System Rescue CD neither a `btrfs check
>>> > --check-data-csum` or after mounting, a `btrfs scrub` show any errors.
>>> >
>>> > So who's right? And if there is an error, what's causing it? I've
>>> > checked the drive with smartctl and even let the factory HP firmware
>>> > diag tools run in a loop overnight checking everything without error.
>>>
>>> The (1) irrecoverable disk errors from the original install, and (2)
>>> the errors from the current install, and (3) the errors from dnf
>>> indicate (to me) you have a failed NVMe drive. I used to see the
>>> symptoms all the time when using SDcards in ARM dev boards. I would
>>> put a swap file on the dev board (due to lack of resources), and the
>>> drives would fail within about 6 months with the symptoms you
>>> describe.
>>>
>>> Now the interesting part (to me) is, (4) lack of errors reported by
>>> some tools. That indicates to me a Chinese drive that misreports drive
>>> size and statistics. They usually show up on thumb drives, but I
>>> experienced one on a SSD drive years ago. Also see
>>> <https://www.google.com/search?q=counterfeit+drive+misreport+size>.
>>>
>>> All in all, I would replace the NVMe drive with a new one from a
>>> trusted source. Not Amazon or eBay.
>>>
>>
>> It's the drive that came with the laptop so unlikely to be a cheap/phony
>> drive but the mystery does get deeper...
>>
>> 1. I was able to see the same results even if I booted to a F40 Live USB.
>> I'm thinking that the system caught the problem quick enough the error
>> didn't actually get written to the disk.
>>
>> 2. I consistently see the problem at about 30 seconds (from dmesg) if I
>> boot the 6.9.9 or 6.9.10 kernels that have been installed via updates. If I
>> boot 6.8.5, the kernel that shipped with F40 I can't reproduce the problem.
>>
>> Of course that's strange because if this was a widespread issue there
>> would be tons of people complaining.
>>
>> Odds are that you have bad ram or are running the processor clock higher
>> than what it can handle. I also had this kind of issue when I had a bad
>> video card, but the system generally froze or crashed and left the drive in
>> an unrecoverable state. The tools for fixing a btrfs partition are
>> generally lacking in Fedora, and the tools that come with btrfs are also
>> useless when the failing partition is your active root partition. I don't
>> know if Suse has better tools, but it's a huge problem with Fedora
>> recoverability.
>>
>
> It's an HP Envy Laptop, no ability to overclock. I did upgrade the memory
> when I first got it over 3 years ago from 8GB to 16GB but it's plain
> DDR4-3200. As I previously mentioned I let the HP diag tools run overnight
> and completed 14 cycles without any errors and now I just finished letting
> Memtest86+ run for 5 complete cycles without any errors.
>
> The only common denominator I have found so far is the two 6.9 kernels I
> have installed.
>
> Thanks,
> Richard
> --
> _______________________________________________
> users mailing list -- users@lists.fedoraproject.org
> To unsubscribe send an email to users-le...@lists.fedoraproject.org
> Fedora Code of Conduct:
> https://docs.fedoraproject.org/en-US/project/code-of-conduct/
> List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
> List Archives:
> https://lists.fedoraproject.org/archives/list/users@lists.fedoraproject.org
> Do not reply to spam, report it:
> https://pagure.io/fedora-infrastructure/new_issue
>