On Fri, Jul 26, 2024 at 8:59 AM John Mellor <john.mel...@gmail.com> wrote:
> On 2024-07-26 8:25 a.m., Richard Shaw wrote:
>
> On Thu, Jul 25, 2024 at 6:29 PM Jeffrey Walton <noloa...@gmail.com> wrote:
>
>> On Thu, Jul 25, 2024 at 2:15 PM Richard Shaw <hobbes1...@gmail.com>
>> wrote:
>> >
>> > I recently had the Fedora install on my laptop go sideways (Ryzen 5
>> > 4500U w/ NVMe disk).
>> >
>> > The filesystem was going read-only, so I installed System Rescue CD
>> > to a thumb drive to investigate. Sure enough, I had 4 unrecoverable
>> > errors.
>> >
>> > I don't keep anything critical on it, so I decided to just reinstall
>> > with Fedora 40. Installation went fine. I did notice weird dnf
>> > output on my first update, but everything SEEMED fine...
>> >
>> > I rebooted after the update and tried to log in, and after a minute
>> > or two the system froze. I rebooted again and, sure enough,
>> > `dmesg | grep BTRFS` showed an error.
>> >
>> > Back in System Rescue CD, neither `btrfs check --check-data-csum`
>> > nor, after mounting, `btrfs scrub` shows any errors.
>> >
>> > So who's right? And if there is an error, what's causing it? I've
>> > checked the drive with smartctl and even let the factory HP firmware
>> > diag tools run in a loop overnight checking everything without error.
>>
>> The (1) irrecoverable disk errors from the original install, and (2)
>> the errors from the current install, and (3) the errors from dnf
>> indicate (to me) you have a failed NVMe drive. I used to see the
>> symptoms all the time when using SD cards in ARM dev boards. I would
>> put a swap file on the dev board (due to lack of resources), and the
>> drives would fail within about 6 months with the symptoms you
>> describe.
>>
>> Now the interesting part (to me) is (4) the lack of errors reported by
>> some tools. That indicates to me a Chinese drive that misreports drive
>> size and statistics. They usually show up on thumb drives, but I
>> experienced one on an SSD years ago. Also see
>> <https://www.google.com/search?q=counterfeit+drive+misreport+size>.
>>
>> All in all, I would replace the NVMe drive with a new one from a
>> trusted source. Not Amazon or eBay.
>>
>
> It's the drive that came with the laptop, so it's unlikely to be a
> cheap/phony drive, but the mystery does get deeper...
>
> 1. I was able to see the same results even if I booted to an F40 Live
> USB. I'm thinking that the system caught the problem quickly enough
> that the error didn't actually get written to the disk.
>
> 2. I consistently see the problem at about 30 seconds (from dmesg) if I
> boot the 6.9.9 or 6.9.10 kernels that have been installed via updates.
> If I boot 6.8.5, the kernel that shipped with F40, I can't reproduce
> the problem.
>
> Of course that's strange, because if this were a widespread issue there
> would be tons of people complaining.
>
> Odds are that you have bad RAM or are running the processor clock higher
> than what it can handle. I also had this kind of issue when I had a bad
> video card, but the system generally froze or crashed and left the drive
> in an unrecoverable state. The tools for fixing a btrfs partition are
> generally lacking in Fedora, and the tools that come with btrfs are also
> useless when the failing partition is your active root partition. I
> don't know if SUSE has better tools, but it's a huge problem with Fedora
> recoverability.
>

It's an HP Envy laptop, with no ability to overclock. I did upgrade the
memory from 8GB to 16GB when I first got it over 3 years ago, but it's
plain DDR4-3200.

As I previously mentioned, I let the HP diag tools run overnight and they
completed 14 cycles without any errors, and now I have just finished
letting Memtest86+ run for 5 complete cycles without any errors.

The only common denominator I have found so far is the two 6.9 kernels I
have installed.

Thanks,
Richard
--
_______________________________________________
users mailing list -- users@lists.fedoraproject.org
To unsubscribe send an email to users-le...@lists.fedoraproject.org
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/users@lists.fedoraproject.org
Do not reply to spam, report it: https://pagure.io/fedora-infrastructure/new_issue