Frank Steinmetzger wrote:
> On Sun, Oct 08, 2023 at 07:44:06PM -0500, Dale wrote:
>
>> Just as a update.  The file system I was trying to do a file system
>> check on was my large one, about 40TBs worth.  While running the file
>> system check, it started using HUGE amounts of memory.  It used almost
>> all my 32GBs and most of swap as well.  It couldn't finish due to not
>> enough memory, it literally crashed itself.  So, I don't know if this
>> is because of some huge problem or what but if this is expected
>> behavior, don't try to do a file system check on devices that large
>> unless you have a LOT of memory.
>
> Or use a different filesystem. O:-)
I'm using ext4, which is said to be one of the most reliable and widely
used file systems.  I do wonder, tho: am I creating file systems that
may be too large, or that it just has trouble with?  I doubt that, but
I'm up to about 40TBs now.  I just can't figure out a way to split that
data up, yet.

>> I ended up recreating the LVM devices from scratch and redoing the
>> encryption as well.  I have backups tho.  This all started when using
>> pvmove to replace a hard drive with a larger drive.  I guess pvmove
>> isn't always safe.
>
> I think that may be a far-fetched conclusion. If it weren’t safe, it
> wouldn’t be in the software – or at least not advertised as safe.
> Well, something went sideways.

Honestly, I think it might not be pvmove but rather something that
happened with the file system itself.  After all, LVM wasn't complaining
at all, and everything showed the move completed with no errors.  I
guess it is possible pvmove had a problem, but given it was the file
system that complained so loudly, I'm leaning toward it having an issue.

>> P. S. I currently have my backup system on my old Gigabyte 770T mobo
>> and friends.  It is still a bit slower than copying when no encryption
>> is used so I guess encryption does slow things down a bit.  That said,
>> the CPU does hang around 50% most of the time.  htop doesn't show what
>> is using that so it must be IO or encryption.
>
> You can add more widgets (“meters”) to htop, one of them shows disk
> throughput. But there is none for I/O wait. One tool that does show
> that is glances. And also dstat which I mentioned a few days ago. Not
> only can dstat tell you the total percentage, but also which process is
> the most expensive one.
>
> I set up bash aliases for different use cases of dstat:
> alias ,d='dstat --time --cpu --disk -D $(ls /dev/sd? /dev/nvme?n? /dev/mmcblk? 2>/dev/null | tr "\n" ,) --net --mem --swap'
> alias ,dd='dstat --time --cpu --disk --disk-util -D $(ls /dev/sd? /dev/nvme?n? /dev/mmcblk? 2>/dev/null | tr "\n" ,) --mem-adv'
> alias ,dm='dstat --time --cpu --disk -D $(ls /dev/sd? /dev/nvme?n? /dev/mmcblk? 2>/dev/null | tr "\n" ,) --net --mem-adv --swap'
> alias ,dt='dstat --time --cpu --disk -D $(ls /dev/sd? /dev/nvme?n? /dev/mmcblk? 2>/dev/null | tr "\n" ,) --net --mem --swap --top-cpu --top-bio --top-io --top-mem'
>
> Because I attach external storage once in a while, I use a dynamic list
> of devices to watch that is passed to the -D argument. If I don’t use
> -D, dstat will only show a total for all drives.
>
> The first is a simple overview (d = dstat).
>
> The second is the same but only for disk statistics (dd = dstat disks).
> I use it mostly on my NAS (five SATA drives in total, which creates a
> very wide table).
>
> The third shows more memory details like dirty cache (dm = dstat
> memory), which is interesting when copying large files.
>
> And the last one shows the top “pigs”, i.e. expensive processes in
> terms of CPU, IO and memory (dt = dstat top).
>
>> Or something kernel related that htop doesn't show.  No idea.
>
> Perhaps my tool tips give you ideas. :)
>
> --
> Grüße | Greetings | Salut | Qapla’
> Please do not share anything from, with or about me on any social
> network.
> What is the difference between two flutes? – A semitone.

Dang, I have a lot of drives here to add to all that.  Bad thing is,
every time I reboot, all but two, I think, tend to move around, even tho
I haven't moved anything.  This is why I use either labels or UUIDs, by
the way.
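For anyone curious, this is roughly what I mean by labels or UUIDs.  The
UUID, mount point and options below are made up, just to show the shape
of an fstab line:

lsblk -o NAME,SIZE,LABEL,UUID,MOUNTPOINT   # what the kernel sees right now, with the stable IDs
blkid /dev/sd?                             # same idea, one line per device

# fstab then refers to UUID= (or LABEL=), so it doesn't matter if sdb
# turns into sdd after a reboot.  Example values only:
# UUID=0f3acd12-1111-2222-3333-444455556666   /home/backup   ext4   noatime   0 2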
Once, ages ago, I saw a way to make commands/scripts see all drives on a
system with some sort of inclusive trick.  I think it used brackets, but
I'm not sure.  I can't find that trick anymore; I should have saved that
thing.  I've put a guess at what it might have been in a P. S. below.

I used some command, can't recall which it was, and I think it is the
kernel itself using so much CPU time.  Given when it does it, I think it
is either processing the encryption or working to send the data to the
disks, or both.  I'd suspect both, but I dunno.

Anyway, I'm restoring from a fresh LVM rebuild now.  No way to test
anything to see what the problem was now.

Dale

:-)  :-)
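P. S.  Thinking about it some more, the "bracket" trick I half-remember
was probably just plain shell globbing or brace expansion.  I'm not sure
this is the exact thing I saw, but something along these lines picks up
every drive without naming each one:

ls /dev/sd[a-z]                          # a glob only expands to devices that actually exist
ls /dev/sd{a..z} 2>/dev/null             # brace expansion generates all 26 names, so hide the misses
for d in /dev/sd?; do echo "$d"; done    # loop over whatever is plugged in right now
lsblk -dno NAME,SIZE                     # or just ask the kernel for the list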
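P. P. S.  On the mystery CPU usage: I haven't actually tried this on the
backup box, so treat it as a guess, but the usual way to tell I/O wait
apart from the dm-crypt kernel threads seems to be something like:

iostat -c 2 3                               # %iowait column; iostat comes with sysstat
ps -eLo pid,comm,%cpu --sort=-%cpu | head   # kernel threads (kworker/*, dmcrypt_write, ...) show up here

And htop can apparently show kernel threads too, if "Hide kernel
threads" is unticked in its setup screen.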