https://bugs.kde.org/show_bug.cgi?id=404057
Bug ID: 404057
Summary: Uses an insane amount of memory (RSS/PSS) and writes a *ton* of data while indexing
Product: frameworks-baloo
Version: 5.54.0
Platform: Debian unstable
OS: Linux
Status: REPORTED
Severity: normal
Priority: NOR
Component: Baloo File Daemon
Assignee: baloo-bugs-n...@kde.org
Reporter: mar...@lichtvoll.de
Target Milestone: ---

SUMMARY

I see that baloo_file_extractor easily uses 5 GiB or more of RSS (resident memory). The Proportional Set Size (PSS), which splits shared memory proportionally among the processes that share it, is almost as high. So it appears that the process uses almost all of that memory for itself.

STEPS TO REPRODUCE

1. Have it index a lot of files
2. Watch memory usage
3. If you like to kick it beyond any sanity:
   - have it go at the results of git clone https://github.com/danielmiessler/SecLists.git
   - here it eats the resources of a quite potent laptop with 16 GiB of RAM as if there was no tomorrow.

OBSERVED RESULT

Sample of smemstat -T:

 PID  Swap   USS       PSS       RSS       User    Command
4791  0,0 B  6136,7 M  6142,8 M  6169,7 M  martin  /usr/bin/baloo_file_extractor

 PID  Swap   USS       PSS       RSS       User    Command
4791  0,0 B  4595,1 M  4598,2 M  4617,6 M  martin  /usr/bin/baloo_file_extractor

Yes, there are times when Baloo even frees some memory again, just to use even more later on. Granted, this laptop has 16 GiB of RAM, but this still seems off to me. I also see the machine actually swapping out.

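
For anyone who wants to cross-check the smemstat figures without that tool, here is a minimal Python sketch. It assumes a Linux kernel that provides /proc/<pid>/smaps_rollup (4.14 or later) and permission to read it for the process in question. The function names are made up for illustration, and USS is approximated here as Private_Clean + Private_Dirty, which may differ slightly from what smemstat reports.

```python
from pathlib import Path


def parse_smaps_rollup(text):
    """Return {field_name: value_in_kB} for every 'Name:  123 kB' line."""
    fields = {}
    for line in text.splitlines():
        parts = line.split()
        # Data lines look like "Rss:  6317772 kB"; the first line of the
        # file (address range and "[rollup]") does not match and is skipped.
        if len(parts) == 3 and parts[0].endswith(":") and parts[2] == "kB":
            fields[parts[0][:-1]] = int(parts[1])
    return fields


def memory_summary_mib(fields):
    """Condense parsed fields into RSS, PSS, USS and Swap, all in MiB."""
    # Assumption: USS is taken as the private (unshared) pages only.
    uss_kb = fields.get("Private_Clean", 0) + fields.get("Private_Dirty", 0)
    return {
        "RSS": fields.get("Rss", 0) / 1024,
        "PSS": fields.get("Pss", 0) / 1024,
        "USS": uss_kb / 1024,
        "Swap": fields.get("Swap", 0) / 1024,
    }


def read_process_memory(pid):
    """Read and summarize /proc/<pid>/smaps_rollup for a running process."""
    text = Path(f"/proc/{pid}/smaps_rollup").read_text()
    return memory_summary_mib(parse_smaps_rollup(text))
```

Calling read_process_memory(4791) while the extractor is running would return the four values for the PID shown above. When PSS and USS are both close to RSS, as in the samples here, almost none of the memory is shared with other processes.
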
Also, the disk I/O it generates is beyond anything that I would consider remotely sane for a laptop or any desktop machine:

pidstat -p 4791 -d 1
Linux 5.0.0-rc4-tp520 (merkaba)  07.02.2019  _x86_64_  (4 CPU)

12:32:21   UID   PID    kB_rd/s    kB_wr/s  kB_ccwr/s  iodelay  Command
12:32:22  1000  4791   75736,00       0,00       0,00        4  baloo_file_extr
12:32:23  1000  4791   33348,00  111232,00       0,00        3  baloo_file_extr
12:32:24  1000  4791   54288,00       0,00       0,00        4  baloo_file_extr
12:32:25  1000  4791   20516,00  119616,00       0,00        2  baloo_file_extr
12:32:26  1000  4791   24296,00       0,00       0,00        2  baloo_file_extr
12:32:27  1000  4791   35532,00       0,00       0,00        3  baloo_file_extr
12:32:28  1000  4791   32548,00  113112,00       0,00        3  baloo_file_extr
12:32:29  1000  4791   26720,00       0,00       0,00        1  baloo_file_extr
12:32:30  1000  4791   24048,00  103496,00       0,00        6  baloo_file_extr
12:32:31  1000  4791    7636,00       0,00       0,00       71  baloo_file_extr
12:32:32  1000  4791   16208,00       0,00       0,00       36  baloo_file_extr
12:32:33  1000  4791   18048,00       0,00       0,00       67  baloo_file_extr
12:32:34  1000  4791   23236,00       0,00       0,00       63  baloo_file_extr
12:32:35  1000  4791   16700,00       0,00       0,00       61  baloo_file_extr
12:32:36  1000  4791   20736,00  122392,00       0,00       23  baloo_file_extr
12:32:37  1000  4791   26752,00       0,00       0,00       36  baloo_file_extr
12:32:38  1000  4791   42456,00       0,00       0,00        4  baloo_file_extr
12:32:39  1000  4791   25156,00  118104,00       0,00        2  baloo_file_extr
12:32:40  1000  4791   12828,00       0,00       0,00        1  baloo_file_extr
12:32:41  1000  4791   14512,00       0,00       0,00        3  baloo_file_extr
12:32:42  1000  4791    7384,00       0,00       0,00        0  baloo_file_extr
12:32:43  1000  4791    2316,00  420664,00       0,00        1  baloo_file_extr
12:32:44  1000  4791       0,00   56520,00       0,00        0  baloo_file_extr
12:32:45  1000  4791       0,00   75188,00       0,00        0  baloo_file_extr
12:32:46  1000  4791       0,00   55376,00       0,00        0  baloo_file_extr
12:32:47  1000  4791       0,00   64496,00       0,00       33  baloo_file_extr
12:32:48  1000  4791       0,00       0,00       0,00       85  baloo_file_extr
12:32:49  1000  4791       0,00       0,00       0,00       89  baloo_file_extr
12:32:50  1000  4791       0,00       0,00       0,00       86  baloo_file_extr
12:32:51  1000  4791      16,00       0,00       0,00       83  baloo_file_extr
12:32:52  1000  4791    2772,00     220,00       0,00       58  baloo_file_extr
12:32:53  1000  4791   28056,00       4,00       0,00        3  baloo_file_extr
12:32:54  1000  4791   81328,00       0,00       0,00        8  baloo_file_extr
12:32:55  1000  4791   71740,00       0,00       0,00        8  baloo_file_extr
12:32:56  1000  4791   46088,00       0,00       0,00        6  baloo_file_extr
12:32:57  1000  4791   44320,00       0,00       0,00        5  baloo_file_extr
12:32:58  1000  4791   29576,00       0,00       0,00        4  baloo_file_extr
12:32:59  1000  4791   41568,00       0,00       0,00        5  baloo_file_extr
12:33:00  1000  4791   31244,00       0,00       0,00        5  baloo_file_extr

12:33:00   UID   PID    kB_rd/s    kB_wr/s  kB_ccwr/s  iodelay  Command
12:33:01  1000  4791   23764,00       0,00       0,00        4  baloo_file_extr
12:33:02  1000  4791   24272,00       0,00       0,00        5  baloo_file_extr
12:33:03  1000  4791   19840,00       0,00       0,00        5  baloo_file_extr
12:33:04  1000  4791   22096,00       0,00       0,00        5  baloo_file_extr
12:33:05  1000  4791   14696,00       0,00       0,00        4  baloo_file_extr
12:33:06  1000  4791   14204,00       0,00       0,00        4  baloo_file_extr
12:33:07  1000  4791   12336,00       0,00       0,00        3  baloo_file_extr
12:33:08  1000  4791   23796,00       0,00       0,00        3  baloo_file_extr
12:33:09  1000  4791   21076,00       0,00       0,00        3  baloo_file_extr
12:33:10  1000  4791    8280,00  194116,00       0,00        2  baloo_file_extr
12:33:11  1000  4791     744,00  777584,00       0,00        4  baloo_file_extr

Yep, that is right: that is about 760 MiB written in a single second!

12:33:12  1000  4791     160,00       0,00       0,00       39  baloo_file_extr
12:33:13  1000  4791      16,00       0,00       0,00       90  baloo_file_extr
12:33:14  1000  4791       0,00       0,00       0,00       53  baloo_file_extr
12:33:15  1000  4791       0,00       0,00       0,00      139  baloo_file_extr
12:33:16  1000  4791       0,00       0,00       0,00      103  baloo_file_extr
12:33:17  1000  4791       0,00   29072,00       0,00       88  baloo_file_extr
12:33:18  1000  4791       0,00   70980,00       0,00       68  baloo_file_extr
^C
Durchschn.:  1000  4791   19701,54   42669,68       0,00       26  baloo_file_extr

Yes, the average write rate (the "Durchschn." row) is about 42 MiB/s! On the other hand, the index size does not grow at anywhere near that rate. So what does it actually write there? The index is currently at 9,48 GiB.

Now I have a gem here:

 PID  Swap   USS       PSS       RSS       User    Command
4791  0,0 B  8615,9 M  8617,1 M  8630,8 M  martin  /usr/bin/baloo_file_extractor

According to balooctl status, during that time it indexed:

[…]SecLists/Passwords/Common-Credentials/10-million-password-list-top-100000.txt: OK
[…]SecLists/Passwords/Common-Credentials/10-million-password-list-top-1000000.txt

Seriously, there are two things wrong with that:

- That file is *only* 8.2 MiB big
- There is never ever an excuse to use 8 GiB of RSS for file indexing.

I bet there should be a size limit on what it tries to grok. Baloo certainly should not try to index files which are several GiB big. And yes, I can tell it to exclude those, but then it is something else. In my opinion it is Baloo's responsibility to keep its resource usage in check.

So in short: recent Baloo basically manages to hog a ThinkPad T520 with a Sandy Bridge dual core, 16 GiB of RAM, and dual SSD BTRFS RAID 1. I did not see this before KDE Frameworks 5.54, at least not to this extent.

For now I let it run, in the hope that it eventually completes and stays quiet without me having to kill its processes, as it does not appear to respond to balooctl stop in a reasonable time either.

EXPECTED RESULT

More reasonable memory and I/O usage while indexing. Basically, Baloo should stay in the background. IMHO there is never ever an excuse for baloo_file_extractor to use 8 GiB or more of RSS. Never… ever…

SOFTWARE/OS VERSIONS

Linux: Debian Unstable
KDE Plasma Version: 5.14.5
KDE Frameworks Version: 5.54
Qt Version: 5.11.3