https://bugs.kde.org/show_bug.cgi?id=472525
Bug ID: 472525
Summary: Deleted files not being removed from baloo's index
Classification: Frameworks and Libraries
Product: frameworks-baloo
Version: 5.107.0
Platform: Other
OS: Linux
Status: REPORTED
Severity: normal
Priority: NOR
Component: Baloo File Daemon
Assignee: baloo-bugs-n...@kde.org
Reporter: tagwer...@innerjoin.org
Target Milestone: ---

SUMMARY

Files in a directory indexed by baloo can be deleted, but they do not disappear from the index.

There have been various issues logged about this, with troubleshooting and potential root causes buried in the comments. The closest, without actually pinpointing the answer, is probably Bug 437754.

The issue here is the case where the user logs out or reboots before baloo has finished removing the deleted files from its index.

STEPS TO REPRODUCE:

1... Make sure you have content indexing enabled and that you are indexing your test directory

2... Create a test directory with many files, following the steps in Bug 437754:

    for i in {00001..50000}; do echo "This is file $i" > file$i.txt; done

and watch to see that the files are indexed (it's possible that you'll need a "balooctl check"). Delete 1,000 of these files:

    rm file20*.txt

3... Log out or reboot, log back in and check the search results

    baloosearch file20 | wc

wait a bit and try again....

OBSERVED RESULT:

The deleted files don't get removed from the index. The information that they've been deleted and need to be removed from the index has been lost in the restart.
Running "balooctl check" does not nudge baloo to remove the entries.

EXPECTED RESULT:

Three things:

Baloo should remember that files have been deleted and resume removing entries after a restart.

A "balooctl check" should identify missing files and queue the entries for removal (it may be that baloo has missed the inotify messages that the files have been deleted).

You should also be able to follow deletions with "balooctl monitor".

SOFTWARE/OS VERSIONS:

This was tested on Fedora 38 (which has the BTRFS patch)

Fedora Linux 38
Plasma: 5.27.6
Frameworks: 5.107.0
Qt: 5.15.10

ADDITIONAL INFORMATION

Baloo is slow removing deleted entries from its index and seems to commit a change to disc after removing each file. This is a lot slower (and far harder on an SSD) than content indexing, where batches of files are indexed and then committed.

It's possible to script a clean-up, although it's a hack. You cannot get a complete list of the files baloo has in its index, so you have to ask for the file extensions you are interested in, and I don't seem able to get the handling of filenames with embedded spaces to work as it ought to:

    baloosearch txt OR doc OR jpg OR jpeg OR png OR mp3 | sort -u |
    while read i
    do
        if [ ! -e "$i" ]
        then
            echo $i
        fi
    done |
    sed -e 'N;s/\n/ /' | sed -e 'N;s/\n/ /' | sed -e 'N;s/\n/ /' |
    sed -e 'N;s/\n/ /' | sed -e 'N;s/\n/ /' | sed -e 'N;s/\n/ /' |
    sed -e 'N;s/\n/ /' |
    while read line; do balooctl clear $line; done

This calls "balooctl clear" with batches of "not there anymore" files. Each clear takes longer but the total disc writes are less.
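
The clean-up script loses filenames with embedded spaces because the paths are joined with spaces by sed and then word-split again when "$line" is expanded unquoted. The following is a minimal sketch of a variant that keeps each path as a single argument. It assumes that baloosearch prints one path per line, that no path contains a newline, and that "balooctl clear" accepts several paths per call (as the script above already relies on). The helper names filter_missing and run_in_batches are hypothetical and not part of Baloo; the batch size of 128 only mirrors the seven sed passes in the original.

```shell
# Print only those paths read from stdin (one per line) that no longer exist.
filter_missing() {
    # IFS= and -r keep leading/trailing blanks and backslashes intact.
    while IFS= read -r f; do
        [ -n "$f" ] || continue
        if [ ! -e "$f" ]; then
            printf '%s\n' "$f"
        fi
    done
}

# Usage: run_in_batches BATCH_SIZE COMMAND [ARGS...]
# Reads paths from stdin and runs COMMAND with up to BATCH_SIZE paths
# appended per call. Converting newlines to NUL bytes lets xargs treat
# each whole line as one argument, spaces included.
run_in_batches() {
    batch_size=$1
    shift
    # -r: do not run COMMAND at all when there is no input (GNU xargs).
    tr '\n' '\0' | xargs -0 -r -n "$batch_size" "$@"
}

# Intended use (not executed here):
#   baloosearch txt OR doc OR jpg OR jpeg OR png OR mp3 | sort -u |
#       filter_missing | run_in_batches 128 balooctl clear
```

As with the original, this only covers files matching the listed search terms, since there is no way to get a complete list of indexed files.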