https://bugs.kde.org/show_bug.cgi?id=472525

            Bug ID: 472525
           Summary: Deleted files not being removed from baloo's index
    Classification: Frameworks and Libraries
           Product: frameworks-baloo
           Version: 5.107.0
          Platform: Other
                OS: Linux
            Status: REPORTED
          Severity: normal
          Priority: NOR
         Component: Baloo File Daemon
          Assignee: baloo-bugs-n...@kde.org
          Reporter: tagwer...@innerjoin.org
  Target Milestone: ---

SUMMARY

    Files in a directory indexed by baloo can be deleted, but they do not
    disappear from the index.

    There have been various issues logged about this, with troubleshooting
    and potential root causes buried in the comments. The closest, without
    actually pinpointing the answer, is probably Bug 437754.

    The case reported here is where the user logs out or reboots before
    baloo has finished removing the entries.

STEPS TO REPRODUCE:

1...

    Make sure you have content indexing enabled and you are indexing your
    test directory

2...

    Create a test directory with many files, following the steps in Bug 437754:

        for i in {00001..50000}; do echo "This is file $i" > file$i.txt; done

    and watch to see that the files are indexed (it's possible that you'll need
    a "balooctl check").

    Delete 1,000 of these files:

        rm file20*.txt

3...

    Log out or reboot, log back in and check the search results:

        baloosearch file20 | wc

    wait a bit and try again....
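
    To see whether the count is going down between attempts, the search
    results can be compared against what is actually on disk. The following
    is a minimal sketch, not part of baloo: count_stale is a hypothetical
    helper, and it assumes baloosearch prints one absolute path per line.
    Lines that do not start with "/" are skipped in case the output contains
    anything other than paths.

```shell
#!/usr/bin/env bash
# count_stale (hypothetical helper): read paths from stdin and print how
# many of them no longer exist on disk. Only absolute paths are counted.
count_stale() {
    local path stale=0
    while IFS= read -r path; do
        case "$path" in
            /*) [ -e "$path" ] || stale=$((stale + 1)) ;;
        esac
    done
    printf '%s\n' "$stale"
}
```

    Running "baloosearch file20 | count_stale" after logging back in, and
    again a little later, would show whether any entries are being removed.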

OBSERVED RESULT:

    The deleted files don't get removed from the index. The information that
    they've been deleted and need to be removed from the index has been lost
    in the restart.

    Running "balooctl check" does not nudge baloo to remove the entries

EXPECTED RESULT:

    Three things:

        Baloo should remember that files have been deleted and resume removing
        entries after a restart

        A "balooctl check" should identify missing files and queue the entries
        for removal (it may be that baloo has missed the inotify messages that
        the files have been deleted)

        You should also be able to follow deletions with "balooctl monitor"

SOFTWARE/OS VERSIONS:

    This was tested on Fedora 38 (that has the BTRFS patch)

        Fedora Linux 38
        Plasma: 5.27.6
        Frameworks: 5.107.0
        Qt: 5.15.10

ADDITIONAL INFORMATION

    Baloo is slow removing deleted entries from its index and seems to commit
    a change to disc after removing each file. This is a lot slower (and far
    harder on an SSD) than content indexing, where batches of files are
    indexed and then committed.

    It's possible to script a clean-up, although it's a hack. You cannot get
    a complete list of the files baloo has in its index, so you have to ask
    for the file extensions you are interested in, and I don't seem able to
    get the handling of filenames with embedded spaces to work as it ought to:

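    baloosearch txt OR doc OR jpg OR jpeg OR png OR mp3 |
        sort -u |
        # keep only the paths that no longer exist on disk
        while read -r i; do
            if [ ! -e "$i" ]; then
                echo "$i"
            fi
        done |
        # each sed joins pairs of lines, so seven passes give batches of
        # up to 128 paths per line
        sed -e 'N;s/\n/ /' |
        sed -e 'N;s/\n/ /' |
        sed -e 'N;s/\n/ /' |
        sed -e 'N;s/\n/ /' |
        sed -e 'N;s/\n/ /' |
        sed -e 'N;s/\n/ /' |
        sed -e 'N;s/\n/ /' |
        # $line is deliberately unquoted so it splits back into paths; this
        # is also why paths containing spaces get broken up
        while read -r line; do balooctl clear $line; done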

    This calls "balooctl clear" with batches of "not there anymore" files.
    Each clear takes longer but the total disc writes are fewer.
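
    A possible variation on the same idea, offered as an untested sketch
    rather than a fix, is to let GNU xargs do the batching with newline as
    the only delimiter, so that paths with embedded spaces are passed through
    whole. The function names stale_paths and clear_stale are hypothetical,
    the batch size of 128 is just an example, and the -d and -r options
    assume GNU xargs. It would still break on filenames that contain
    newlines.

```shell
#!/usr/bin/env bash
# stale_paths (hypothetical helper): print the absolute paths read from
# stdin that no longer exist on disk; other lines are ignored.
stale_paths() {
    local path
    while IFS= read -r path; do
        case "$path" in
            /*) [ -e "$path" ] || printf '%s\n' "$path" ;;
        esac
    done
}

# clear_stale [BATCH [COMMAND...]] (hypothetical helper): run COMMAND on
# the stale paths from stdin, BATCH paths at a time. BATCH defaults to 128
# and COMMAND to "balooctl clear"; pass e.g. "echo" for a dry run.
clear_stale() {
    local batch="${1:-128}"
    if [ "$#" -gt 0 ]; then shift; fi
    [ "$#" -gt 0 ] || set -- balooctl clear
    # -d '\n' keeps each line as one argument, -r skips empty input
    stale_paths | sort -u | xargs -r -d '\n' -n "$batch" "$@"
}
```

    For example, "baloosearch txt OR doc | clear_stale 128 echo" would only
    print the batches, and dropping the trailing "echo" would pass them to
    "balooctl clear".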
