poboiko created this revision.
poboiko added reviewers: Frameworks, Baloo.
Herald added projects: Frameworks, Baloo.
Herald added a subscriber: kde-frameworks-devel.
poboiko requested review of this revision.

REVISION SUMMARY
  Currently, after indexing is done, there is a large difference between `Actual Size` and `Expected Size` in `balooctl indexSize` (I would estimate it at ~30%, based on personal experience).

  The difference comes from how `LMDB` works: the database file never shrinks, and when we ask it to remove or replace something, the freed pages go to the freelist (i.e. unused empty pages waiting to be reused). The wasted space is indeed the freelist (this can be checked via `mdb_stat -nf ~/.local/share/baloo/index`; see `Free pages`).

  `LMDB` does not allow compacting the DB and removing all free pages "in situ", but it does allow creating a copy of the DB without the freelist (see `mdb_env_copy2` with the `MDB_CP_COMPACT` flag). Because this creates another copy (which might take up a lot of space), I would not suggest doing it automatically; instead, I propose adding a way to do it manually.

  This patch adds a `copy` command to the `balooctl` tool that does this. It creates a backup of the index (either in a desired location or in a temporary directory), and its main use is to shrink the DB, as follows:

    $ balooctl copy /tmp/index
    $ balooctl stop
    $ mv /tmp/index ~/.local/share/baloo/index
    $ balooctl start

TEST PLAN
  It works as expected: before the copy, indexSize gave me ~500MB; after the copy, 327MB (funnily enough, the actual size after the copy is even smaller than the expected size, which is 389MB). The DB seems to be fine, i.e. searching works.

REPOSITORY
  R293 Baloo

BRANCH
  freepages (branched from master)

REVISION DETAIL
  https://phabricator.kde.org/D16876

AFFECTED FILES
  src/engine/database.cpp
  src/engine/database.h
  src/tools/balooctl/main.cpp

To: poboiko, #frameworks, #baloo
Cc: kde-frameworks-devel, ashaposhnikov, michaelh, astippich, spoorun, ngraham, bruns, abrahams