bruns added a comment.
In D23787#541963 <https://phabricator.kde.org/D23787#541963>, @poboiko wrote:

> In D23787#537891 <https://phabricator.kde.org/D23787#537891>, @bruns wrote:
>
> > Can you please provide an example which:
> >
> > - is currently indexed though it should be skipped due to size
> > - is skipped after this change
>
> Sure. Any mimetype inherited from "text/plain", but starting with "text/" counts. I've made an actual list:
> F7515259: list.txt <https://phabricator.kde.org/F7515259>
> (using simple python script, which iterates over `QMimeDatabase().allMimeTypes()`, checks if `type.inherits("text/plain")` and is not already excluded by default Baloo config from `file/fileexcludefilters.cpp`)

Your script is wrong. E.g. SVG inherits from text/plain, but it has its own extractor and thus is not fed to the PlaintextExtractor. Ditto for anything inheriting from XML.

REPOSITORY
  R293 Baloo

REVISION DETAIL
  https://phabricator.kde.org/D23787

To: poboiko, #baloo, bruns, ngraham
Cc: davidedmundson, broulik, kde-frameworks-devel, #baloo, hurikhan77, lots0logs, LeGast00n, fbampaloukas, GB_2, domson, ashaposhnikov, michaelh, astippich, spoorun, ngraham, bruns, abrahams