bruns created this revision.
bruns added reviewers: Frameworks, astippich.
Herald added projects: Frameworks, Baloo.
Herald added subscribers: Baloo, kde-frameworks-devel.
bruns requested review of this revision.

REVISION SUMMARY
  Currently, both XML and SVG documents are indexed as plain text due to
  mimetype inheritance. This fills the content index with meaningless data
  (tags, attributes, attribute values, ...).

  Use QDomElement::text() for generic XML documents and <text/> nodes for
  SVG to extract the content. Also try to find Dublin Core metadata and add
  the relevant properties.

  Depends on D16488 <https://phabricator.kde.org/D16488>

REPOSITORY
  R286 KFileMetaData

BRANCH
  xml_extractor

REVISION DETAIL
  https://phabricator.kde.org/D16489

AFFECTED FILES
  src/extractors/CMakeLists.txt
  src/extractors/xmlextractor.cpp
  src/extractors/xmlextractor.h

To: bruns, #frameworks, astippich
Cc: kde-frameworks-devel, #baloo, ashaposhnikov, michaelh, astippich, spoorun, ngraham, bruns, abrahams
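
ILLUSTRATION
  The extraction approach described in the revision summary can be sketched
  roughly as follows. This is not the code from xmlextractor.cpp: it uses
  Python's standard xml.etree module as a stand-in for the Qt DOM API, and
  the function names, the whitespace normalization, and the returned data
  shapes are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

# Well-known namespace URIs for SVG and the Dublin Core element set.
SVG_NS = "http://www.w3.org/2000/svg"
DC_NS = "http://purl.org/dc/elements/1.1/"


def _squash(text: str) -> str:
    """Collapse whitespace runs (an assumption, not taken from the patch)."""
    return " ".join(text.split())


def extract_xml_text(source: str) -> str:
    """Generic XML: keep only character data, dropping tags and attributes.

    Roughly analogous to calling QDomElement::text() on the root element.
    """
    root = ET.fromstring(source)
    return _squash(" ".join(root.itertext()))


def extract_svg_text(source: str) -> list[str]:
    """SVG: collect character data from <text> elements only."""
    root = ET.fromstring(source)
    chunks = []
    for node in root.iter(f"{{{SVG_NS}}}text"):
        text = _squash("".join(node.itertext()))
        if text:
            chunks.append(text)
    return chunks


def extract_dublin_core(source: str) -> dict[str, list[str]]:
    """Collect Dublin Core elements (dc:title, dc:creator, ...) at any depth."""
    root = ET.fromstring(source)
    prefix = f"{{{DC_NS}}}"
    found: dict[str, list[str]] = {}
    for node in root.iter():
        if node.tag.startswith(prefix):
            value = _squash("".join(node.itertext()))
            if value:
                found.setdefault(node.tag[len(prefix):], []).append(value)
    return found
```

  With this sketch, an SVG file contributes only the strings inside its
  <text> elements and any Dublin Core fields to the index, while shapes,
  paths, and attribute values are ignored.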