Re: Review Request 129703: [baloo_file_extractor] Limit CPU usage

Anthony Fieroni Tue, 27 Dec 2016 07:18:07 -0800


> On Dec. 27, 2016, 4:29 p.m., Michael Stemle wrote:
> > src/tools/balooctl/indexer.cpp, line 53
> > <https://git.reviewboard.kde.org/r/129703/diff/2/?file=488136#file488136line53>
> >
> >     This may be a dumb comment, but if there are multiple extractors, each 
> > potentially pulling metadata in a different way (say, one pulls 
> > demographics of the file, its type, its size, etc) and the other pulls 
> > metadata from the file itself, wouldn't we want that to be supported?
> >     
> >     This loop only appears to be running multiple extractions in the event 
> > that there are multiple extractors for the mime-type, each potentially 
> > sticking information into different parts of the `result`.
> >     
> >     Does that make sense? It may be a dumb point, but I'm curious to see 
> > where I'm wrong.

Look at the extractors -> 
https://github.com/KDE/kfilemetadata/tree/master/src/extractors 
Each one reports the mimetypes it supports, so for a well-known mimetype you 
will most likely get only one extractor. The dumb pass is fetching all 
extractors when the mimetype is unknown, e.g. the svg mimetype is 
"image/svg+xml" and there is no extractor for it, so we iterate over all 
available ones. Why? We could write more flexible code instead: get all 
extractors and test whether one of them can satisfy the mimetype's "inherit" 
rules, e.g. svg is text/plain, so it can be extracted via plaintextextractor.
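
A minimal sketch of that "inherit" lookup, in plain standard C++ so it stands 
alone. Everything in it is an assumption made for illustration: the 
`Extractor` struct, the `ParentMap` table and `findExtractors` are hypothetical 
names, not kfilemetadata or baloo API. Real code would more likely ask the MIME 
database for the parent types (Qt has `QMimeType::inherits()`) than keep its 
own table.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for an extractor plugin: a name plus the
// mimetypes it reports as supported.
struct Extractor {
    std::string name;
    std::vector<std::string> mimetypes;
};

// Hypothetical child -> parent mimetype table (assumption: one parent
// per type, which is simpler than a real MIME database).
using ParentMap = std::map<std::string, std::string>;

// Start at the exact mimetype and walk up through its parents. Return
// the extractors registered for the first type that has any, or an
// empty list if nothing in the chain is supported.
std::vector<const Extractor*> findExtractors(const std::vector<Extractor>& all,
                                             const ParentMap& parents,
                                             std::string mimetype)
{
    // Bound the walk so a cyclic table cannot loop forever.
    for (std::size_t depth = 0; depth <= parents.size(); ++depth) {
        std::vector<const Extractor*> matches;
        for (const Extractor& e : all) {
            for (const std::string& m : e.mimetypes) {
                if (m == mimetype) {
                    matches.push_back(&e);
                    break;
                }
            }
        }
        if (!matches.empty()) {
            return matches;
        }
        const auto it = parents.find(mimetype);
        if (it == parents.end()) {
            break; // no parent left to try
        }
        mimetype = it->second;
    }
    return {};
}
```

With a table that maps "image/svg+xml" to "application/xml" and that to 
"text/plain" (an assumed chain, used here only as an example), a lookup for 
"image/svg+xml" lands on the extractor registered for "text/plain" instead of 
iterating over every extractor.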

- Anthony

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://git.reviewboard.kde.org/r/129703/#review101591
-----------------------------------------------------------

On Dec. 27, 2016, 7:34 a.m., Anthony Fieroni wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://git.reviewboard.kde.org/r/129703/
> -----------------------------------------------------------
> 
> (Updated Dec. 27, 2016, 7:34 a.m.)
> 
> 
> Review request for Baloo and Vishesh Handa.
> 
> 
> Repository: baloo
> 
> 
> Description
> -------
> 
> Processing large directories (5000+ files) can eat a lot of CPU. A single 
> large file can be another issue.
> 
> 
> Diffs
> -----
> 
>   src/file/extractor/app.cpp 97332469 
>   src/tools/balooctl/indexer.cpp 45e42c1c 
> 
> Diff: https://git.reviewboard.kde.org/r/129703/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Anthony Fieroni
> 
>
