Tommaso Teofili created LUCENE-5548:
---------------------------------------
Summary: Improve flexibility and testability of the classification module
Key: LUCENE-5548
URL: https://issues.apache.org/jira/browse/LUCENE-5548
Project: Lucene - Core
Issue Type: Improvement
Components: modules/classification
Reporter: Tommaso Teofili
The flexibility and capabilities of the Lucene classification module could be
improved in the following ways:
- make it possible to use the classifiers "online" (or provide online versions
of them), so that when the underlying index (reader) is updated the classifier
does not need to be trained again to take newly added docs into account
- optionally pass a different Analyzer together with the text to be classified
(or pass a TokenStream directly) to specify custom tokenization/filtering
- normalize the score calculations of the existing classifiers
- provide accuracy and speed tests based on publicly available datasets
- add more Lucene-based classification algorithms
A specific subtask should be created for each of the above topics so that each
can be discussed in depth.
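
As a starting point for the score normalization item, here is a minimal sketch
of one possible approach: dividing each class's raw score by the sum of all
scores so that results from different classifiers fall in [0, 1] and sum to 1.
The class and method names (ScoreNormalizer, normalize) are hypothetical and
are not part of the classification module's API; the sketch also assumes raw
scores are non-negative.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Lucene: illustrates sum-to-one normalization.
public class ScoreNormalizer {

    // Returns a new map in which each score is divided by the total of all scores.
    // Assumes non-negative raw scores; if the total is zero, every class gets 0.0.
    public static Map<String, Double> normalize(Map<String, Double> rawScores) {
        double total = 0.0;
        for (double score : rawScores.values()) {
            total += score;
        }
        // LinkedHashMap keeps the classes in their original insertion order.
        Map<String, Double> normalized = new LinkedHashMap<>();
        for (Map.Entry<String, Double> entry : rawScores.entrySet()) {
            double value = total > 0.0 ? entry.getValue() / total : 0.0;
            normalized.put(entry.getKey(), value);
        }
        return normalized;
    }
}
```

Other schemes (for example, working in log space for probabilistic classifiers)
may fit better; which one to adopt is a question for the related subtask.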