Tommaso Teofili created LUCENE-5548:
---------------------------------------

             Summary: Improve flexibility and testability of the classification module
                 Key: LUCENE-5548
                 URL: https://issues.apache.org/jira/browse/LUCENE-5548
             Project: Lucene - Core
          Issue Type: Improvement
          Components: modules/classification
            Reporter: Tommaso Teofili


The flexibility and capabilities of the Lucene classification module may be
improved with the following:
- make it possible to use the classifiers "online" (or provide online versions
of them), so that when the underlying index (reader) is updated the classifier
does not need to be trained again to take newly added docs into account
- optionally pass a different Analyzer together with the text to be classified
(or pass a TokenStream directly) to specify custom tokenization/filtering
- normalize score calculations across the existing classifiers
- provide accuracy and speed tests based on publicly available datasets
- add more Lucene-based classification algorithms
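
As a rough illustration of the score normalization item, the sketch below
rescales per-class raw scores so that they sum to 1 and become comparable
across classifiers. It is plain Java and does not use the classification
module's API; the class name ScoreNormalizer, the method normalize, and the
assumption that raw scores are non-negative are all hypothetical and only
serve the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper, not part of Lucene: rescales class scores to sum to 1. */
public class ScoreNormalizer {

    /**
     * Divides every raw score by the sum of all raw scores.
     * Assumes raw scores are non-negative; if the sum is not positive,
     * every class gets 0.0 rather than dividing by zero.
     */
    public static Map<String, Double> normalize(Map<String, Double> rawScores) {
        double sum = 0.0;
        for (double score : rawScores.values()) {
            sum += score;
        }
        // LinkedHashMap keeps the input iteration order of the classes.
        Map<String, Double> normalized = new LinkedHashMap<>();
        for (Map.Entry<String, Double> entry : rawScores.entrySet()) {
            double value = sum > 0.0 ? entry.getValue() / sum : 0.0;
            normalized.put(entry.getKey(), value);
        }
        return normalized;
    }

    public static void main(String[] args) {
        // Made-up raw scores for two made-up classes.
        Map<String, Double> raw = new LinkedHashMap<>();
        raw.put("sports", 3.0);
        raw.put("politics", 1.0);
        System.out.println(normalize(raw));
    }
}
```

Whether a simple sum-to-one rescaling is the right choice depends on how each
classifier produces its raw scores (for example, log-probabilities would first
need to be exponentiated), which is the kind of detail a dedicated subtask
would have to settle.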

Specific subtasks for each of the above topics should be created to discuss 
each of them in depth.


