[ 
https://issues.apache.org/jira/browse/TIKA-568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045311#comment-13045311
 ] 

Jan Høydahl commented on TIKA-568:
----------------------------------

An intermediate step might be to add the getDistance() method but mark it as 
experimental. That way we can start using it while staying aware that the 
actual value it returns may change in the future as the backend algorithm is 
improved.
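
A minimal sketch of what an experimental getDistance() could look like. It is
not taken from the attached patch. The @Experimental annotation and the
SketchLanguageIdentifier class are hypothetical stand-ins, not Tika APIs, and
the sketch assumes that a smaller distance means a better match.

```java
import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

/** Hypothetical marker annotation for unstable API; not part of Tika. */
@Documented
@Retention(RetentionPolicy.SOURCE)
@Target(ElementType.METHOD)
@interface Experimental {
    String value() default "";
}

/** Hypothetical stand-in for a language identifier; not the real Tika class. */
class SketchLanguageIdentifier {
    // Threshold value quoted in the issue description.
    private static final double CERTAINTY_LIMIT = 0.022;

    private final String language;
    private final double distance;

    SketchLanguageIdentifier(String language, double distance) {
        this.language = language;
        this.distance = distance;
    }

    String getLanguage() {
        return language;
    }

    /** Assumption: a smaller distance means a closer match to the language profile. */
    boolean isReasonablyCertain() {
        return distance < CERTAINTY_LIMIT;
    }

    /** Exposes the raw value so callers can apply their own threshold. */
    @Experimental("Scale may change as the detection algorithm is improved")
    double getDistance() {
        return distance;
    }
}
```

Callers who read getDistance() directly would then know from the annotation
that any threshold they pick may need retuning after an upgrade.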

> Language Detection isReasonablyCertain() hides valuable information
> -------------------------------------------------------------------
>
>                 Key: TIKA-568
>                 URL: https://issues.apache.org/jira/browse/TIKA-568
>             Project: Tika
>          Issue Type: Improvement
>            Reporter: Grant Ingersoll
>            Priority: Minor
>         Attachments: TIKA-568.patch
>
>
> LanguageIdentifier.isReasonablyCertain() hardcodes a threshold for language 
> detection, which is fine, except applications should be allowed to decide 
> what threshold suits them.  For instance, how was 0.022 decided?
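
One way an application might supply its own threshold, sketched under the
assumption that the detector exposes a raw distance where smaller is better.
CertaintyPolicy is a hypothetical helper and not part of Tika.

```java
/** Hypothetical helper that holds an application-chosen threshold; not part of Tika. */
final class CertaintyPolicy {
    private final double maxDistance;

    CertaintyPolicy(double maxDistance) {
        if (Double.isNaN(maxDistance) || maxDistance < 0) {
            throw new IllegalArgumentException("maxDistance must be a non-negative number");
        }
        this.maxDistance = maxDistance;
    }

    /** Assumption: a distance strictly below the limit counts as certain. */
    boolean accepts(double distance) {
        return distance < maxDistance;
    }
}
```

With this shape, new CertaintyPolicy(0.022) would mirror the hardcoded limit
mentioned above, while a stricter or looser application passes its own value.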

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
