[ https://issues.apache.org/jira/browse/TIKA-3343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17311844#comment-17311844 ]
Kenneth William Krugler commented on TIKA-3343:
-----------------------------------------------

[~tallison] - I didn't find the specific discussion about removing Tika's language detector (and switching to Optimaize fork, or OpenNLP). But I agree it makes sense. I'd also forgotten about our very lengthy discussion on https://issues.apache.org/jira/browse/TIKA-2790 :)

> Remove Tika custom lang detection for 2.x
> -----------------------------------------
>
>                 Key: TIKA-3343
>                 URL: https://issues.apache.org/jira/browse/TIKA-3343
>             Project: Tika
>          Issue Type: Task
>            Reporter: Tim Allison
>            Priority: Major
>
> In the back of my mind, this was an agreed upon change for 2.x. I can't find
> documentation, tho, so I'm opening this issue to discuss.
> My memory is that we agreed that we should outsource language id to other
> tools and remove our own lang ider for 2.x. If my memory is wrong, or if
> there's a good reason to keep our language detection algorithm and data,
> let's discuss.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)