[ https://issues.apache.org/jira/browse/TIKA-3340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17311776#comment-17311776 ]
Arky commented on TIKA-3340:
----------------------------

[~tallison] This n-gram profile was created using a corpus of modern Burmese compiled by John Okell (circa 1990s).

That sounds good.

Perhaps if you bring me up to speed on extending the OpenNLP model to include Burmese, I would be able to help with other South/Southeast Asian languages.

This patch was created to support the downstream project https://github.com/ICIJ/datashare/issues/781

> LanguageProfile for Myanmar
> ---------------------------
>
>                 Key: TIKA-3340
>                 URL: https://issues.apache.org/jira/browse/TIKA-3340
>             Project: Tika
>          Issue Type: Improvement
>          Components: languageidentifier
>            Reporter: Arky
>            Priority: Major
>
> A language profile for detecting Myanmar/Burmese (my).

--
This message was sent by Atlassian Jira
(v8.3.4#803005)