Hi,

The data files for libexttextcat in this directory:

https://github.com/giuliopaci/libexttextcat/tree/master/langclass/ShortTexts

contain a garbled Hungarian version. It is almost in iso-8859-1, but some characters are destroyed because that encoding does not contain all Hungarian characters. It is easy to pick up a good utf-8 version from http://www.ohchr.org/EN/UDHR/Pages/Language.aspx?LangID=hng and see the difference.

It's not clear whether this prevents libexttextcat from classifying Hungarian text correctly, but it may stop classification from working on utf-8 input, because most of the other files are in utf-8.

Cheers,
Mark
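For anyone who wants to check a data file, here is a minimal Python sketch of the two tests implied above: whether a byte string is valid utf-8, and which characters of a text cannot be represented in iso-8859-1. The function names and the sample string are made up for illustration; they are not part of libexttextcat, and the sample is not taken from the ShortTexts file.

```python
from typing import List


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as utf-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def unrepresentable_in(text: str, encoding: str = "iso-8859-1") -> List[str]:
    """Return the sorted, distinct characters of text that encoding cannot hold."""
    bad = set()
    for ch in text:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            bad.add(ch)
    return sorted(bad)


# Illustrative Hungarian sample (an assumption, not the real data file).
sample = "egyenlő méltóság, gyönyörű"

# The letters with a double acute accent are the ones iso-8859-1 lacks.
lost = unrepresentable_in(sample)

# The same text is fully representable in iso-8859-2, but those bytes
# are not valid utf-8, so a utf-8 reader would reject or mangle them.
latin2_bytes = sample.encode("iso-8859-2")
utf8_bytes = sample.encode("utf-8")
```

Running `is_valid_utf8` over the raw bytes of each file in the directory would show which ones are not utf-8. It would not prove that a file is correct, only that it decodes.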
_______________________________________________
LibreOffice mailing list
LibreOffice@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/libreoffice