On Jun 22, 2015, at 5:21 PM, Marc Selig <a29508-spamassas...@sedacon.com> wrote:
> On Mon, Jun 22, 2015 at 05:09:45PM -0400, Charles Sprickman wrote:
>
>> Are there any other options for filtering based on language, or any known
>> patches/fixes for TextCat to make it a bit less aggressive when it runs
>> across gibberish that is probably not any particular language?
>
> You could tinker with textcat_acceptable_score. Increasing it slightly
> (e.g. back to the old default of 1.05) seems to reduce those wild guesses.

I don’t quite follow what exactly this does, the explanation seems a bit
circular:

textcat_acceptable_score N (default: 1.05)

"Include any language that scores at least textcat_acceptable_score in the
returned list of languages"

I’m bumping it up to see what happens, I’m also lowering
"textcat_max_languages" to 3.

How can I get more info about what this plugin is doing into the headers?

Thanks,

Charles

>
> Regards,
>
> Marc
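
For reference, a rough sketch of how the two settings discussed above might look in local.cf. The 1.10 value is only an illustrative assumption, since the message does not say what the score was raised to. The add_header line is likewise an assumption about how the detected languages could be exposed: it relies on the _LANGUAGES_ template tag described in the Mail::SpamAssassin::Conf documentation, so check it against your installed version before relying on it.

```
# Assumes the TextCat plugin is already enabled via loadplugin in a *.pre file.

# Illustrative value only; the message says "bumping it up" without a number.
textcat_acceptable_score 1.10

# Lowered to 3, as described in the message.
textcat_max_languages 3

# Assumption: report the plugin's language guesses in an X-Spam-Languages header.
add_header all Languages _LANGUAGES_
```

Running spamassassin --lint after editing should flag any option the installed version does not recognize.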