The following module was proposed for inclusion in the Module List:

  modid:       Encode::Guess::Educated
  DSLIP:       adpOp
  description: determine encoding based on language model
  userid:      TOMC (Tom Christiansen)
  chapterid:   13 (Internationalization_Locale)
  communities:

  similar:
    Encode::Guess Encode::Detect

  rationale:

    Damian suggested Encode::Infer. Brian liked Encode::Guess::Educated,
    which has the advantage of being a three-level name. I don’t much
    care, but Brian’s seems cool.

    My approach differs from all existing approaches because it uses a
    language model trained against three different very large
    English-language corpora. It correctly picks the encoding from
    among several possible 8-bit encodings in cases where the other
    modules fail miserably.

    I had originally thought to put this under Lingua::EN:: somewhere,
    but Damian convinced me that this was wrong. It works on
    English-language text only because I use English-language models
    by default. There is no reason the user could not supply their own
    training model for some other language and have it perform
    commensurately well on text in that language.

    I will make the mechanism for doing this clearer in the beta
    release.
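
To illustrate the general idea described in the rationale (decode the same bytes under each candidate 8-bit encoding and keep the decoding a language model finds most plausible), here is a minimal sketch. It is not the module's implementation or API: it is written in Python rather than Perl, it swaps in a toy character-frequency model for the corpus-trained language model, and the names `TRAINING_TEXT`, `CANDIDATES`, `train`, `score`, and `guess_encoding` are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical stand-in for a training corpus. The proposal describes models
# trained on very large English-language corpora; this is only a toy sample.
TRAINING_TEXT = (
    "The café’s résumé policy isn’t naïve; it’s a “fiancée-friendly” façade. "
    "Don’t worry, the jalapeño entrée costs £5 or €6."
)

# Assumed candidate list: three single-byte encodings known to Python.
CANDIDATES = ["cp1252", "mac_roman", "cp437"]


def train(text):
    """Build a smoothed character-frequency model from sample text."""
    counts = Counter(text)
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot reserves mass for unseen chars
    seen = {ch: math.log((n + 1) / (total + vocab)) for ch, n in counts.items()}
    unseen = math.log(1 / (total + vocab))
    return seen, unseen


def score(text, model):
    """Log-likelihood of text under the model; higher means more plausible."""
    seen, unseen = model
    return sum(seen.get(ch, unseen) for ch in text)


def guess_encoding(data, candidates, model):
    """Return the candidate whose decoding scores highest, or None."""
    best_name, best_score = None, None
    for name in candidates:
        try:
            text = data.decode(name)
        except UnicodeDecodeError:
            continue  # bytes are not valid in this encoding
        s = score(text, model)
        if best_score is None or s > best_score:
            best_name, best_score = name, s
    return best_name


MODEL = train(TRAINING_TEXT)

if __name__ == "__main__":
    sample = "It’s a café".encode("mac_roman")
    print(guess_encoding(sample, CANDIDATES, MODEL))
```

Supplying a model for another language would, under this sketch, amount to calling `train` on text in that language; how the module itself exposes this is not specified here.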

  enteredby:   TOMC (Tom Christiansen)
  enteredon:   Mon Mar  5 19:42:29 2012 GMT

The resulting entry would be:

Encode::Guess::
::Educated        adpOp determine encoding based on language model   TOMC


Thanks for registering,
-- 
The PAUSE

PS: The following links are only valid for module list maintainers:

Registration form with editing capabilities:
  https://pause.perl.org/pause/authenquery?ACTION=add_mod&USERID=20800000_0a090bfcd67d5620&SUBMIT_pause99_add_mod_preview=1
Immediate (one click) registration:
  https://pause.perl.org/pause/authenquery?ACTION=add_mod&USERID=20800000_0a090bfcd67d5620&SUBMIT_pause99_add_mod_insertit=1
Peek at the current permissions:
  https://pause.perl.org/pause/authenquery?pause99_peek_perms_by=me&pause99_peek_perms_query=Encode%3A%3AGuess%3A%3AEducated