Re: Auto correction of typos

Brian F. Yulga Mon, 07 Mar 2011 10:00:21 -0800


shawn wilson wrote:

 On Mar 7, 2011 11:37 AM, "Brandon McCaig" <bamcc...@gmail.com>
 wrote:
>
> On Mon, Mar 7, 2011 at 5:22 AM, Ramprasad Prasad
> <ramprasad...@gmail.com>
 wrote:
>> 1) Create a hash of aliases for frequently used domains and their
>> typos For eg gmaill.com => gmail.com hotmal.com =>  hotmail.com
>> etc
>>
>> when I get the email id  with these typos , I will prompt the
>> user for correction  , If accepted then thats fine
>
> It's difficult to predict every possible typo that a user is going
> to make. They could have their hands off-by-one and type something
> completely different. Perhaps you could instead just store a list
> of common mail domains and warn the user to double-check if their
> E-mail address doesn't match any of them.
>


 Yeah. However if he's going to go that far, he might as well just
 check for an mx record for the domain.

 Also, that hash is going to get unwieldy pretty quick. If there's no
 db already, might want to use sqlite.

 Heh, too bad he didn't like my capcha of the email address for
 confirmation. I kinda think that was one of my better ideas. :)

I agree -- I think you would spend considerable time attempting to predict typos, when a simpler validation model could be written and implemented relatively quickly. For instance, I would tackle this problem with some straightforward client-side JavaScript. Before passing the form results back to the server for processing (and further validation if desired), you could use the prompt() or confirm() methods of the Window object. With prompt(), you could require the user to enter the address a second time. With confirm(), you could force them to inspect the address before continuing on with submission. I'm not sure if web bots could work through this logic and still spam you...
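A rough sketch of what I mean, not a finished implementation. The function names confirmEmail and retypeEmail, the form id "signup", and the input name "email" are all made up for illustration. The dialog function is passed in as a parameter so the logic can be exercised outside a browser; in a page you would pass window.confirm or window.prompt.

```javascript
// Hypothetical helper: returns true if the form should be submitted.
// `ask` is any function shaped like window.confirm (string -> boolean).
function confirmEmail(address, ask) {
  const trimmed = address.trim();
  if (trimmed === "") {
    return false; // nothing to confirm
  }
  return ask('You entered "' + trimmed + '". Is this address correct?');
}

// Hypothetical helper: makes the user type the address a second time.
// `askAgain` is any function shaped like window.prompt, which returns
// the typed string, or null if the user cancels the dialog.
function retypeEmail(address, askAgain) {
  const second = askAgain("Please re-enter your e-mail address:");
  return second !== null && second.trim() === address.trim();
}

// Browser wiring. Assumes a form with id "signup" containing an input
// named "email"; both names are assumptions for this sketch.
if (typeof document !== "undefined") {
  const form = document.getElementById("signup");
  if (form) {
    form.addEventListener("submit", function (event) {
      const address = form.elements["email"].value;
      if (!confirmEmail(address, window.confirm.bind(window))) {
        event.preventDefault(); // keep the user on the page to fix the address
      }
    });
  }
}
```

Swapping confirmEmail for retypeEmail (with window.prompt) in the submit handler gives the stricter "type it twice" variant. Either way the server should still validate the address, since client-side checks are easy to bypass.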

I'm not a big fan of my own suggestion, since encountering dialog boxes while browsing can be annoying to the user. But, I think most types of entry validation annoy users, even if it's for their own good. IMO, the CAPTCHA idea is probably the best "bang for the buck": it requires only moderate coding (some are just copy-paste code snippets) and gives you a high level of "human" authentication. The downside is that the CAPTCHA phrases are sometimes ambiguous, and the user might have to try more than once to submit.

A quick Google search revealed an article you may find useful, if you'd still like to tackle the problem with your original idea:

http://www.infoq.com/articles/lucene-did-you-mean

Granted, it's not a Perl solution, but it is the same programming logic you would need. Notice that it requires a suitable dictionary of reference words (e-mail domains for your case) so you would still have a lot of work to do...
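To give a feel for the "did you mean" logic, here is a minimal sketch in JavaScript (to match the client-side suggestion above; the same logic ports to Perl). It uses a plain Levenshtein edit distance, which is a simpler stand-in and not the Lucene spell checker the article describes. The domain list, the function names, and the distance threshold are assumptions for illustration only.

```javascript
// Illustrative reference list; a real one would be much longer.
const KNOWN_DOMAINS = ["gmail.com", "hotmail.com", "yahoo.com"];

// Levenshtein distance: the minimum number of single-character
// insertions, deletions, or substitutions needed to turn a into b.
function editDistance(a, b) {
  // prev[j] holds the distance between the prefix of a processed so far
  // and the first j characters of b. Start with the empty prefix of a.
  let prev = [];
  for (let j = 0; j <= b.length; j++) {
    prev.push(j);
  }
  for (let i = 1; i <= a.length; i++) {
    const curr = [i]; // first i chars of a vs. empty string: i deletions
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr.push(Math.min(
        prev[j] + 1,       // delete a character from a
        curr[j - 1] + 1,   // insert a character into a
        prev[j - 1] + cost // substitute, or keep a matching character
      ));
    }
    prev = curr;
  }
  return prev[b.length];
}

// Returns the closest known domain within maxDistance edits, or null
// if the domain is already known or nothing is close enough.
function suggestDomain(domain, known, maxDistance) {
  const typed = domain.toLowerCase();
  let best = null;
  let bestDistance = maxDistance + 1;
  for (const candidate of known) {
    const d = editDistance(typed, candidate);
    if (d === 0) {
      return null; // exact match, nothing to suggest
    }
    if (d < bestDistance) {
      best = candidate;
      bestDistance = d;
    }
  }
  return best;
}
```

With the list above, suggestDomain("gmaill.com", KNOWN_DOMAINS, 2) suggests "gmail.com", while an unrelated domain such as "example.org" returns null, so legitimate addresses at unlisted domains are left alone. The suggestion should only drive a prompt to the user, as in the original idea, never a silent rewrite.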


Good Luck!
Brian


--
To unsubscribe, e-mail: beginners-unsubscr...@perl.org
For additional commands, e-mail: beginners-h...@perl.org
http://learn.perl.org/
