Near Text Search

Bob Sneidar Wed, 26 Feb 2014 18:07:28 -0800

Hi all.

I’m trying to devise a way to implement a “near text search” when querying a 
MySQL database. My problem is that I get spreadsheet forms filled out by hand 
from our dispatcher, and he sometimes makes typos, just small ones, and I need 
to ensure there are no near-duplicate customer records in my application. So I 
need to query the database in such a way that I come up with the nearest 
neighbor. I could do this easily in FoxPro, because it provides an argument 
for it, but I’ve searched around and no one seems to be able to produce a 
nearest neighbor search for text! You can do it for numbers, just not text.


So now I’m trying to devise a way to convert a string to a number in such a way 
that two different strings would be extremely unlikely to produce the same 
number. So far I’ve come up with this:

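function textToNum theString
   -- fold case so "Smith" and "smith" give the same number
   put toLower(theString) into theString
   put 0 into theNum
   put 1 into theSeed
   repeat for each char theChar in theString
      -- weight each character code by its 1-based position
      put charToNum(theChar) into theAsciiCode
      add (theAsciiCode*theSeed) to theNum
      add 1 to theSeed
   end repeat
   return theNum
end textToNum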

The idea is that each character’s code is multiplied by a seed value 
representing its position in the string. However, I can foresee that it would 
be statistically possible to get pretty close, or even get a match, for two 
completely different strings. I *could* use a seed value equal to the number of 
lower case printable characters in the lower ASCII table, but that could 
produce HUGE numbers and I am afraid of overflows.
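
To make the collision worry concrete, here is a minimal sketch that checks two 
short strings by hand. The handler names positionWeightedSum and testCollision 
are made up for this illustration, and the function is just a self-contained 
copy of the position-weighted sum described above:

function positionWeightedSum theString
   -- hypothetical helper: sum of (character code * 1-based position)
   put toLower(theString) into theString
   put 0 into theNum
   put 1 into theSeed
   repeat for each char theChar in theString
      add (charToNum(theChar) * theSeed) to theNum
      add 1 to theSeed
   end repeat
   return theNum
end positionWeightedSum

on testCollision
   -- "ad": 97*1 + 100*2 = 297
   -- "cc": 99*1 +  99*2 = 297
   put positionWeightedSum("ad") & comma & positionWeightedSum("cc")
end testCollision

Both strings come out as 297, so even two-character strings that share no 
letters can land on the same number with this weighting.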

Any thoughts?

Bob