On Sep 25, 4:11 pm, "exhuma.twn" <[EMAIL PROTECTED]> wrote:
> Is it possible to calculate a distance between two chunks of text? I
> suppose one could simply do a simple word-count on the chunks
> (removing common noise words of course). And then go from there. Maybe
> even assigning different weighting to words. But maybe there is a
> well-tested and useful algorithm already available?

A good distance between two chunks of text is the number of changes you
have to make to one to transform it into the other. You should look at
'difflib', with which you should be able to code up this sort of
distance (although the details will depend on what your text looks
like).

-- 
Paul Hankin
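
A rough sketch of the difflib idea above. The function names
text_distance and edit_count, and the choice to compare whitespace-split
words rather than characters or lines, are assumptions made for
illustration and are not from the original post. Note that
SequenceMatcher.ratio() is a similarity score, so 1 - ratio() is only a
convenient distance-like number, not a true edit distance.

```python
import difflib


def text_distance(a, b):
    """Distance-like score in [0.0, 1.0]; 0.0 means identical word sequences."""
    # Assumption: compare at word level by splitting on whitespace.
    matcher = difflib.SequenceMatcher(None, a.split(), b.split())
    return 1.0 - matcher.ratio()


def edit_count(a, b):
    """Count the words touched by replace/insert/delete operations."""
    matcher = difflib.SequenceMatcher(None, a.split(), b.split())
    changes = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # A replaced span counts as the longer of its two sides.
            changes += max(i2 - i1, j2 - j1)
    return changes


if __name__ == "__main__":
    print(text_distance("the quick brown fox", "the quick red fox"))
    print(edit_count("the quick brown fox", "the quick red fox"))
```

Depending on what the text looks like, you might lowercase it, strip
punctuation, or drop noise words before splitting; comparing lines or
characters instead of words works the same way.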