Susan wrote:
If my data looks like this:
word 1: 100 101 101 102 102 102 106 106
word 2: 101 104 106 110 113 129 131 148
word 3: 101 153 175 180 381
word 4: 106 110 113 122 131 137 142 148
word 5: 120 165 169
where word 1, 2, 3, 4, 5 represent different words and the numbers
represent different attributes of the words.
How can I calculate similarity between words?
I am assuming that the numbers are independent, so that 101 and 102 are
no more closely related than 101 and 175. That is probably a bad assumption,
because I see that an attribute can apply to the same word multiple times.
1. Per word, concatenate the chr() of the attribute values, to make a
string.
2. Calculate the Levenshtein distance (or edit distance) between the
strings.
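
The two steps above could be sketched roughly as follows. This is only a
minimal illustration, not a tested solution for real data: the names
%attrs, attrs_to_string and levenshtein are made up for this example, and
the edit distance is a plain hand-written dynamic-programming version.
A CPAN module such as Text::Levenshtein could probably be used instead.

```perl
use strict;
use warnings;
use List::Util qw(min);

# Attribute lists copied from the question.
my %attrs = (
    'word 1' => [100, 101, 101, 102, 102, 102, 106, 106],
    'word 2' => [101, 104, 106, 110, 113, 129, 131, 148],
    'word 3' => [101, 153, 175, 180, 381],
    'word 4' => [106, 110, 113, 122, 131, 137, 142, 148],
    'word 5' => [120, 165, 169],
);

# Step 1: one character per attribute value. Values above 255 become
# wide characters, which is fine as long as the strings are only
# compared and never printed.
sub attrs_to_string {
    my @values = @_;
    return join '', map { chr($_) } @values;
}

# Step 2: edit distance (insert, delete, substitute each cost 1),
# keeping only the previous row of the table in memory.
sub levenshtein {
    my ($s, $t) = @_;
    my @s = split //, $s;
    my @t = split //, $t;
    my @prev = (0 .. scalar @t);
    for my $i (1 .. scalar @s) {
        my @cur = ($i);
        for my $j (1 .. scalar @t) {
            my $cost = $s[$i - 1] eq $t[$j - 1] ? 0 : 1;
            push @cur, min($prev[$j] + 1,          # deletion
                           $cur[$j - 1] + 1,       # insertion
                           $prev[$j - 1] + $cost); # substitution
        }
        @prev = @cur;
    }
    return $prev[-1];
}

# Compare every pair of words once; a smaller distance means more similar.
my @names = sort keys %attrs;
for my $i (0 .. $#names) {
    for my $j ($i + 1 .. $#names) {
        my $d = levenshtein(attrs_to_string(@{ $attrs{ $names[$i] } }),
                            attrs_to_string(@{ $attrs{ $names[$j] } }));
        printf "%s vs %s: %d\n", $names[$i], $names[$j], $d;
    }
}
```

Note that the distance depends on the order of the attributes in each
list, and that longer lists tend to give larger distances, so you may
want to divide by the length of the longer string before comparing pairs.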
--
Ruud