Re: need some kind of "coherence index" for a group of strings

2016-11-04 Thread duncan smith
On 03/11/16 16:18, Fillmore wrote: > > Hi there, apologies for the generic question. Here is my problem let's > say that I have a list of lists of strings. > > list1:#strings are sort of similar to one another > > my_nice_string_blabla > my_nice_string_blqbli > my_nice_string_bl0bla >

Re: need some kind of "coherence index" for a group of strings

2016-11-03 Thread Mario R. Osorio
I don't know much about these topics, but wouldn't Soundex do the job? On Thursday, November 3, 2016 at 12:18:19 PM UTC-4, Fillmore wrote: > Hi there, apologies for the generic question. Here is my problem let's > say that I have a list of lists of strings. > > list1:#strings are sort of s

Re: need some kind of "coherence index" for a group of strings

2016-11-03 Thread jladasky
On Thursday, November 3, 2016 at 3:47:41 PM UTC-7, jlad...@itu.edu wrote: > On Thursday, November 3, 2016 at 1:09:48 PM UTC-7, Neil D. Cerutti wrote: > > you may also be > > able to use some items "off the shelf" from Python's difflib. > > I wasn't aware of that module, thanks for the tip! > > d

Re: need some kind of "coherence index" for a group of strings

2016-11-03 Thread Fillmore
On 11/3/2016 6:47 PM, jlada...@itu.edu wrote: On Thursday, November 3, 2016 at 1:09:48 PM UTC-7, Neil D. Cerutti wrote: you may also be able to use some items "off the shelf" from Python's difflib. I wasn't aware of that module, thanks for the tip! difflib.SequenceMatcher.ratio() returns a nu

Re: need some kind of "coherence index" for a group of strings

2016-11-03 Thread jladasky
On Thursday, November 3, 2016 at 1:09:48 PM UTC-7, Neil D. Cerutti wrote: > you may also be > able to use some items "off the shelf" from Python's difflib. I wasn't aware of that module, thanks for the tip! difflib.SequenceMatcher.ratio() returns a numerical value which represents the "similari
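
One way the difflib suggestion could be turned into a single number per group is to average SequenceMatcher.ratio() over every pair of strings in the group. This is only a sketch of that idea, not something proposed verbatim in the thread: the name coherence_index and the averaging scheme are assumptions, and the "dissimilar" list is made-up illustration data, since the original list2 is not shown in full here.

```python
from difflib import SequenceMatcher
from itertools import combinations


def coherence_index(strings):
    """Mean pairwise SequenceMatcher ratio for a group of strings.

    Hypothetical helper: returns a float in [0.0, 1.0], where 1.0 means
    every pair of strings is identical. Groups with fewer than two
    strings are treated as fully coherent.
    """
    pairs = list(combinations(strings, 2))
    if not pairs:
        return 1.0
    total = sum(SequenceMatcher(None, a, b).ratio() for a, b in pairs)
    return total / len(pairs)


# list1 from the original post
similar = [
    "my_nice_string_blabla",
    "my_nice_string_blqbli",
    "my_nice_string_bl0bla",
    "my_nice_string_aru",
]

# Made-up stand-in for a group of unrelated strings (an assumption)
dissimilar = ["apple_pie", "zx81", "QWERTY", "hello world"]

print(coherence_index(similar))
print(coherence_index(dissimilar))
```

The number of pairs grows quadratically with the group size, so for large groups it may be worth comparing each string against one reference string, or sampling pairs, instead.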

Re: need some kind of "coherence index" for a group of strings

2016-11-03 Thread Neil D. Cerutti
On 11/3/2016 1:49 PM, jlada...@itu.edu wrote: The Levenshtein distance is a very precise definition of dissimilarity between sequences. It specifies the minimum number of single-element edits you would need to change one sequence into another. You are right that it is fairly expensive to com

Re: need some kind of "coherence index" for a group of strings

2016-11-03 Thread jladasky
The Levenshtein distance is a very precise definition of dissimilarity between sequences. It specifies the minimum number of single-element edits you would need to change one sequence into another. You are right that it is fairly expensive to compute. But you asked for an algorithm that would
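
For reference, the definition above (minimum number of single-element insertions, deletions, or substitutions) can be computed with the standard dynamic-programming approach. This is a minimal pure-Python sketch, not code from the thread; the function name levenshtein is an assumption. Its cost is proportional to len(a) * len(b), which is where the expense mentioned above comes from.

```python
def levenshtein(a, b):
    """Levenshtein distance between two sequences.

    Counts the minimum number of single-element insertions, deletions,
    and substitutions needed to turn `a` into `b`. Only two rows of the
    dynamic-programming table are kept in memory at a time.
    """
    # Keep the shorter sequence in the inner loop to limit row length.
    if len(a) < len(b):
        a, b = b, a

    # Distance from the empty prefix of `a` to each prefix of `b`.
    previous = list(range(len(b) + 1))

    for i, item_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty prefix of b
        for j, item_b in enumerate(b, start=1):
            substitution_cost = 0 if item_a == item_b else 1
            current.append(min(
                previous[j] + 1,                      # delete item_a
                current[j - 1] + 1,                   # insert item_b
                previous[j - 1] + substitution_cost,  # substitute or match
            ))
        previous = current

    return previous[-1]


# Two of the strings from the original post differ by two substitutions.
print(levenshtein("my_nice_string_blabla", "my_nice_string_blqbli"))  # 2
```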

Re: need some kind of "coherence index" for a group of strings

2016-11-03 Thread justin walters
On Thu, Nov 3, 2016 at 9:18 AM, Fillmore wrote: > > Hi there, apologies for the generic question. Here is my problem let's say > that I have a list of lists of strings. > > list1:#strings are sort of similar to one another > > my_nice_string_blabla > my_nice_string_blqbli > my_nice_stri

need some kind of "coherence index" for a group of strings

2016-11-03 Thread Fillmore
Hi there, apologies for the generic question. Here is my problem: let's say that I have a list of lists of strings.

list1: # strings are sort of similar to one another

my_nice_string_blabla
my_nice_string_blqbli
my_nice_string_bl0bla
my_nice_string_aru

list2: # strings are mostly