On 2006-09-10, Andrew McLean <[EMAIL PROTECTED]> wrote: > This really an algorithm question more that a Python question, > but it would be implemented in Python....
In that case, try comp.programming. But still... > I have a list of strings, A. I want to find a set of strings B > such that for any "a in A" there exists "b in B" such that b is > a sub-string of a. But I also want to minimise T = sum_j t_j > where > > t_j = count of the number of elements in A which have b[j] as a > sub-string > > My guess is that finding the smallest possible T satisfying the > constraint would be hard. However, for my application just > keeping it reasonably small would help. > > In my case the list A contains over two million addresses. I'm afraid I don't understand your description. How about an example of B for some A? A = ["abb", "bbc"] B = ["a", "ab", "abb", "b", "bb", "bbc", "bc", "c"] Is that what you're after? -- Neil Cerutti -- http://mail.python.org/mailman/listinfo/python-list