On 2006-09-10, Andrew McLean <[EMAIL PROTECTED]> wrote:
> This really an algorithm question more that a Python question,
> but it would be implemented in Python....

In that case, try comp.programming.  But still...

> I have a list of strings, A. I want to find a set of strings B
> such that for any "a in A" there exists "b in B" such that b is
> a sub-string of a. But I also want to minimise T = sum_j t_j
> where
>
> t_j = count of the number of elements in A which have b[j] as a
> sub-string
>
> My guess is that finding the smallest possible T satisfying the
> constraint would be hard. However, for my application just
> keeping it reasonably small would help.
>
> In my case the list A contains over two million addresses.

I'm afraid I don't understand your description.

How about an example of B for some A?

A = ["abb", "bbc"]
B = ["a", "ab", "abb", "b", "bb", "bbc", "bc", "c"]

Is that what you're after?

-- 
Neil Cerutti
-- 
http://mail.python.org/mailman/listinfo/python-list

Reply via email to