Carl Banks wrote: > Andrew McLean wrote: >> I have a list of strings, A. I want to find a set of strings B such that >> for any "a in A" there exists "b in B" such that b is a sub-string of a. > > B=A? > >> But I also want to minimise T = sum_j t_j where >> t_j = count of the number of elements in A which have b[j] as a sub-string > > If there not elements in A that are substrings of any other element in > A, and if B=A, then t_j would be 1 for all j. Which means B=A would be > optimal (since elements of B have to be substring of at least one > element in A). It looks like the B={set of all elements in A that are > not a substring of any other element in A} is the generally optimal > solution. > > I suspect you mistyped or omitted something--problem is underspecified > at best.
You are quite right. I was trying to generalise my real problem and missed out a constraint. I also want to keep length(B) small. Unfortunately, I'm a bit unsure about the relative importance of T and length(B), which makes the problem rather ill defined. I'll have to give this a bit more thought.... -- http://mail.python.org/mailman/listinfo/python-list