Kay Schluehr wrote: > I have a list of strings ls = [s_1,s_2,...,s_n] and want to create a > regular expression sx from it, such that sx.match(s) yields a SRE_Match > object when s starts with an s_i for one i in [0,...,n].
Why do you want to use a regex for this? When you have constant strings there are other solutions. For example, this uses Nicolas Lehuen's pytst module from http://nicolas.lehuen.com/index.php/Pytst >>> import tst >>> tree = tst.TST() >>> tree["aa"] = (1, "String with 'aa'") >>> tree["aahhh"] = (2, "At the doctor's") >>> tree["a"] = (3, "indefinite article") >>> tree.scan("This is a bunch of text. Have you seen an aardvark?", ... tst.TupleListAction()) [('This is ', -8, None), ('a', 1, (3, 'indefinite article')), (' bunch of text. H', -18, None), ('a', 1, (3, 'indefinite article')), ('ve you seen ', -12, None), ('a', 1, (3, 'indefinite article')), ('n ', -2, None), ('aa', 2, (1, "String with 'aa'")), ('rdv', -3, None), ('a', 1, (3, 'indefinite article')), ('rk?', -3, None)] >>> Each returned tuple has three elements: For a substring which is not a match these are: - the unmatched substring - the negative length of the substring - None For a substring which matches: - the matched substring - the positive length of the substring - the value of the match in the TST (the tree) It uses Aho-Corasick for the implementation which is fast and does what you expect it to do. Nor does it have a problem of matching more than 99 possible strings as the regexp approach may have. Andrew [EMAIL PROTECTED] -- http://mail.python.org/mailman/listinfo/python-list