rh0dium wrote:
> Michael Spencer wrote:
>>
>>  >>> def parse(source):
>>  ...     source = source.splitlines()
>>  ...     original, rest = source[0], "\n".join(source[1:])
>>  ...     return original, rest_eval(get_tokens(rest))
>
> This is a very clean and elegant way to separate them - Very nice!! I
> like this a lot - I will definitely use this in the future!!
>
>> Cheers
>>
>> Michael

On reflection, this simplifies further (to 9 lines), at least for the test cases you provide, which don't involve any nested parens:

 >>> import cStringIO, tokenize
 ...
 >>> def get_tokens2(source):
 ...     src = cStringIO.StringIO(source).readline
 ...     src = tokenize.generate_tokens(src)
 ...     return [token[1][1:-1] for token in src if token[0] == tokenize.STRING]
 ...
 >>> def parse2(source):
 ...     source = source.splitlines()
 ...     original, rest = source[0], "\n".join(source[1:])
 ...     return original, get_tokens2(rest)
 ...
 >>>

This matches your main function for the three tests where main works...

 >>> for source in sources[:3]: #matches your main function where it works
 ...     assert parse2(source) == main(source)
 ...
 Original someFunction
 Orig someFunction Results ['test', 'foo']
 Original someFunction
 Orig someFunction Results ['test foo']
 Original someFunction
 Orig someFunction Results ['test', 'test1', 'foo aasdfasdf', 'newline', 'test2']

...and handles the case where main fails (I think correctly, although I'm not entirely sure what your desired output is in this case):

 >>> parse2(sources[3])
 ('getVersion()', ['@(#)$CDS: icfb.exe version 5.1.0 05/22/2005 23:36 (cicln01) $'])
 >>>

If you really do need nested parens, then you'd need the slightly longer version I posted earlier.

Cheers

Michael
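
P.S. For anyone reading this on Python 3, where cStringIO no longer exists, here is a minimal sketch of the same idea using io.StringIO. The sample input is my own guess at the shape of the test data (the actual sources list is not shown in this thread), so treat it as illustrative only:

```python
import io
import tokenize


def get_tokens2(source):
    """Return the contents of each string literal in source, outer quotes stripped."""
    readline = io.StringIO(source).readline
    return [tok.string[1:-1]
            for tok in tokenize.generate_tokens(readline)
            if tok.type == tokenize.STRING]


def parse2(source):
    """Split off the first line, then collect the string literals from the rest."""
    lines = source.splitlines()
    original, rest = lines[0], "\n".join(lines[1:])
    return original, get_tokens2(rest)


# Hypothetical input: a name on the first line, quoted strings in parens after it.
sample = 'someFunction\n("test" "foo")'
print(parse2(sample))
```

As with the version above, the [1:-1] slice assumes plain single- or double-quoted strings. Triple-quoted or prefixed literals would need extra handling.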