Nico Grubert wrote:

> I'd like to split a string where 'and', 'or', 'and not' occurs.
>
> Example string:
> s = 'Smith, R. OR White OR Blue, T. AND Black AND Red AND NOT Green'
>
> I need to split s in order to get this list:
> ['Smith, R.', 'White', 'Blue, T.', 'Black', 'Red', 'Green']
>
> Any idea, how I can split a string where 'and', 'or', 'and not' occurs?

try re.split:

    >>> s = 'Smith, R. OR White OR Blue, T. AND Black AND Red AND NOT Green'
    >>> import re
    >>> re.split("AND NOT|AND|OR", s) # look for longest first!
    ['Smith, R. ', ' White ', ' Blue, T. ', ' Black ', ' Red ', ' Green']

to get rid of the whitespace, you can either use strip

    >>> [w.strip() for w in re.split("AND NOT|AND|OR", s)]
    ['Smith, R.', 'White', 'Blue, T.', 'Black', 'Red', 'Green']

or tweak the split pattern somewhat (note the raw string, so that \s
reaches the regex engine as written):

    >>> re.split(r"\s*(?:AND NOT|AND|OR)\s*", s)
    ['Smith, R.', 'White', 'Blue, T.', 'Black', 'Red', 'Green']

to make the split case insensitive (so it matches "AND" as well as "and"
and "AnD" and any other combination), prepend (?i) to the pattern:

    >>> re.split(r"(?i)\s*(?:and not|and|or)\s*", s)
    ['Smith, R.', 'White', 'Blue, T.', 'Black', 'Red', 'Green']

to keep the separators, change (?:...) to (...):

    >>> re.split(r"(?i)\s*(and not|and|or)\s*", s)
    ['Smith, R.', 'OR', 'White', 'OR', 'Blue, T.', 'AND', 'Black', 'AND', 'Red', 'AND NOT', 'Green']

hope this helps!

</F>

--
http://mail.python.org/mailman/listinfo/python-list
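
one caveat with the case-insensitive patterns above: they also match "and" and "or" inside ordinary words, so a name such as "Anderson" or "Ford" would be cut apart. a minimal sketch of one way around that, assuming Python 3, is to wrap the keywords in \b word boundaries. the names SEPARATOR and split_terms are made up for this example and do not come from the post; allowing any run of whitespace between "and" and "not" is likewise an added assumption.

```python
import re

# Hypothetical names for illustration only (not from the original post).
# \b stops the keywords from matching inside words like "Anderson" or "Ford".
# and\s+not comes first so that "AND NOT" wins over plain "AND".
SEPARATOR = re.compile(r"\s*\b(?:and\s+not|and|or)\b\s*", re.IGNORECASE)


def split_terms(query):
    """Split a query string on AND NOT / AND / OR, ignoring case."""
    return SEPARATOR.split(query)


s = 'Smith, R. OR White OR Blue, T. AND Black AND Red AND NOT Green'
print(split_terms(s))
# ['Smith, R.', 'White', 'Blue, T.', 'Black', 'Red', 'Green']

print(split_terms('Anderson OR Ford'))
# ['Anderson', 'Ford']
```

compiling the pattern once is only a convenience for repeated use; passing the same raw string and re.IGNORECASE straight to re.split should behave the same way.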