gry@ll.mit.edu writes:
> I have a string like:
>     {'the','dog\'s','bite'}
> or maybe:
>     {'the'}
> or sometimes:
>     {}
>
> [FYI: this is postgresql database "array" field output format]
>
> which I'm trying to parse with the re module.
> A single quoted string would, I think, be:
>     r"\{'([^']|\\')*'\}"

What about {'dog \\', ...}? That string ends in an escaped backslash, so the
\' at the end is not an escaped quote.

If you don't need to validate anything, you can forget about the commas etc.
and extract all the 'strings' with findall. The regexp below is more
complicated than it needs to be (adapted from something else), but I think it
will work:

    In [90]: rex = re.compile(r"'(?:[^\n]|(?<!\\)(?:\\)(?:\\\\)*\n)*?(?<!\\)(?:\\\\)*?'")

    In [91]: rex.findall(r"{'the','dog\'s','bite'}")
    Out[91]: ["'the'", "'dog\\'s'", "'bite'"]

Otherwise, just add something like ",|}$" to deal with the final } instead of
a comma.

Alternatively, you could write a regexp to split on the "','" bit and trim
the first and the last piece.

'as
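
For comparison, here is a shorter sketch of the same findall idea. It is not the pattern from the session above. It assumes the format is exactly as shown in the question: single-quoted elements, with backslash as the only escape character. The names ELEMENT and parse_array are made up for illustration. Each element is either a run of ordinary characters or a backslash followed by any one character, so a trailing escaped backslash is consumed as a pair and cannot be mistaken for an escaped quote.

```python
import re

# Hypothetical helper names. Assumes single-quoted elements with backslash
# as the only escape character, as in the question's examples.
# An element body is any mix of:
#   [^'\\]  a character that is neither a quote nor a backslash
#   \\.     a backslash followed by exactly one (escaped) character
ELEMENT = re.compile(r"'((?:[^'\\]|\\.)*)'")

def parse_array(text):
    """Return the unescaped elements found in an array-style string."""
    # findall returns the captured body of each quoted element;
    # the substitution then drops the backslash from each escape pair.
    return [re.sub(r"\\(.)", r"\1", raw) for raw in ELEMENT.findall(text)]

print(parse_array(r"{'the','dog\'s','bite'}"))  # ['the', "dog's", 'bite']
print(parse_array(r"{'dog \\','x'}"))           # ['dog \\', 'x']
print(parse_array("{}"))                        # []
```

Like the findall approach above, this does no validation: commas, braces and anything else outside the quotes are simply ignored.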