On Jun 18, 10:19 am, Robert Dodier <[EMAIL PROTECTED]> wrote:
> Hello,
>
> I'd like to split a string by commas, but only at the "top level" so
> to speak. An element can be a comma-less substring, or a
> quoted string, or a substring which looks like a function call.
> If some element contains commas, I don't want to split it.
>
> Examples:
>
> 'foo, bar, baz' => 'foo' 'bar' 'baz'
> 'foo, "bar, baz", blurf' => 'foo' 'bar, baz' 'blurf'
> 'foo, bar(baz, blurf), mumble' => 'foo' 'bar(baz, blurf)' 'mumble'
>
> Can someone suggest a suitable regular expression or other
> method to split such strings?
>
> Thank you very much for your help.
>
> Robert

You might look at the shlex module. It doesn't get you 100% of the way
there, but it's close:

>>> import shlex
>>> shlex.split('foo, bar, baz')
['foo,', 'bar,', 'baz']
>>> shlex.split('foo, "bar, baz", blurf')
['foo,', 'bar, baz,', 'blurf']
>>> shlex.split('foo, bar(baz, blurf), mumble')
['foo,', 'bar(baz,', 'blurf),', 'mumble']

Note that shlex splits on whitespace, so the commas stay attached to the
tokens, and it knows nothing about parentheses.

Using a regular expression will be tricky, especially if it is possible
to have recursive nesting (which, by definition, regular expressions
can't handle). For a real general-purpose solution you will need to
create a custom parser. There are a couple of modules out there that can
help you with that. pyparsing is one:

http://pyparsing.wikispaces.com/

Matt
-- 
http://mail.python.org/mailman/listinfo/python-list
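
For the three examples in the original question, a custom parser does not have to be large. Below is a minimal hand-written sketch, not a pyparsing solution: it scans the string once, tracking whether it is inside double quotes and how deeply it is nested in parentheses, and splits only on commas seen at depth zero outside quotes. The function name split_top_level is made up for illustration. It assumes double quotes only, no backslash escapes, and that top-level quotes should be dropped (as in the second example) while quotes inside a call are kept.

```python
def split_top_level(text, sep=","):
    """Split text on sep, ignoring sep inside double quotes or parentheses.

    Hypothetical helper: assumes double quotes only and no escape sequences.
    """
    parts = []
    current = []
    depth = 0          # current parenthesis nesting depth
    in_quotes = False  # True while inside a double-quoted string

    for ch in text:
        if ch == '"':
            in_quotes = not in_quotes
            # Drop top-level quotes, but keep quotes nested inside a call.
            if depth > 0:
                current.append(ch)
        elif in_quotes:
            # Everything inside quotes is literal, including commas and parens.
            current.append(ch)
        elif ch == "(":
            depth += 1
            current.append(ch)
        elif ch == ")":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced ')' in %r" % text)
            current.append(ch)
        elif ch == sep and depth == 0:
            # A top-level separator ends the current element.
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)

    if in_quotes or depth != 0:
        raise ValueError("unbalanced quotes or parentheses in %r" % text)
    parts.append("".join(current).strip())
    return parts
```

Applied to the third example, split_top_level('foo, bar(baz, blurf), mumble') returns ['foo', 'bar(baz, blurf)', 'mumble']. Because each element is stripped, leading or trailing whitespace inside a quoted element is lost as well; anything beyond these simple cases (escapes, single quotes, other bracket types) is where a parsing module starts to pay off.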