On 10/22/2009 03:23 AM, Gabriel Genellina wrote:
> On Wed, 21 Oct 2009 15:14:32 -0300, <ru...@yahoo.com> wrote:
>
>> On Oct 21, 4:59 am, Bruno Desthuilliers
>> <bruno.42.desthuilli...@websiteburo.invalid> wrote:
>>> beSTEfar wrote:
>>> (snip)
>>> > When parsing strings, use Regular Expressions.
>>>
>>> And now you have _two_ problems <g>
>>>
>>> For some simple parsing problems, Python's string methods are powerful
>>> enough to make REs overkill. And for any complex enough parsing (any
>>> recursive construct for example - think XML, HTML, any programming
>>> language etc), REs are just NOT enough by themselves - you need a full
>>> blown parser.
>>
>> But keep in mind that many XML, HTML, etc parsing problems
>> are restricted to a subset where you know the nesting depth
>> is limited (often to 0 or 1), and for that large set of
>> problems, RE's *are* enough.
>
> I don't think so. Nesting isn't the only problem. RE's cannot handle
> comments, by example. And you must support unquoted attributes, single and
> double quotes, any attribute ordering, empty tags, arbitrary whitespace...
> If you don't, you are not reading XML (or HTML), only a specific file
> format that resembles XML but actually isn't.

OK, then let me rephrase my point as: in the real world it is often not
necessary to parse XML in its full generality; parsing, as you put it,
"a specific file format that resembles XML" is all that is really needed.
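
To make that concrete, here is a minimal sketch, assuming a hypothetical
file format in which every record is a flat, self-closing <item .../>
element with double-quoted attribute values, no comments, and no nesting.
The names ITEM_RE, ATTR_RE and parse_items are made up for illustration;
this is not a general XML parser and will misread input outside that
subset.

```python
import re

# Assumed (hypothetical) restricted format: flat, self-closing <item .../>
# elements, double-quoted attribute values, no comments, no nesting.
ITEM_RE = re.compile(r'<item\b([^<>]*?)/>')
ATTR_RE = re.compile(r'([A-Za-z_][\w.-]*)\s*=\s*"([^"]*)"')

def parse_items(text):
    """Return one dict of attributes per <item .../> element found in text."""
    return [dict(ATTR_RE.findall(m.group(1))) for m in ITEM_RE.finditer(text)]

sample = '''
<items>
  <item id="1" name="spam" />
  <item name="eggs"   id="2"/>
</items>
'''

# Attribute order and extra whitespace do not matter within this subset.
for item in parse_items(sample):
    print(item)
```

Note that an item inside a comment, such as <!-- <item id="3"/> -->, would
still be picked up as a real record, which is exactly the limitation
Gabriel describes. That is acceptable only when you know the producer of
the file never emits such input.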