On Mon, 24 Sep 2007 23:51:57 -0300, Robert Dailey <[EMAIL PROTECTED]> wrote:

> What I meant was that it's not an option because I'm trying to learn
> regular expressions. RE is just as built in as anything else.

Ok, let's analyze what you want. You have for instance this text:
"<action></action>"
which should become
"<action/>"

You have to match:
(opening angle bracket)(any word)(closing angle bracket)(opening angle
bracket)(slash)(same word as before)(closing angle bracket)

This translates rather directly into this regular expression:

r"<(\w+)></\1>"

where \w+ means "one or more alphanumeric characters or _", and
surrounding it with () creates a group (group number one), which is
back-referenced as \1 to express "same word as before".
The matched text should be replaced by
(opening <)(the word found)(slash)(closing >), that is: r"<\1/>"

Using the sub function in module re:

py> import re
py> source = """
... <root></root>
... <root/>
... <root><frame type="image"><action></action></frame></root>
... <root><frame type="image"><action/></frame></root>
... """
py> print(re.sub(r"<(\w+)></\1>", r"<\1/>", source))

<root/>
<root/>
<root><frame type="image"><action/></frame></root>
<root><frame type="image"><action/></frame></root>

Now, a more complex example, involving tags with attributes:
<frame type="image"></frame>  -->  <frame type="image" />

You have to match:
(opening angle bracket)(any word)(any sequence of words, spaces, other
symbols, but NOT a closing angle bracket)(closing angle
bracket)(opening angle bracket)(slash)(same word as before)(closing
angle bracket)

r"<(\w+)([^>]*)></\1>"

[^>] means "anything but a >", the * means "may occur many times, maybe
zero", and it's enclosed in () to create group 2.

py> source = """
... <root></root>
... <root><frame type="image"></frame></root>
... """
py> print(re.sub(r"<(\w+)([^>]*)></\1>", r"<\1\2 />", source))

<root />
<root><frame type="image" /></root>

Next step would be to allow whitespace wherever it is legal to appear -
left as an exercise to the reader.
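
For anyone who prefers a script to an interactive session, the second
pattern above can be wrapped up roughly like this. It is only a sketch:
the names EMPTY_ELEMENT and collapse_empty_elements are made up for
illustration, and it assumes the input looks like the samples above (no
whitespace or content between the opening and closing tag).

```python
import re

# Pattern from the message: a tag name (group 1), optional attribute
# text (group 2), then an immediately following closing tag that
# repeats the same name via the back-reference \1.
EMPTY_ELEMENT = re.compile(r"<(\w+)([^>]*)></\1>")


def collapse_empty_elements(text):
    """Rewrite <tag attrs></tag> as <tag attrs /> (hypothetical helper)."""
    return EMPTY_ELEMENT.sub(r"<\1\2 />", text)


source = '<root><frame type="image"><action></action></frame></root>'

# Inspect the groups that the replacement string relies on.
match = EMPTY_ELEMENT.search(source)
print(match.group(1))        # the tag name captured by (\w+)
print(repr(match.group(2)))  # the attribute text, possibly empty

print(collapse_empty_elements(source))
```

Only the innermost empty element is rewritten here, since the frame
element still contains something after the substitution. Whitespace
between the tags is not handled; that is still the exercise.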
Hint: use \s*

-- 
Gabriel Genellina