parse HTML by class rather than tag

lorean2007 Thu, 22 Feb 2007 23:56:04 -0800

Hello,

I would like to parse an HTML file by its matching opening and
closing tags, but taking into account the class attributes and
their values:


<html>
<body>
...
<div class="one">
...
<div class="two">
</div>
...
</div>
...
<div class="one">...</div>
<a href="..." class="three">
</body>
</html>

In this example, I need all the content inside the div with class="two",
or only the content inside the divs with class="one".

So I am wondering whether I should use regular expressions (I do not
think so, since I would have to skip past the closing tag of any inner
div) or a simple parser. I searched and found
http://www.diveintopython.org/html_processing/basehtmlprocessor.html
but I would like the parser not to change anything at all (no
lowercasing).

Can you help?

Best.

-- 
http://mail.python.org/mailman/listinfo/python-list
