Hi guys, I'm using the Python library BeautifulSoup (BS) to parse some information out of a German soccer website. I spent some quality time with the BS docs, but I couldn't really figure out how to get what I was looking for.
Here's the deal: I want to parse the article shown on the website. To do so, I want to use the tag <div class="txt_fliesstext"> as a starting point. Once I have found that tag, I somehow want to get all the following "br" tags until a new CSS class comes up.

I tried several options in the findAll() call, but nothing seems to work:

    soup.findAll('br', attrs={'class':'txt_fliesstext'}, text=True)
        This one comes back with a thousand additional tags that I don't want.

    soup.findAll(attrs={'class':'txt_fliesstext'})
        This gives me a much better result, but I only get a few tags instead of all the tags I want.

Any suggestions? Thanks in advance!

Website: http://www.bundesliga.de/de/liga/news/2007/index.php?f=94820.php

Some HTML code from the website:

    <div id="area_headline">
    <div class="txt_headline_red">Erst Höhenflug, dann Absturz</div>
    </div>
    <div id="area_fliesstext">
    <div class="txt_fliesstext_bold">Mit 28 Punkten stand der KSC nach der Hinrunde sensationell auf Platz 6.</div>
    <br><br>
    <div class="txt_fliesstext">Doch in der Rückrunde brachen die Badener regelrecht ein und holten nur noch 15 Zähler.<br />
    <br />
    43 Punkte reichten am Ende für den 11. Tabellenplatz, ein mehr als respektables Ergebnis für einen Aufsteiger.<br />
    <br />
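
For reference, here is a minimal sketch of one way the lookup described above might be done. It rests on a few assumptions: it uses the newer bs4 package (from bs4 import BeautifulSoup) with Python's built-in "html.parser" rather than the older BeautifulSoup release whose findAll() is used above. The HTML constant is a hand-cleaned excerpt of the quoted markup with the missing closing tags added. The helper name article_paragraphs is made up for illustration. The idea is to find the div first and then read the text nodes inside it, rather than searching for the "br" tags themselves, since the "br" tags carry no class and no text of their own.

```python
from bs4 import BeautifulSoup

# Hand-cleaned excerpt of the markup quoted in the post; the two closing
# </div> tags at the end are an assumption, since the quoted snippet is cut off.
HTML = """
<div id="area_fliesstext">
<div class="txt_fliesstext_bold">Mit 28 Punkten stand der KSC nach der Hinrunde sensationell auf Platz 6.</div>
<br><br>
<div class="txt_fliesstext">Doch in der Rückrunde brachen die Badener regelrecht ein und holten nur noch 15 Zähler.<br />
<br />
43 Punkte reichten am Ende für den 11. Tabellenplatz, ein mehr als respektables Ergebnis für einen Aufsteiger.<br />
<br />
</div>
</div>
"""


def article_paragraphs(html):
    """Return the text chunks inside the first <div class="txt_fliesstext">."""
    soup = BeautifulSoup(html, "html.parser")
    # Matches the class token "txt_fliesstext" only, not "txt_fliesstext_bold".
    body = soup.find("div", attrs={"class": "txt_fliesstext"})
    if body is None:
        return []
    # stripped_strings yields each text node with surrounding whitespace
    # removed and skips the whitespace-only nodes between the <br /> tags.
    return list(body.stripped_strings)


paragraphs = article_paragraphs(HTML)
for paragraph in paragraphs:
    print(paragraph)
```

On the real page the div may contain nested tags (links, bold text), in which case the text would come back in more pieces than there are paragraphs. The sketch has only been checked against the excerpt above.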