Hello All, I have a Beautiful Soup question and I'd appreciate any guidance the forum can provide.
Let's say I have a file that looks like file.html, pasted below. My goal is to extract every element that matches either <p align="left"> or <div align="center">. The matches should appear in the same order as they do in the file, so the output would look like output.txt below.

I experimented with something similar to this code:

    for i in soup.findAll('p', align="left"):
        print(i)
    for i in soup.findAll('div', align="center"):
        print(i)

I get something like this, with all the <p> matches grouped ahead of the <div> matches instead of interleaved in file order:

    <p align="left">P4</p>
    <p align="left">P3</p>
    <p align="left">P1</p>
    <div align="center">div4b</div>
    <div align="center">div3b</div>
    <div align="center">div2b</div>
    <div align="center">div2a</div>

Any guidance would be greatly appreciated.

Best,
Ira

##########begin: file.html############
<html>
<body>
<p align="left">P1</p>
<p align="right">P2</p>
<div align="center">div2a</div>
<div align="center">div2b</div>
<p align="left">P3</p>
<div align="right">div3a</div>
<div align="center">div3b</div>
<div align="left">div3c</div>
<p align="left">P4</p>
<div align="left">div4a</div>
<div align="center">div4b</div>
</body>
</html>
##########end: file.html############

===================begin: output.txt===================
<p align="left">P1</p>
<div align="center">div2a</div>
<div align="center">div2b</div>
<p align="left">P3</p>
<div align="center">div3b</div>
<p align="left">P4</p>
<div align="center">div4b</div>
===================end: output.txt===================
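
For what it's worth, one possible way to get both kinds of element in file order is a single search with a filter function, rather than two separate searches whose results then have to be merged. The sketch below is only that, a sketch: it assumes Beautiful Soup 4 (the bs4 package) on Python 3, which is newer than the findAll-style code in the post, and the helper name wanted and the inlined HTML string are made up for illustration.

```python
from bs4 import BeautifulSoup

# Sample input copied from file.html (inlined so the sketch is self-contained).
HTML = """<html>
<body>
<p align="left">P1</p>
<p align="right">P2</p>
<div align="center">div2a</div>
<div align="center">div2b</div>
<p align="left">P3</p>
<div align="right">div3a</div>
<div align="center">div3b</div>
<div align="left">div3c</div>
<p align="left">P4</p>
<div align="left">div4a</div>
<div align="center">div4b</div>
</body>
</html>"""


def wanted(tag):
    """Hypothetical helper: True for <p align="left"> or <div align="center">."""
    return (tag.name == "p" and tag.get("align") == "left") or (
        tag.name == "div" and tag.get("align") == "center"
    )


soup = BeautifulSoup(HTML, "html.parser")

# find_all() accepts a function and walks the tree in document order, so one
# call returns the <p> and <div> matches interleaved as they appear in the file.
matches = [str(tag) for tag in soup.find_all(wanted)]

for line in matches:
    print(line)
```

Because the filter runs once per tag during a single traversal, no sorting or merging step is needed afterwards; adding another tag/attribute combination only means extending the condition in wanted.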