On Sep 20, 9:04 pm, crybaby <[EMAIL PROTECTED]> wrote:
> I need to traverse an html page with a big table that has many rows and
> columns. For example, how to go to the 35th td tag and do a regex to retrieve
> the content. After that is done, you move down to the 15th td tag from
> the 35th tag (35+15) and do a regex to retrieve the content?

1) You can find your table using one of these methods:

   a) target_table = soup.find('table', id='car_parts')

   b) tables = soup.findAll('table')
      target_table = tables[2]

      The tables are put in a list in the order that they appear on
      the page.

2) You can get all the td's in the table using this statement:

   all_tds = target_table.findAll('td')

3) You can get the contents of the tags using these statements
   (list indexes start at 0, so the 35th td is index 34 and the
   50th, 35+15, is index 49):

   print all_tds[34].string
   print all_tds[49].string

Here is an example:

    from BeautifulSoup import BeautifulSoup

    doc = """
    <html>
    <head>
    <title></title>
    </head>
    <body>
    <table>
    </table>
    <table>
    <tr><td>hello</td></tr>
    <tr><td>world</td><td>goodbye</td></tr>
    </table>
    </body>
    </html>
    """

    soup = BeautifulSoup(doc)

    # findAll() returns the tables in page order; the second one is index 1
    tables = soup.findAll('table')
    target_table = tables[1]

    # all td's in that table, across all rows, in document order
    all_tds = target_table.findAll('td')
    print all_tds[0].string
    print all_tds[2].string

--output:--
hello
goodbye

-- 
http://mail.python.org/mailman/listinfo/python-list
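
P.S. The question also asked about running a regex on the td contents, which the steps above stop short of. Here is a minimal sketch of that part using only the standard re module. The list cell_texts is a hypothetical stand-in for [td.string for td in all_tds], and the price pattern and the extract() helper are made up for illustration, so substitute whatever you are actually looking for. Note that .string can be None when a td holds nested tags, so the helper checks for that.

```python
import re

# Hypothetical stand-in for [td.string for td in all_tds]; a real page
# would supply these values. None mimics a td that contains nested tags.
cell_texts = ["part: bolt", "price: $4.50", None, "price: $12.00"]

# Assumed pattern, for illustration only: a dollar amount such as $4.50.
price_re = re.compile(r"\$(\d+\.\d{2})")

def extract(cells, index, pattern):
    """Return the first captured group found in cells[index], or None."""
    text = cells[index]
    if text is None:
        return None
    match = pattern.search(text)
    if match is None:
        return None
    return match.group(1)

start = 1   # index of the first cell of interest
step = 2    # how many cells to move down from there

first = extract(cell_texts, start, price_re)
second = extract(cell_texts, start + step, price_re)
print(first)
print(second)
```

For the table in the question you would use start = 34 and step = 15, which lands on all_tds[34] and all_tds[49].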