I added extra td tags to your example, and for some reason I am getting None when I do the following:
print all_tds[0].string
print all_tds[8].string

from BeautifulSoup import BeautifulSoup

doc = """
<html>
<head>
<title></title>
</head>
<body>
<table>
</table>

<table>
<tr><td>hello</td></tr>
<tr><td>world</td><td>goodbye</td></tr>
<tr>
<td width=1 height=0 bgcolor="#800000"><img src="/img/spacer.gif" width=1 height=0 alt="|"/></td>
<td align=right width=80><font size=2 face="New Times Roman,Times,Serif"> 48.884 </font></td>
<td width=1 height=0 bgcolor="#800000"><img src="/img/spacer.gif" width=1 height=0 alt="|"/></td>
<td align=right width=80><font size=2 face="New Times Roman,Times,Serif"> 49.950 </font></td>
<td width=1 height=0 bgcolor="#800000"><img src="/img/spacer.gif" width=1 height=0 alt="|"/></td>
<td align=right width=80><font size=2 face="New Times Roman,Times,Serif"> 69.322 </font></td>
<td width=1 height=0 bgcolor="#800000"><img src="/img/spacer.gif" width=1 height=0 alt="|"/></td>
<td align=right width=80><font size=2 face="New Times Roman,Times,Serif"> 99.740 </font></td>
<td width=1 height=0 bgcolor="#800000"><img src="/img/spacer.gif" width=1 height=0 alt="|"/></td>
</tr>
</table>
</body>
</html>
"""

soup = BeautifulSoup(doc)

tables = soup.findAll('table')
target_table = tables[1]

all_tds = target_table.findAll('td')
print all_tds[0].string
print all_tds[8].string

tds_str = all_tds[8].string
print tds_str

The output I am getting is the following:

>>>
hello
None
None

I am not sure why I am getting None for these lines:

print all_tds[0].string
print all_tds[8].string

On Sep 21, 3:38 am, 7stud <[EMAIL PROTECTED]> wrote:
> On Sep 20, 9:04 pm, crybaby <[EMAIL PROTECTED]> wrote:
>
> > I need to traverse an HTML page with a big table that has many rows and
> > columns. For example, how do I go to the 35th td tag and use a regex to
> > retrieve the content? After that is done, how do I move down 15 td tags
> > from the 35th tag (35+15) and use a regex to retrieve the content?
>
> 1) You can find your table using one of these methods:
>
> a)
> target_table = soup.find('table', id='car_parts')
>
> b)
> tables = soup.findAll('table')
> target_table = tables[2]
>
> The tables are put in a list in the order that they appear on the
> page.
>
> 2) You can get all the td's in the table using this statement:
>
> all_tds = target_table.findAll('td')
>
> 3) You can get the contents of the tags using these statements:
>
> print all_tds[34].string
> print all_tds[49].string
>
> Here is an example:
>
> from BeautifulSoup import BeautifulSoup
>
> doc = """
> <html>
> <head>
> <title></title>
> </head>
> <body>
> <table>
> </table>
>
> <table>
> <tr><td>hello</td></tr>
> <tr><td>world</td><td>goodbye</td></tr>
> </table>
> </body>
> </html>
> """
>
> soup = BeautifulSoup(doc)
>
> tables = soup.findAll('table')
> target_table = tables[1]
>
> all_tds = target_table.findAll('td')
> print all_tds[0].string
> print all_tds[2].string
>
> --output:--
> hello
> goodbye
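
A possible explanation for the None values, offered as an assumption rather than something confirmed in this thread: in BeautifulSoup 3, .string appears to be set only when a tag's single child is a text node. In the extended table the numbers sit inside a font tag within each td, and the spacer cells hold only an img tag, so .string on those td tags would be None. Something like all_tds[8].font.string may be closer to what is wanted, though that is untested here.

The sketch below shows the same idea (take the Nth td, then run a regex over its text) using only the Python 3 standard library html.parser instead of BeautifulSoup. The names TdTextCollector and td_texts are made up for this illustration, the HTML is a shortened copy of the table above, and nested tables are not handled.

```python
import re
from html.parser import HTMLParser


class TdTextCollector(HTMLParser):
    """Hypothetical helper: gather the text of each <td>, nested tags included."""

    def __init__(self):
        super().__init__()
        self.cells = []     # one string per <td>, in document order
        self._parts = None  # text pieces of the <td> being read, else None

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self._parts = []

    def handle_endtag(self, tag):
        if tag == "td" and self._parts is not None:
            # Join everything seen inside the cell and trim the whitespace.
            self.cells.append("".join(self._parts).strip())
            self._parts = None

    def handle_data(self, data):
        # Text inside <font> still arrives here while a <td> is open.
        if self._parts is not None:
            self._parts.append(data)


def td_texts(html):
    """Return the text of every <td> in html (assumes no nested tables)."""
    parser = TdTextCollector()
    parser.feed(html)
    parser.close()
    return parser.cells


doc = """
<table>
<tr><td>hello</td></tr>
<tr><td>world</td><td>goodbye</td></tr>
<tr>
<td width=1 height=0 bgcolor="#800000"><img src="/img/spacer.gif" width=1 height=0 alt="|"/></td>
<td align=right width=80><font size=2 face="New Times Roman,Times,Serif"> 48.884 </font></td>
<td width=1 height=0 bgcolor="#800000"><img src="/img/spacer.gif" width=1 height=0 alt="|"/></td>
<td align=right width=80><font size=2 face="New Times Roman,Times,Serif"> 49.950 </font></td>
</tr>
</table>
"""

cells = td_texts(doc)

# Index 3 is a spacer cell with no text; index 4 holds the first number.
match = re.search(r"\d+\.\d+", cells[4])
print(cells[0])        # hello
print(match.group(0))  # 48.884
```

Collecting the text per cell first and indexing afterwards keeps the "35th td, then 15 further down" arithmetic as plain list indexing (cells[34], cells[49]), with empty strings standing in for the spacer cells.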