Your program is doing what you asked it to do.  It finds the
first table with class 'bp_ergebnis_tab_info'.  Then it ignores
that result.  Then it finds the first "td" item in the whole
document (the search starts from "soup", not from "table"),
and prints the contents of that.  Then it exits.  What did
you want it to do?

   Try this.  It prints out the TD items on each
row of the table, in order.

import urllib2
from BeautifulSoup import BeautifulSoup
page = urllib2.urlopen("http://www.schulministerium.nrw.de/BP/SchuleSuchen?action=799.601437941842&SchulAdresseMapDO=142323")
soup = BeautifulSoup(page)
table = soup.find('table' ,attrs={'class':'bp_ergebnis_tab_info'})
for row in table.findAll('tr') : # for all TR items (table rows)
    for td in row.findAll('td') : # for TD items in row
        text = td.renderContents().strip()
        print(text)
    print('-----') # mark end of row
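
For anyone who wants to try the row-by-row idea without fetching the
live page, here is a minimal sketch using only the Python 3 standard
library (html.parser) instead of BeautifulSoup.  The sample markup,
the class TableCellCollector and the function cells_by_row are all
made up for illustration; the real page's structure may differ, and
nested tables are not handled.

```python
from html.parser import HTMLParser

# Hypothetical sample markup; the structure of the real page is an assumption.
SAMPLE_HTML = """
<table class="other"><tr><td><strong>Schuldaten</strong></td></tr></table>
<table class="bp_ergebnis_tab_info">
  <tr><td>Name</td><td>Musterschule</td></tr>
  <tr><td>Ort</td><td>Musterstadt</td></tr>
</table>
"""


class TableCellCollector(HTMLParser):
    """Collect TD text, row by row, from tables carrying a given class."""

    def __init__(self, table_class):
        super().__init__()
        self.table_class = table_class
        self.in_table = False   # inside a table with the wanted class?
        self.in_cell = False    # inside a TD of such a table?
        self.rows = []          # one list of cell strings per TR

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            classes = (dict(attrs).get("class") or "").split()
            self.in_table = self.table_class in classes
        elif self.in_table and tag == "tr":
            self.rows.append([])
        elif self.in_table and tag == "td" and self.rows:
            self.in_cell = True
            self.rows[-1].append("")

    def handle_endtag(self, tag):
        if tag == "table":
            self.in_table = False
        elif tag == "td":
            self.in_cell = False

    def handle_data(self, data):
        # Text inside nested tags such as <strong> is still cell text.
        if self.in_cell:
            self.rows[-1][-1] += data


def cells_by_row(html, table_class):
    """Return a list of rows, each a list of stripped TD strings."""
    parser = TableCellCollector(table_class)
    parser.feed(html)
    parser.close()
    return [[cell.strip() for cell in row] for row in parser.rows]


rows = cells_by_row(SAMPLE_HTML, "bp_ergebnis_tab_info")
for row in rows:
    for text in row:
        print(text)
    print("-----")  # mark end of row
```

The point is the same as in the loop above: restrict the search to the
table you found, then walk its rows, rather than taking the first "td"
in the whole document.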

                                John Nagle

On 12/25/2010 9:58 AM, Martin Kaspar wrote:
Hello dear Community,
I am trying to get a scraper up and running, and I keep running into
problems.

When I try what I have learned so far, I only get:
<strong>Schuldaten</strong>

Here is the code that I used:

import urllib2
from BeautifulSoup import BeautifulSoup
page = urllib2.urlopen("http://www.schulministerium.nrw.de/BP/SchuleSuchen?action=799.601437941842&SchulAdresseMapDO=142323")
soup = BeautifulSoup(page)
table = soup.find('table' ,attrs={'class':'bp_ergebnis_tab_info'})
first_td = soup.find('td')
text = first_td.renderContents()
trimmed_text = text.strip()
print trimmed_text


I run it in the template at http://scraperwiki.com/scrapers/new/python

see the target: 
http://www.schulministerium.nrw.de/BP/SchuleSuchen?action=799.601437941842&SchulAdresseMapDO=142323

What have I gotten wrong?

Can anybody review the code?

Many thanks in advance

regards
matze

--
http://mail.python.org/mailman/listinfo/python-list
