Okay, so I copied your code (and just so you know, I am on a Mac right now and I am using PyDev in Eclipse), and I got these errors. Any idea what is up?

Traceback (most recent call last):
  File "/Users/Alex/Documents/workspace/beautifulSoup/src/firstExample.py", line 14, in <module>
    print list(get_defs("cheese"))
  File "/Users/Alex/Documents/workspace/beautifulSoup/src/firstExample.py", line 9, in get_defs
    dictionary.reference.com/search?q=%s' % term))
  File "/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/urllib.py", line 82, in urlopen
    return opener.open(url)
  File "/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/urllib.py", line 190, in open
    return getattr(self, name)(url)
  File "/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/urllib.py", line 325, in open_http
    h.endheaders()
  File "/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/httplib.py", line 856, in endheaders
    self._send_output()
  File "/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/httplib.py", line 728, in _send_output
    self.send(msg)
  File "/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/httplib.py", line 695, in send
    self.connect()
  File "/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/httplib.py", line 663, in connect
    socket.SOCK_STREAM):
IOError: [Errno socket error] (8, 'nodename nor servname provided, or not known')

Sorry if it is hard to read.


Jeff McNeil-2 wrote:
>
> Well, what about pulling that data out using Beautiful soup?
> If you know the table name and whatnot, try something like this:
>
> #!/usr/bin/python
>
> import urllib
> from BeautifulSoup import BeautifulSoup
>
>
> def get_defs(term):
>     soup = BeautifulSoup(urllib.urlopen(
>         'http://dictionary.reference.com/search?q=%s' % term))
>
>     for tabs in soup.findAll('table', {'class': 'luna-Ent'}):
>         yield tabs.findAll('td')[-1].contents[-1].string
>
> print list(get_defs("frog"))
>
> [EMAIL PROTECTED]:~$ python test.py
> [u'any tailless, stout-bodied amphibian of the order Anura, including
> the smooth, moist-skinned frog species that live in a damp or
> semiaquatic habitat and the warty, drier-skinned toad species that are
> mostly terrestrial as adults. ', u' ', u' ', u'a French person or a
> person of French descent. ', u'a small holder made of heavy material,
> placed in a bowl or vase to hold flower stems in position. ', u'a
> recessed panel on one of the larger faces of a brick or the like. ',
> u' ', u'to hunt and catch frogs. ', u'French or Frenchlike. ', u'an
> ornamental fastening for the front of a coat, consisting of a button
> and a loop through which it passes. ', u'a sheath suspended from a
> belt and supporting a scabbard. ', u'a device at the intersection of
> two tracks to permit the wheels and flanges on one track to cross or
> branch from the other. ', u'a triangular mass of elastic, horny
> substance in the middle of the sole of the foot of a horse or related
> animal. ']
>
> HTH,
>
> Jeff
>
> On Jun 27, 7:28 pm, Alexnb <[EMAIL PROTECTED]> wrote:
>> I have read that multiple times. It is hard to understand but it did help
>> a little. But I found a bit of a work-around for now which is not what I
>> ultimately want.
>> However, even when I can get to the page I want, let's say
>> "Http://dictionary.reference.com/browse/cheese", I look in Firebug, an
>> extension, and see the definition in JavaScript,
>>
>> <table class="luna-Ent">
>> <tbody>
>> <tr>
>> <td class="dn" valign="top">1.</td>
>> <td valign="top">the curd of milk separated from the whey and prepared in
>> many ways as a food. </td>
>>
>>
>>
>> Jeff McNeil-2 wrote:
>>
>> > the problem being that if I use code like this to get the HTML of that
>> > page in Python:
>>
>> > response = urllib2.urlopen("the website....")
>> > html = response.read()
>> > print html
>>
>> > then, I get a bunch of stuff, but it doesn't show me the code with the
>> > table that the definition is in. So I am asking: how do I access this
>> > JavaScript? Also, could someone point me to a better reference than the
>> > last one, whether it be a book or anything, because that one really
>> > doesn't tell me much.
>>
>> > I stumbled across this a while back:
>> >http://www.voidspace.org.uk/python/articles/urllib2.shtml.
>> > It covers quite a bit. The urllib2 module is pretty straightforward
>> > once you've used it a few times. Some of the class naming and whatnot
>> > takes a bit of getting used to (I found that to be the most confusing
>> > bit).
>>
>> > On Jun 27, 1:41 pm, Alexnb <[EMAIL PROTECTED]> wrote:
>> >> Okay, I tried to follow that, and it is kinda hard. But since you
>> >> obviously know what you are doing, where did you learn this? Or where
>> >> can I learn this?
>> >>
>> >> Maric Michaud wrote:
>> >>
>> >> > On Friday 27 June 2008 10:43:06, Alexnb wrote:
>> >> >> I have never used urllib or urllib2. I really have looked online
>> >> >> for help on this issue, and on mailing lists, but I can't figure
>> >> >> out my problem because people haven't been helping me, which is
>> >> >> why I am here! :].
>> >> >> Okay, so basically I want to be able to submit a word to
>> >> >> dictionary.com and then get the definitions. However, to start off
>> >> >> learning urllib2, I just want to do a simple Google search. Before
>> >> >> you get mad, what I have found on urllib2 hasn't helped me. Anyway,
>> >> >> how would you go about doing this? No, I did not post the HTML, but
>> >> >> if you want, right-click in your browser and hit "view source" on
>> >> >> the Google homepage. Basically what I want to know is how to submit
>> >> >> the values (the search term) and then search for that value. Here's
>> >> >> what I know:
>> >> >>
>> >> >> import urllib2
>> >> >> response = urllib2.urlopen("http://www.google.com/")
>> >> >> html = response.read()
>> >> >> print html
>> >> >>
>> >> >> Now I know that all this does is print the source, but that's about
>> >> >> all I know. I know it may be a lot to ask to have someone show/help
>> >> >> me, but I really would appreciate it.
>> >> >
>> >> > This example is for Google. Of course, using pygoogle is easier in
>> >> > this case, but this is a valid example for the general case:
>> >> >
>> >> > >>>[207]: import urllib, urllib2
>> >> >
>> >> > You need to trick the server with an imaginary User-Agent.
>> >> >
>> >> > >>>[208]: def google_search(terms) :
>> >> >     return urllib2.urlopen(urllib2.Request("http://www.google.com/search?"
>> >> >         + urllib.urlencode({'hl':'fr', 'q':terms}),
>> >> >         headers={'User-Agent':'MyNav 1.0 (compatible; MSIE 6.0; Linux'})
>> >> >         ).read()
>> >> >    .....:
>> >> >
>> >> > >>>[212]: res = google_search("python & co")
>> >> >
>> >> > Now you have the whole HTML response; you'll have to parse it to
>> >> > recover the data. A quick & dirty try on the Google response page:
>> >> >
>> >> > >>>[213]: import re
>> >> >
>> >> > >>>[214]: [ re.sub('<.+?>', '', e) for e in re.findall('<h2 class=r>.*?</h2>', res) ]
>> >> > ...[229]:
>> >> > ['Python Gallery',
>> >> >  'Coffret Monty Python And Co 3 DVD : La Premi\xe8re folie des Monty ...',
>> >> >  'Re: os x, panther, python & co: msg#00041',
>> >> >  'Re: os x, panther, python & co: msg#00040',
>> >> >  'Cardiff Web Site Design, Professional web site design services ...',
>> >> >  'Python Properties',
>> >> >  'Frees < Programs < Python < Bin-Co',
>> >> >  'Torb: an interface between Tcl and CORBA',
>> >> >  'Royal Python Morphs',
>> >> >  'Python & Co']
>> >> >
>> >> > --
>> >> > _____________
>> >> >
>> >> > Maric Michaud
>> >> > --
>> >> >http://mail.python.org/mailman/listinfo/python-list
>> >>
>> >> --
>> >> View this message in
>> >> context:http://www.nabble.com/using-urllib2-tp18150669p18160312.html
>> >> Sent from the Python - python-list mailing list archive at Nabble.com.
>>
>> --
>> View this message in
>> context:http://www.nabble.com/using-urllib2-tp18150669p18165634.html
>
>

--
View this message in context: http://www.nabble.com/using-urllib2-tp18150669p18166785.html
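
A possible reading of the traceback at the top of this message, offered as an assumption rather than a confirmed diagnosis: the failing line 9 begins with dictionary.reference.com/search?..., which suggests the URL string was copied with the e-mail's line wrap still inside it (right after 'http://'), so the host name handed to the resolver was not a valid one. Below is a minimal sketch of building the URL as one unbroken string. It uses Python 3's urllib.parse rather than the Python 2.5 urllib shown in the thread, and BASE_URL, build_search_url, and has_whitespace are hypothetical helper names chosen for illustration, not part of any library.

```python
from urllib.parse import urlencode

# Search endpoint taken from the quoted code, written as ONE unbroken string.
BASE_URL = "http://dictionary.reference.com/search"


def build_search_url(term):
    """Return the search URL for `term`, with the term safely URL-encoded."""
    return BASE_URL + "?" + urlencode({"q": term})


def has_whitespace(url):
    """Return True if the URL contains any whitespace (space, tab, newline).

    Whitespace inside a URL is a common leftover of copying code out of a
    line-wrapped e-mail.
    """
    return any(ch.isspace() for ch in url)


if __name__ == "__main__":
    good = build_search_url("cheese")
    # Hypothetical example of a URL copied with the wrap still inside it.
    bad = "http:// dictionary.reference.com/search?q=cheese"
    print(good, has_whitespace(good))
    print(bad, has_whitespace(bad))
```

If the check reports whitespace in the URL your script actually passes to urlopen, joining the string back onto one line would be the first thing to try before looking at network or DNS settings.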