I have read that multiple times. It is hard to understand, but it did help a little. I have found a bit of a workaround for now, but it is not what I ultimately want. However, even when I can get to the page I want, let's say "http://dictionary.reference.com/browse/cheese", I look in Firebug, an extension, and see the definition in javascript:
<table class="luna-Ent"> <tbody> <tr> <td class="dn" valign="top">1.</td> <td valign="top">the curd of milk separated from the whey and prepared in many ways as a food. </td>

the problem being that if I use code like this to get the HTML of that page in Python:

    response = urllib2.urlopen("the website....")
    html = response.read()
    print html

then I get a bunch of stuff, but it doesn't show me the code with the table that the definition is in. So I am asking: how do I access this javascript? Also, could someone point me to a better reference than the last one, whether it be a book or anything? That one really doesn't tell me much.

Jeff McNeil-2 wrote:
> I stumbled across this a while back:
> http://www.voidspace.org.uk/python/articles/urllib2.shtml.
> It covers quite a bit. The urllib2 module is pretty straightforward
> once you've used it a few times. Some of the class naming and whatnot
> takes a bit of getting used to (I found that to be the most confusing
> bit).
>
> On Jun 27, 1:41 pm, Alexnb <[EMAIL PROTECTED]> wrote:
>> Okay, I tried to follow that, and it is kinda hard. But since you
>> obviously know what you are doing, where did you learn this? Or where
>> can I learn this?
>>
>> Maric Michaud wrote:
>> > On Friday 27 June 2008 10:43:06, Alexnb wrote:
>> >> I have never used urllib or urllib2. I really have looked online
>> >> for help on this issue, and on mailing lists, but I can't figure out
>> >> my problem because people haven't been helping me, which is why I am
>> >> here! :].
>> >> Okay, so basically I want to be able to submit a word to
>> >> dictionary.com and then get the definitions. However, to start off
>> >> learning urllib2, I just want to do a simple Google search. Before
>> >> you get mad, what I have found on urllib2 hasn't helped me. Anyway,
>> >> how would you go about doing this? No, I did not post the HTML, but
>> >> if you want, right-click in your browser and hit view source on the
>> >> Google homepage. Basically what I want to know is how to submit the
>> >> values (the search term) and then search for that value. Here's what
>> >> I know:
>> >>
>> >> import urllib2
>> >> response = urllib2.urlopen("http://www.google.com/")
>> >> html = response.read()
>> >> print html
>> >>
>> >> Now I know that all this does is print the source, but that's about
>> >> all I know. I know it may be a lot to ask to have someone show/help
>> >> me, but I really would appreciate it.
>> >
>> > This example is for Google. Of course, using pygoogle is easier in
>> > this case, but this is a valid example for the general case:
>> >
>> > >>>[207]: import urllib, urllib2
>> >
>> > You need to trick the server with an imaginary User-Agent.
>> >
>> > >>>[208]: def google_search(terms) :
>> >     return urllib2.urlopen(urllib2.Request("http://www.google.com/search?" +
>> >             urllib.urlencode({'hl':'fr', 'q':terms}),
>> >             headers={'User-Agent':'MyNav 1.0 (compatible; MSIE 6.0; Linux'})
>> >         ).read()
>> >    .....:
>> >
>> > >>>[212]: res = google_search("python & co")
>> >
>> > Now you have the whole HTML response; you'll have to parse it to
>> > recover the data. A quick & dirty try on the Google response page:
>> >
>> > >>>[213]: import re
>> >
>> > >>>[214]: [ re.sub('<.+?>', '', e) for e in re.findall('<h2 class=r>.*?</h2>', res) ]
>> > ...[229]:
>> > ['Python Gallery',
>> > 'Coffret Monty Python And Co 3 DVD : La Premi\xe8re folie des Monty ...',
>> > 'Re: os x, panther, python & co: msg#00041',
>> > 'Re: os x, panther, python & co: msg#00040',
>> > 'Cardiff Web Site Design, Professional web site design services ...',
>> > 'Python Properties',
>> > 'Frees < Programs < Python < Bin-Co',
>> > 'Torb: an interface between Tcl and CORBA',
>> > 'Royal Python Morphs',
>> > 'Python & Co']
>> >
>> > --
>> > _____________
>> >
>> > Maric Michaud
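For the dictionary.com lookup asked about at the top of this message, here is a minimal sketch of one possible approach, not a tested recipe for the live site. It is written for Python 3, where urllib2's Request and urlopen live in urllib.request. It sends a browser-like User-Agent, as Maric's example does, and then collects the text of the cells in the luna-Ent table. The class names luna-Ent and dn come from the Firebug snippet in this message; the User-Agent string, the URL pattern built from the "cheese" example, and the helper names build_request, extract_definitions and lookup are assumptions made for illustration. If the site builds that table with JavaScript after the page loads, the table will not be in the fetched HTML and this will return an empty list.

```python
# Python 3 sketch; in Python 2 the same calls live in urllib and urllib2.
from html.parser import HTMLParser
from urllib.parse import quote
from urllib.request import Request, urlopen

# Assumption: URL pattern inferred from the "cheese" example in the post.
BASE_URL = "http://dictionary.reference.com/browse/"


def build_request(word):
    """Build a request that carries a browser-like User-Agent header."""
    return Request(
        BASE_URL + quote(word),
        # Assumption: an arbitrary illustrative User-Agent string.
        headers={"User-Agent": "Mozilla/5.0 (compatible; example-script)"},
    )


class DefinitionParser(HTMLParser):
    """Collect the text of each <td> inside <table class="luna-Ent">,
    skipping the numbering cells marked class="dn"."""

    def __init__(self):
        super().__init__()
        self.definitions = []
        self._table_depth = 0  # > 0 while inside a luna-Ent table
        self._in_cell = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "table":
            # Start counting at a luna-Ent table; track nested tables too.
            if self._table_depth or attrs.get("class") == "luna-Ent":
                self._table_depth += 1
        elif tag == "td" and self._table_depth and attrs.get("class") != "dn":
            self._in_cell = True
            self._buffer = []

    def handle_endtag(self, tag):
        if tag == "td" and self._in_cell:
            # Join the cell's text pieces and collapse runs of whitespace.
            text = " ".join("".join(self._buffer).split())
            if text:
                self.definitions.append(text)
            self._in_cell = False
        elif tag == "table" and self._table_depth:
            self._table_depth -= 1

    def handle_data(self, data):
        if self._in_cell:
            self._buffer.append(data)


def extract_definitions(html):
    """Return the definition strings found in an HTML string."""
    parser = DefinitionParser()
    parser.feed(html)
    parser.close()
    return parser.definitions


def lookup(word):
    """Fetch the page for a word and return its definitions (needs network)."""
    with urlopen(build_request(word)) as response:
        html = response.read().decode("utf-8", errors="replace")
    return extract_definitions(html)
```

Calling lookup("cheese") needs network access; extract_definitions can be tried offline by passing it the Firebug snippet as a string.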
--
View this message in context: http://www.nabble.com/using-urllib2-tp18150669p18165634.html
Sent from the Python - python-list mailing list archive at Nabble.com.

--
http://mail.python.org/mailman/listinfo/python-list