>> Hi
>>
>> I have recently finished reading "Starting out with Python" and I
>> really want to do some web scraping. Please kindly advise where I can
>> get more information about BeautifulSoup. It seems that Documentation
>> is too hard for me.
>>
>> Furthermore, I have tried to scrap this site but it seems that there
>> is an error (<http.client.HTTPResponse object at 0x02C09F90>). Please
>> advise what I should do in order to overcome this.
>>
>>
>> from bs4 import BeautifulSoup
>> import urllib.request
>>
>> HKFile = urllib.request.urlopen("https://bochk.etnet.com.hk/content/bochkweb/tc/quote_transaction_daily_history.php?code=2388")
>> HKHtml = HKFile.read()
>> HKFile.close()
>>
>> print(HKFile)

<http.client.HTTPResponse object at 0x02C09F90> is not an error. It is
just how Python displays the response object that urlopen() returned,
and that object is what print(HKFile) prints. The page itself is in
HKHtml, as bytes.

If you want to print the contents of the page, change

    print(HKFile)

to

    print(HKHtml.decode("some-encoding"))

where some-encoding is the encoding the website uses. These days utf-8
is the most likely. If the output looks garbled, check the charset named
in the page's Content-Type header or in its <meta> tag.

If you want a tutorial on web scraping, though not one about Beautiful
Soup, try:
http://doc.scrapy.org/en/latest/intro/tutorial.html
which is about using scrapy, a set of useful web scraping tools.

The scrapy wiki is also useful:
https://github.com/scrapy/scrapy/wiki
and there are many video tutorials available if you like that sort of
thing. Just google for "python scrapy tutorial".

Laura
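
To make the decode step concrete, here is a minimal sketch using only the standard library. The function names decode_body and fetch_text are made up for this example, and falling back to utf-8 when the server declares no charset is an assumption, not something known about this particular site.

```python
import urllib.request
from typing import Optional


def decode_body(raw: bytes, declared_charset: Optional[str] = None) -> str:
    """Turn the bytes returned by read() into text.

    Assumption: fall back to utf-8 when no charset is declared or when
    Python does not recognise the declared name. Undecodable bytes are
    replaced rather than raising an exception.
    """
    encoding = declared_charset or "utf-8"
    try:
        return raw.decode(encoding, errors="replace")
    except LookupError:
        # The declared charset is not a codec Python knows about.
        return raw.decode("utf-8", errors="replace")


def fetch_text(url: str) -> str:
    """Download a page and return its contents as a str."""
    # The with-block closes the response, so no explicit close() is needed.
    with urllib.request.urlopen(url) as response:
        raw = response.read()
        # Charset from the Content-Type header, or None if it is absent.
        charset = response.headers.get_content_charset()
    return decode_body(raw, charset)
```

Calling print(fetch_text(url)) with the URL from the original post should then print the HTML text instead of the response object. The returned string can also be handed to BeautifulSoup(text, "html.parser") if you do want to carry on with bs4.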