Re: Raw Data from Website

Steven D'Aprano Wed, 24 Aug 2016 01:23:41 -0700

On Wednesday 24 August 2016 17:04, Bob Martin wrote:

> in 764257 20160823 081439 Steven D'Aprano
> <[email protected]> wrote:


>>There are many tutorials and examples of "screen scraping" or "web scraping"
>>on the internet -- try reading them. It's not something I personally have any
>>experience with, but I expect that the process goes something like this:
>>
>>- connect to the website;
>>- download the particular page you want;
>>- grab the data that you care about;
>>- remove HTML tags and extract just the bits needed;
>>- write them to a CSV file.
> 
> wget does the hard part.


I don't think so. Just downloading a web page is easy. Parsing potentially 
invalid HTML (or worse, handling content that is assembled in the browser by 
JavaScript) to extract the actual data you care about is much harder.
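
To give a feel for the parsing step, here is a minimal sketch using only the
standard library (html.parser and csv). The names TableCellParser,
html_table_to_csv and SAMPLE are made up for illustration, and it assumes the
data you want sits in a plain HTML table that is already present in the
downloaded page. It tolerates missing closing tags, but it does nothing for
content that JavaScript builds in the browser.

```python
import csv
import io
from html.parser import HTMLParser


class TableCellParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row.

    Hypothetical helper for illustration; it tolerates missing </td> and
    </tr> tags by closing the open cell or row when the next one starts.
    """

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._flush_row()
            self._row = []
        elif tag in ("td", "th"):
            self._flush_cell()
            if self._row is None:   # cell with no enclosing <tr>
                self._row = []
            self._cell = []

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._flush_cell()
        elif tag in ("tr", "table"):
            self._flush_row()

    def handle_data(self, data):
        # Text outside a cell is ignored; tags inside a cell are dropped
        # simply because only the text between them reaches this method.
        if self._cell is not None:
            self._cell.append(data)

    def _flush_cell(self):
        if self._cell is not None:
            # Join the fragments and collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None

    def _flush_row(self):
        self._flush_cell()
        if self._row:
            self.rows.append(self._row)
        self._row = None

    def close(self):
        super().close()
        self._flush_row()   # pick up a row left open at end of input


def html_table_to_csv(html):
    """Return the table cells found in `html` as CSV text."""
    parser = TableCellParser()
    parser.feed(html)
    parser.close()
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(parser.rows)
    return out.getvalue()


# Deliberately sloppy markup: most closing tags are missing.
SAMPLE = (
    "<table><tr><th>Name<th>Price</tr>"
    "<tr><td>Widget<td>1,50"
    "<tr><td>Gadget</td><td>2</td></table>"
)

if __name__ == "__main__":
    print(html_table_to_csv(SAMPLE), end="")
```

Run as a script, this prints three CSV lines, with the "1,50" cell quoted
because it contains a comma. A real page will usually need more than this:
picking the right table out of several, handling nested tables, and so on,
which is where most of the effort tends to go.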


-- 
Steve

-- 
https://mail.python.org/mailman/listinfo/python-list