On 02/22/2013 12:09 AM, qoresu...@gmail.com wrote:
Initially I was just trying the html, but later when I attempted more 
complicated sites that weren't my own I noticed that large bulks of the site 
were lost in the process. The urllib code essentially looks like what I was 
trying but it didn't work as I had expected.

To be more specific, after I got it working for my own little page, I attempted 
to take it further and get all the lessons from Learn Python The Hard Way. When 
I tried the same method on the first intro page to see if I was even getting it 
right, the HTML code was all there, but upon opening it I noticed the format was 
all wrong: the background colors were off, and images, etc. were all missing.

So how are you opening this HTML? In a text editor that somehow added colors? Or in a browser? To render a non-trivial page, a browser may need many files besides the HTML. Colors, for example, can be specified inline, in the header, or in an external CSS file. If the page was designed to use an external CSS file, and that file is missing or not in the right location, then the browser is going to get the colors wrong.
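To see which extra files a saved page expects, something like the following rough sketch might help. It uses only the standard library's html.parser; the names ResourceFinder and find_resources and the sample markup are made up for illustration, and the set of tags checked is an assumption that will not cover everything a real page can reference (for instance, images pulled in from inside a stylesheet).

```python
from html.parser import HTMLParser


class ResourceFinder(HTMLParser):
    """Collect the URLs of external files a page asks the browser to load."""

    # (tag, attribute) pairs that commonly point at extra files.
    # This list is an assumption and is far from exhaustive.
    WANTED = {("link", "href"), ("img", "src"), ("script", "src")}

    def __init__(self):
        super().__init__()
        self.resources = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None
        for name, value in attrs:
            if (tag, name) in self.WANTED and value:
                self.resources.append(value)


def find_resources(html_text):
    """Return the referenced URLs in the order they appear in the page."""
    finder = ResourceFinder()
    finder.feed(html_text)
    return finder.resources


# Hypothetical page, not taken from any real site
sample = """<html><head>
<link rel="stylesheet" href="css/style.css">
</head><body>
<img src="http://example.com/logo.png">
</body></html>"""

print(find_resources(sample))
```

Every URL it prints is a file the browser will look for in addition to the HTML itself.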

Further, if the location (URL) of such a file is relative, then you can create a similar directory structure locally, and the browser will find it. But if it's absolute, then the browser is going to try to go out to the web to fetch it. If it succeeds, that masks the fact that you haven't downloaded the "whole web site."
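As a minimal sketch of the relative-versus-absolute distinction, the standard library's urllib.parse can tell the two apart and show where a relative reference would point on the original site. The helper names is_absolute and resolve, and the example.com addresses, are hypothetical.

```python
from urllib.parse import urljoin, urlparse


def is_absolute(url):
    """True if the reference names a host, so the browser goes out to the web."""
    # Covers "http://host/..." as well as protocol-relative "//host/..." forms.
    return bool(urlparse(url).netloc)


def resolve(page_url, ref):
    """Return the full URL a browser would fetch for ref found on page_url."""
    return urljoin(page_url, ref)


page = "http://example.com/book/intro.html"  # hypothetical page address

for ref in ("css/style.css", "http://example.com/logo.png"):
    kind = "absolute" if is_absolute(ref) else "relative"
    print(kind, ref, "->", resolve(page, ref))
```

A relative reference such as css/style.css is the kind you can satisfy by saving the file into a matching css directory next to your copy of the page. Note that a reference starting with a single slash has no host either, but it is resolved against the site root rather than the page's directory.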

The same is true for other external references, such as images and scripts. If the page contains any absolute URLs, it may be impossible to host a copy elsewhere without rewriting them.

--
DaveA
--
http://mail.python.org/mailman/listinfo/python-list
