everything is obvious after it's pointed out.

On 03/24/12 11:23, Dave wrote:
> On 24/03/2012 15:53, [email protected] wrote:
>> On Sat, 24 Mar 2012 10:26:48 -0000, Dave said:
>
>>> Don't the -e robots=off, --page-requisites and -H wget directives
>>> enable one to collect all the necessary files that are called from
>>> a page?
>
>> No, not *all* the files, for the same reason that if you visit a page
>> with NoScript enabled, you may end up with missing content and/or big
>> open spaces on the page.
>
>> Consider a page that has JavaScript on it:
>
>>     todaysfile = "http://www.news-site.com/" + date_as_string;
>>     document.write('<script src="' + todaysfile + '"><\/script>');
>
>> Unless you interpret the JavaScript, you don't know what URL will get
>> loaded, because yesterday and tomorrow will each get a different URL.
>> So basically, if you try to pull it down with wget or similar, you
>> will miss *all* the stuff that's pulled down via JavaScript (and
>> probably via CSS as well - does wget know how to follow CSS
>> references?). On many modern web designs, this ends up being the vast
>> majority of the content.
>
> Thanks Valdis,
>
> Some things are pretty obvious when pointed out.
>
> Dave
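For anyone who wants to see the gap concretely: the grab being discussed
is something like

    wget -e robots=off --page-requisites -H http://www.news-site.com/

which mirrors everything referenced statically in the HTML, but the
date-keyed URL in Valdis's example only exists once the script runs, so
you would have to reproduce the script's logic yourself. A minimal
sketch in Python - the host is the placeholder from the example above,
and the YYYY-MM-DD date format is my assumption, since date_as_string
was never defined:

    import datetime
    import urllib.request

    # Rebuild the URL the page's JavaScript would have constructed at
    # runtime. Both the host and the date format are assumptions.
    date_as_string = datetime.date.today().strftime("%Y-%m-%d")
    todaysfile = "http://www.news-site.com/" + date_as_string

    # Fetch the one file a static wget crawl would have missed.
    with urllib.request.urlopen(todaysfile) as response:
        data = response.read()

Generalizing this to arbitrary pages means actually executing the
JavaScript, which is why people reach for a scriptable browser engine
rather than a static crawler.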
_______________________________________________ Full-Disclosure - We believe in it. Charter: http://lists.grok.org.uk/full-disclosure-charter.html Hosted and sponsored by Secunia - http://secunia.com/
