Randall R Schulz wrote:
> Lowell,
>
> What's in your "~/.wgetrc" file? If it contains this:
>
>     robots = off
>
> Then wget will not respect a "robots.txt" file on the host from which
> it is retrieving files.
>
> Before I learned of this option (accessible _only_ via this directive
> in the .wgetrc file)

Or, on the command line: -erobots=off :-)

Whilst this does control whether wget downloads robots.txt, a quick test
confirms that even when it does get robots.txt, it still wanders into
cgi-bin.

I'd suggest taking this to the wget list, except wget is currently
maintainer-less and, it appears, bitrotted.

Max.
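
For reference, a minimal sketch of the two forms discussed above, plus one possible workaround for the cgi-bin problem. The host http://example.com/ is a placeholder, not one from this thread, and the use of -X (--exclude-directories) is my assumption about a way to keep a recursive fetch out of cgi-bin; it was not tested against the behaviour reported here. The command is only assembled and printed, not run.

```shell
# Per-user setting, as in the quoted message: one line in ~/.wgetrc
#   robots = off

# One-off equivalent: -e executes a .wgetrc-style command for this run only.
# Placeholder URL (assumption, not taken from the thread).
url="http://example.com/"

# -r recurses; -X /cgi-bin asks wget to skip that directory explicitly,
# independent of whatever robots.txt says (assumed workaround, untested here).
cmd="wget -e robots=off -r -X /cgi-bin $url"

# Print the command for inspection instead of starting a crawl.
echo "$cmd"
```

To actually run it, paste the printed command into the shell with a real URL in place of the placeholder.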