On 2/13/07, Jm lists <[EMAIL PROTECTED]> wrote:
I want to get all the files on some a webdir. For example:
What do you mean by "some a webdir"?
http://www.foo.com/bar/ But that dir has a default page "index.htm". So when I accessed the url I only got the default page. Can you tell me is there a way to fetch all the files in that dir? Thanks a lot.
Are you asking, is there a way to ask a remote webserver for something? That would be a question about webservers, not about Perl, wouldn't it? And the answer, probably, is that webservers don't offer an easy way to let remote users download their entire contents, for much the same reason that the all-you-can-eat restaurant doesn't deliver the kitchen's entire output to your table when you arrive.

If you're looking to slurp down an entire remote site, or even a sizable portion of one, well, that's just rude. You're abusing the hospitality of the information provider. Unless you have a good reason, of course; you might be the next Google, for all I know. Let's say you are.

But are you, like Google, dealing with more than a dozen other sites? It's not practical for Google to contact the owners of the information in millions of separate cases; but if you've only got a few (or just one?) site on your list, there's no way around it: The only polite way to get the information is to ask for it.

Why is it polite? Remember, you're consuming some of the site's outgoing bandwidth, and they pay for that. If you ask nicely, the information's owner may send you a CD and save you *both* time and trouble. (Or, maybe, the information's owner may not want you to have the entire fileset; in which case taking it anyway is even more rude.)

For the sake of argument, then, let's say you've gotten this far and you still need a program that will fetch things for you. Let's say you've even read the Web Robots FAQ, so you know that a good robot won't overload the server, for example:

    http://www.robotstxt.org/wc/faq.html

Sure; Perl can do web robots. Have you looked on CPAN?

    http://search.cpan.org/search?query=RobotRules&mode=all
    http://search.cpan.org

Hope this helps!

--Tom Phoenix
Stonehenge Perl Training
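
To make the RobotRules pointer a little more concrete, here is a minimal sketch of checking URLs against a site's robots.txt before fetching anything. It assumes the WWW::RobotRules module from CPAN is installed. The robot name "ExampleFetcher/0.1", the robots.txt contents, and the /private/ path are all hypothetical and used only for illustration. A real robot would download robots.txt from the site itself and would also pause between requests; LWP::RobotUA is one CPAN module meant to handle both.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use WWW::RobotRules;

# Hypothetical robot name; choose one that identifies your own program.
my $agent = 'ExampleFetcher/0.1';
my $rules = WWW::RobotRules->new($agent);

# Hypothetical robots.txt contents. A real robot would fetch this
# file from the site rather than hard-coding it.
my $robots_txt = <<'END';
User-agent: *
Disallow: /private/
END

# parse() takes the URL the robots.txt came from, plus its text.
$rules->parse('http://www.foo.com/robots.txt', $robots_txt);

# Return only the URLs that the parsed robots.txt permits us to fetch.
sub permitted {
    my @urls = @_;
    return grep { $rules->allowed($_) } @urls;
}

# Candidate URLs; the second one is made up to show a refusal.
my @wanted = (
    'http://www.foo.com/bar/index.htm',
    'http://www.foo.com/private/notes.htm',
);

print "$_\n" for permitted(@wanted);
```

With the rules shown, only the first URL should pass the check. Note that this does nothing to discover "all the files" in a directory: a robot can only learn about URLs that some page links to, and it should fetch only the ones the site allows.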