On 2/13/07, Jm lists <[EMAIL PROTECTED]> wrote:

> I want to get all the files on some a webdir. For example:

What do you mean by "some a webdir"?

> http://www.foo.com/bar/
>
> But that dir has a default page "index.htm". So when I accessed the url
> I only got the default page.
>
> Can you tell me is there a way to fetch all the files in that dir? Thanks a lot.

Are you asking, is there a way to ask a remote webserver for
something? That would be a question about webservers, not about Perl,
wouldn't it? And the answer, probably, is that webservers don't offer
an easy way to let remote users download their entire contents, for
much the same reason that the all-you-can-eat restaurant doesn't
deliver the kitchen's entire output to your table when you arrive.

If you're looking to slurp down an entire remote site, or even a
sizable portion of one, well, that's just rude. You're abusing the
hospitality of the information provider. Unless you have a good
reason, of course; you might be the next Google, for all I know. Let's
say you are.

But are you, like Google, dealing with more than a dozen other sites?
It's not practical for Google to contact the owners of the information
in millions of separate cases; but if you've only got a few sites (or
just one?) on your list, there's no way around it: The only polite way
to get the information is to ask for it.

Why is it polite? Remember, you're consuming some of the site's
outgoing bandwidth, and they pay for that. If you ask nicely, the
information's owner may send you a CD and save you *both* time and
trouble. (Or, maybe, the information's owner may not want you to have
the entire fileset; in which case taking it anyway is even more rude.)

For the sake of argument, then, let's say you've gotten this far and
you still need a program that will fetch things for you. Let's say
you've even read the Web Robots FAQ, so you know, for example, that a
good robot won't overload the server:

   http://www.robotstxt.org/wc/faq.html

Sure; Perl can do web robots. Have you looked on CPAN?

   http://search.cpan.org/search?query=RobotRules&mode=all
   http://search.cpan.org
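
As a rough illustration of what that search turns up, here is a minimal
sketch using the WWW::RobotRules module from CPAN to check whether a
robot may fetch a given URL. The robot name "ExampleBot/0.1", the
robots.txt contents, and the /private/ path are all made up for the
example; a real robot would fetch robots.txt from the site itself.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use WWW::RobotRules;    # from CPAN, not part of core Perl

# Hypothetical robot name; choose one that identifies you to the site owner.
my $rules = WWW::RobotRules->new('ExampleBot/0.1');

# Invented sample rules, so that this sketch runs without touching the
# network. Normally this text would be fetched from $robots_url.
my $robots_url = 'http://www.foo.com/robots.txt';
my $robots_txt = <<'END';
User-agent: *
Disallow: /private/
END

# Record the rules for the host named in $robots_url.
$rules->parse($robots_url, $robots_txt);

# Ask before fetching: allowed() is true when the rules permit the URL.
for my $url ('http://www.foo.com/bar/', 'http://www.foo.com/private/x.htm') {
    my $verdict = $rules->allowed($url) ? 'allowed' : 'disallowed';
    print "$url: $verdict\n";
}
```

With these sample rules, the first URL should come back allowed and the
second disallowed. If you would rather not do the checking by hand,
LWP::RobotUA (also on CPAN) consults robots.txt for you and waits
between requests to the same server.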

Hope this helps!

--Tom Phoenix
Stonehenge Perl Training


