Re: [Dirvish] [administrivia] taming web spiders, particularly Baidu?

2012-07-23 Thread Loren M. Lang
On 7/23/2012 11:31 AM, Keith Lofstrom wrote:
> This is not about dirvish, but about the website. Perhaps some
> of you sysadmins can help.
>
> You may occasionally see the dirvish.org website stop responding
> to web requests.
>
> dirvish.org is running on my virtual machine at rimuhosting in
> Da

[Dirvish] [administrivia] taming web spiders, particularly Baidu?

2012-07-23 Thread f-dirvish
It -should- be using HEAD to see if the timestamps have changed---not downloading everything repeatedly! I found that Baidu was so badly-behaved on a site I run that I just barred it completely, by telling Apache not to serve anything with that user-agent string. Good riddance. (The site is for
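For readers who want to see the shape of a user-agent block, here is a minimal sketch in Python rather than Apache configuration. It is not the poster's actual setup: it swaps in a small WSGI wrapper, and the names `block_agents`, `BLOCKED_AGENTS`, and `hello_app` are hypothetical. It also assumes the crawler identifies itself with "Baiduspider" somewhere in its User-Agent header.

```python
# Sketch only: refuse requests by User-Agent in a WSGI wrapper.
# Assumption: the crawler's User-Agent contains "Baiduspider".
BLOCKED_AGENTS = ("baiduspider",)


def block_agents(app, blocked=BLOCKED_AGENTS):
    """Wrap a WSGI app so that matching user agents get a 403."""
    def middleware(environ, start_response):
        agent = environ.get("HTTP_USER_AGENT", "").lower()
        if any(name in agent for name in blocked):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        # Everyone else is passed through to the wrapped application.
        return app(environ, start_response)
    return middleware


def hello_app(environ, start_response):
    """Hypothetical stand-in for the real site."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello\n"]


def status_for(agent):
    """Call the wrapped app with a fake request and return the status line."""
    seen = []
    wrapped = block_agents(hello_app)
    wrapped({"HTTP_USER_AGENT": agent}, lambda status, headers: seen.append(status))
    return seen[0]
```

Matching is a case-insensitive substring test, so it only stops crawlers that announce themselves honestly.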

Re: [Dirvish] [administrivia] taming web spiders, particularly Baidu?

2012-07-23 Thread Jenny Hopkins
On 23 July 2012 20:04, James Stanley wrote:
> Hi Keith,
>
> I've not noticed any problems with the dirvish site (though I don't
> visit it particularly often), but you may find that you have better luck
> using robots.txt to prevent all robots from indexing the large videos,
> but still allow them

Re: [Dirvish] [administrivia] taming web spiders, particularly Baidu?

2012-07-23 Thread James Stanley
Hi Keith,

I've not noticed any problems with the dirvish site (though I don't visit it particularly often), but you may find that you have better luck using robots.txt to prevent all robots from indexing the large videos, but still allow them to index text content. Something like:

User-agent: *
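The rules in the message above are cut off in this archive view, so the following is not a reconstruction of them. It is only a sketch of how a rule set of that general kind can be checked offline with Python's standard `urllib.robotparser`. The `/videos/` path, the file names, and the `ROBOTS_TXT` and `build_parser` names are all hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration: keep every robot out of one
# directory of large files and leave the rest of the site crawlable.
ROBOTS_TXT = """\
User-agent: *
Disallow: /videos/
"""


def build_parser(text):
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Ask whether a given crawler may fetch a given path under these rules.
print(parser.can_fetch("Baiduspider", "/videos/talk.ogv"))
print(parser.can_fetch("Baiduspider", "/index.html"))
```

This only shows what a well-behaved crawler would conclude from the file. robots.txt is advisory, so a spider that ignores it is unaffected.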

Re: [Dirvish] [administrivia] taming web spiders, particularly Baidu?

2012-07-23 Thread Dale Amon
On Mon, Jul 23, 2012 at 11:31:41AM -0700, Keith Lofstrom wrote:
> Is there any way to tell the search spiders to visit once a day
> or once a week, rather than four times per hour? Or send them
> "recent changes" lists instead of them repeatedly downloading the
> same files? Any other ideas for c

[Dirvish] [administrivia] taming web spiders, particularly Baidu?

2012-07-23 Thread Keith Lofstrom
This is not about dirvish, but about the website. Perhaps some of you sysadmins can help. You may occasionally see the dirvish.org website stop responding to web requests. dirvish.org is running on my virtual machine at rimuhosting in Dallas, along with half a dozen other low-usage sites. Some