On 2013-01-01 13:21, Gerard Beekmans wrote:
> Hi guys,
>
> After reviewing logs I ended up having to block the wget user agent in
> Apache for the time being. Pages such as
> http://www.linuxfromscratch.org/lfs/downloads/stable/ are causing issues
> with wget.
>
The block has been lifted and the w
On Tue, Jan 01, 2013 at 02:18:02PM -0600, Gerard Beekmans wrote:
>
> All this will be a moot point before this month (January) is out.
Excellent.
> Later today I will try an experiment to reconfigure Apache to turn off
> the sorting headers and allow wget again and monitor usage and make a
> Nope, that page is served out by Apache using its autoindex module.
>
> Gerard, we could just configure Apache to use
> 'SuppressColumnSorting'
> (http://httpd.apache.org/docs/2.2/mod/mod_autoindex.html#indexoptions) - it
> won't stop bots from downloading masses of data if that's what they're
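
For reference, a minimal sketch of what the 'SuppressColumnSorting' suggestion
above might look like in the Apache configuration. The directory path is a
placeholder and an assumption, not the real server layout; only the IndexOptions
keyword itself comes from the mod_autoindex documentation linked above.

```apache
# Hypothetical path: substitute the directory that holds the autoindexed pages.
<Directory "/path/to/htdocs/downloads">
    # The leading '+' adds this option to any IndexOptions already in effect
    # instead of replacing them. With it set, mod_autoindex no longer renders
    # the column headers as links for re-sorting the listing.
    IndexOptions +SuppressColumnSorting
</Directory>
```

As the message above notes, this only removes the sorting links from the
listing; it does nothing to stop a client that is determined to download
large amounts of data.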
> Is this something we can change in the future, somewhere in the xml,
> or is it another of those "we miss Manuel" moments ?
>
> For the content of that page, I have difficulty understanding what
> use the alternate orders provide - there are only six links plus the
> parent directory, and for
> Would an appropriate /robots.txt help things out?
>
Doesn't look like it. The "guilty" hosts never attempted to download
robots.txt files. Bots like Google do request those files and behave
properly but those aren't the ones causing issues or downloading
duplicate files. Nor do they show up as
On Tue, 2013-01-01 at 20:01 +, Ken Moffat wrote:
> On Tue, Jan 01, 2013 at 01:21:26PM -0600, Gerard Beekmans wrote:
> > Hi guys,
> >
> > After reviewing logs I ended up having to block the wget user agent in
> > Apache for the time being. Pages such as
> > http://www.linuxfromscratch.org/lfs
On Tue, Jan 01, 2013 at 01:21:26PM -0600, Gerard Beekmans wrote:
> Hi guys,
>
> After reviewing logs I ended up having to block the wget user agent in
> Apache for the time being. Pages such as
> http://www.linuxfromscratch.org/lfs/downloads/stable/ are causing issues
> with wget.
>
> The name
Gerard Beekmans wrote:
> Hi guys,
>
> After reviewing logs I ended up having to block the wget user agent in
> Apache for the time being. Pages such as
> http://www.linuxfromscratch.org/lfs/downloads/stable/ are causing issues
> with wget.
>
> The name, last modified, size and description headers a