On Sat, Jun 17, 2000 at 10:10:01AM +0200, Andreas J. Koenig wrote:
> >>>>> On Fri, 16 Jun 2000 21:45:44 -0500, Elaine -HFB- Ashton <[EMAIL PROTECTED]> said:
>
> > Concerns are:
>
> > - It's really big. Might it be time to segment it?
>
> I still love it as one big piece. I wouldn't mind producing additional
> lists chapterwise though. Would that fit the bill?
The catalog on the front page at search.cpan.org is supposed to be the module
list split up, but it needs work.
> > - The core modules. I noticed the other night that some guy was crawling
> > the heck out of search at the rate of 7 or 8 requests per second. This
> > didn't load the server too horribly until he started hitting the core
> > modules and docs because the tar.gz distribution has to be uncompressed
> > for each request. I was thinking perhaps that unless the query parameter
> > is present the engine could fetch it out of already decompressed sources.
> > It's a thought since it would be trivial to DoS the box with these
> > requests en masse.
>
> If search isn't programmed to be fast, we are in deep trouble. Maybe
> the code should be made publicly available and setting up mirrors of
> search should be made easy. That could serve two purposes: attract
> contributing programmers and later clusterize search services. Maybe
> such a tarball is available already?
No, it's not available yet. But the search right now is an SQL search. That
needs to change.
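
On the core-modules point above, here is a rough sketch of serving files
out of already-decompressed sources instead of unpacking the tar.gz on
every request. It is not how search works today. It keeps the unpacked
files in an in-memory cache rather than a pre-unpacked tree on disk, and
every name in it (TARBALLS, unpacked, fetch, the sample tarball) is made
up for illustration.

```python
import io
import tarfile
from functools import lru_cache


def make_sample_tarball() -> bytes:
    """Build a tiny in-memory .tar.gz so the sketch runs on its own."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        data = b"=head1 NAME\n\nExample - made-up pod\n"
        info = tarfile.TarInfo("perl/lib/Example.pm")
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


# Hypothetical stand-in for the distribution files on disk.
TARBALLS = {"perl.tar.gz": make_sample_tarball()}


@lru_cache(maxsize=32)
def unpacked(dist: str) -> dict:
    """Decompress a distribution once; later calls reuse the cached result."""
    files = {}
    with tarfile.open(fileobj=io.BytesIO(TARBALLS[dist]), mode="r:gz") as tar:
        for member in tar.getmembers():
            if member.isfile():
                files[member.name] = tar.extractfile(member).read()
    return files


def fetch(dist: str, path: str) -> bytes:
    """Return one file from a distribution without re-decompressing it."""
    return unpacked(dist)[path]
```

With something like this, a crawler hammering the same distribution costs
one decompression instead of one per request. The maxsize bound is an
arbitrary choice to keep memory use limited.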
> > - I thought the addition of the author pages might be a nice touch along
> > with the RFCs, etc. which I put up at
> > http://chaos.wustl.edu/~elaine/download/modulelist-E.html
> > Use if you like, I'm just procrastinating on a Friday night :)
>
> Sure, looks much better than before, thanks! I've replaced the thing
> on PAUSE's incoming directory with this fix.
>
> > - Deprecation. e.g., TOMC has Date-GetDate listed on search yet clicking
> > on it gives me a module listing ANDK. It's a nit, but people who haven't
> > been around will see that and wonder if they have the right module since
> > it seemed to change ownership without explanation. Yes, the README explains
> > the deal but maybe there can be some sort of tag for deprecated modules
> > and modules which have been passed from one author to another.
>
> This is most definitely a bug on search
Yes, clicking on a dist will always take you to the latest dist by that
name rather than the latest one by that author. It is something that needs fixing.
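
To make the difference concrete, a minimal sketch of the two lookups,
assuming a flat list of dist records. The record layout, the version
numbers and the function names are all invented for the example; only the
TOMC/ANDK Date-GetDate case comes from this thread.

```python
from typing import NamedTuple, Optional


class Dist(NamedTuple):
    author: str
    name: str
    version: str
    uploaded: int  # made-up ordering value, larger means newer


# Made-up records for the case discussed in this thread.
DISTS = [
    Dist("TOMC", "Date-GetDate", "0.01", 1),
    Dist("ANDK", "Date-GetDate", "0.02", 2),
]


def latest_by_name(name: str) -> Optional[Dist]:
    """Current behaviour as described: newest dist with this name, any author."""
    matches = [d for d in DISTS if d.name == name]
    return max(matches, key=lambda d: d.uploaded, default=None)


def latest_by_author(author: str, name: str) -> Optional[Dist]:
    """Possible fix: newest dist with this name uploaded by this author."""
    matches = [d for d in DISTS if d.name == name and d.author == author]
    return max(matches, key=lambda d: d.uploaded, default=None)
```

A link from TOMC's page would then call latest_by_author("TOMC",
"Date-GetDate") and stay on TOMC's upload instead of jumping to ANDK's.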
> What search doesn't know is that both TOMC and ANDK are on an access
> control list, so uploads from either of them will get indexed while
> uploads by anybody else will be ignored. We need either to propagate
> the ACL to search or search needs to follow
> modules/02packages.details.txt.gz more closely. I'm not sure which of
> the two.
Neither am I.
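
For the second option, a rough sketch of reading the index so search
could follow it: the file has a header block, a blank line, then one line
per package giving package name, version and path. The sample rows below
are made up, and parse_packages is a hypothetical helper, not existing
search code.

```python
# Made-up sample in the layout of modules/02packages.details.txt.
SAMPLE = """\
File:         02packages.details.txt
Description:  Package names found in directory $CPAN/authors/id/
Columns:      package name, version, path
Line-Count:   2

Date::GetDate  undef  A/AN/ANDK/Date-GetDate-0.01.tar.gz
Some::Module   1.02   E/EX/EXAMPLE/Some-Module-1.02.tar.gz
"""


def parse_packages(text: str) -> dict:
    """Map package name -> (version, author, dist path)."""
    # The header ends at the first blank line; everything after it is data.
    _header, _sep, body = text.partition("\n\n")
    index = {}
    for line in body.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        package, version, path = fields
        # Paths look like A/AN/ANDK/Dist-1.0.tar.gz; the third part is the author.
        author = path.split("/")[2]
        index[package] = (version, author, path)
    return index
```

For the real file, the text could be read with gzip.open(path, "rt")
before handing it to the parser. Whether this beats propagating the ACL
is a separate question.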
Graham.