Graham Barr [[EMAIL PROTECTED]] quoth:
*>>
*>> I still love it as one big piece. I wouldn't mind producing additional
*>> lists chapterwise though. Would that fit the bill?
*>
*>The catalog on the fron page at search.cpan.org is supposed to be the module
*>list split down. BUt it needs work.
I can have a go at it if you like. It is something of a hodgepodge at the
moment and it would be something I could manage. Some of them, like the
FrameMaker::, etc. should be noted in the database as placeholders and the
like.
*>> If search isn't programmed to be fast, we are in deep troubles. Maybe
*>> the code should be made publically available and setting up mirrors of
*>> search should be made easy. That could serve two purposes: attract
*>> contributing programmers and later clusterize search services. Maybe
*>> such a tarball is available already?
*>
*>No it's not avaliable yet. But the search right now is an SQL search. That
*>needs to change.
Speed isn't really the issue here as if someone hammered the site with
requests for core modules you could have a supercomputing cluster and
still drag it down. The idea would be to minimize the need for the
decompressing the Perl source by having it look for it in an uncompressed
directory and fetch it from there or, on last resort, go uncompress the
source. As far as actual speed is concerned, I've been impressed it can
handle 20k requests a day without really loading the server.
I wouldn't want to wish that upon mirrors without a warning or a solution.
The bummer about http is that it takes some work to deny access without a
firewall. My crawling little friend from .ar was using 'Offline Explorer'
which respected neither the robots.txt nor paid any attention to the 403
errors he was getting. He was denied, but he was still loading up the
server because it was trying to service the requests. I don't think the
person was actually working at doing this either and possibly just wanted
to collect data for a local copy for faster access. As it gets more
popular, so too will this become more common.
*>> Sure, looks much better than before, thanks! I've replaced the thing
*>> on PAUSE's incoming directory with this fix.
:)
*>YEs, clicking on a dist will always take you to the latest dist by that
*>name rather than just by the author. It is something that needs fixing.
*>
*>> What search doesn't know is that both TOMC and ANDK are on an access
*>> control list, so uploads from either of them will get indexed while
*>> uploads by anybody else will be ignored. We need either to propagate
*>> the ACL to search or search needs to follow
*>> modules/02packages.details.txt.gz more closely. I'm not sure which of
*>> the two.
*>
*>Neither am I
That depends on what kind of ACL it is. If it is simply an issue of
current version then the details file might be more useful. Something else
could be done to index the modules by version.
http://search.cpan.org/search?mode=author&query=TOMC gives you the right
output because he does still have the 2.00 version in his directory, but
clicking on it takes you to ANDK. Since ANDK holds the current version,
this is appropriate behaviour which is why I was suggesting that a visible
method of marking deprecation would be useful in cases like this. Or,
moving all deprecated modules to backpan might do as well.
e.