Writing simple code to trim down a URL is trivial, but trimming it down to its most meaningful form is very hard. In some cases the URL parameters actually define the page; in others they are useless babble. I'd like to use a hash of a page's URL, as well as a hash of its content, to help me eliminate duplicates. Are there any good, commonly used methods for URL stemming?
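To make the question concrete, here is a rough sketch of the kind of trimming and hashing I have in mind. Everything in it is an assumption for illustration: the function names (normalize_url, url_hash, content_hash), the IGNORED_PARAMS list, and the choice of SHA-1 are placeholders, not an established method. Which parameters count as "babble" depends on the site, which is exactly the hard part.

```python
import hashlib
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical list of parameters assumed to carry no page identity.
# This has to be tuned per site; it is not a standard list.
IGNORED_PARAMS = {"sessionid", "sid", "phpsessid",
                  "utm_source", "utm_medium", "utm_campaign"}

# Default ports that can be dropped without changing the resource.
DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_url(url):
    """Return a trimmed form of url suitable for duplicate detection."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()  # userinfo is dropped here

    # Keep the port only when it differs from the scheme's default.
    port = parts.port
    if port is not None and DEFAULT_PORTS.get(scheme) != port:
        host = f"{host}:{port}"

    path = parts.path or "/"

    # Drop ignored parameters, then sort so parameter order is irrelevant.
    query = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    )

    # The fragment never reaches the server, so it is discarded.
    return urlunsplit((scheme, host, path, urlencode(query), ""))


def url_hash(url):
    """Hash of the normalized URL (SHA-1 chosen arbitrarily)."""
    return hashlib.sha1(normalize_url(url).encode("utf-8")).hexdigest()


def content_hash(body):
    """Hash of the raw page bytes; catches identical pages at different URLs."""
    return hashlib.sha1(body).hexdigest()
```

With this, url_hash("http://example.com/page?a=1&b=2") and url_hash("http://EXAMPLE.com/page?b=2&a=1&sid=xyz") come out equal, while a page whose content really does depend on a parameter keeps a distinct hash as long as that parameter is not in the ignore list. An exact content hash only catches byte-identical pages, so near-duplicates would still slip through.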
--
Chris Fraschetti