The following module was proposed for inclusion in the Module List:

  modid:       Digest::ManberHash
  DSLIP:       bdcOg
  description: Estimating similariness in files
  userid:      PMAREK (Philipp Marek)
  chapterid:   17 (Archiving_and_Compression)
  communities:

  similar:
    String::Similarity String::Approx

  rationale:

    This module gives a number of hash values for any given file;
    these hash values can be used to compare files and get a value
    indicating how similar they are. Because this is not a single
    value per file, it cannot be replaced by MD5, SHA-1, or other
    cryptographic hashes.

    The difference between String::Similarity, String::Approx and
    this module is that this module may be used to compare BIG
    files. String::Similarity and String::Approx are (AFAIU) approx.
    O(N*M), whereas Digest::ManberHash is only O(N+M) (with N and M
    the sizes of the compared objects); but Digest::ManberHash works
    only for bigger data sets.

    For details please see http://manber.com/publications.html or
    ftp://ftp.cs.arizona.edu/reports/1993/TR93-33.ps

  enteredby:   PMAREK (Philipp Marek)
  enteredon:   Tue Aug 19 12:37:51 2003 GMT

The resulting entry would be:

Digest::
::ManberHash      bdcOg Estimating similariness in files             PMAREK

Thanks for registering,
--
The PAUSE

PS: The following links are only valid for module list maintainers:

Registration form with editing capabilities:
  https://pause.perl.org/pause/authenquery?ACTION=add_mod&USERID=a0400000_991065f3581374b9&SUBMIT_pause99_add_mod_preview=1
Immediate (one click) registration:
  https://pause.perl.org/pause/authenquery?ACTION=add_mod&USERID=a0400000_991065f3581374b9&SUBMIT_pause99_add_mod_insertit=1
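
To illustrate the rationale above (several hash values per file, compared in roughly O(N+M) time), here is a minimal Python sketch of fingerprint sampling in the spirit of the cited report. It is not the Digest::ManberHash implementation or its API. The function names, the window size, the bit mask, and the rolling-hash constants are all assumptions chosen for illustration.

```python
# Hypothetical sketch only: not the Digest::ManberHash code or interface.
WINDOW = 16          # bytes per hashed substring (assumed value)
MASK = 0x0F          # keep fingerprints whose low 4 bits are zero (assumed)
BASE = 257           # rolling-hash base (assumed)
MOD = (1 << 61) - 1  # rolling-hash modulus (assumed)


def fingerprints(data, window=WINDOW, mask=MASK):
    """Return the sampled rolling-hash fingerprints of `data` as a set."""
    if len(data) < window:
        return set()
    # Weight of the byte that leaves the window on each step.
    top = pow(BASE, window - 1, MOD)
    h = 0
    for byte in data[:window]:
        h = (h * BASE + byte) % MOD
    kept = set()
    if h & mask == 0:
        kept.add(h)
    # Slide the window one byte at a time: one pass over the data.
    for i in range(window, len(data)):
        h = ((h - data[i - window] * top) * BASE + data[i]) % MOD
        if h & mask == 0:
            kept.add(h)
    return kept


def similarity(a, b, window=WINDOW, mask=MASK):
    """Share of sampled fingerprints that the two inputs have in common."""
    fa = fingerprints(a, window, mask)
    fb = fingerprints(b, window, mask)
    if not fa or not fb:
        return 0.0
    return len(fa & fb) / len(fa | fb)
```

With mask=0 every window is kept, so similarity(b"abcd", b"bcde", window=2, mask=0) gives 0.5 (two shared windows out of four distinct ones). With a non-zero mask only a fraction of the windows survive, which is one way to read the remark that the approach works only for bigger data sets: small inputs may yield no fingerprints at all.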