The following module was proposed for inclusion in the Module List:

  modid:       Search::VectorSpace
  DSLIP:       adpOg
  description: Text search using vector-space model
  userid:      MACIEJ (Maciej Ceglowski)
  chapterid:   11 (String_Lang_Text_Proc)
  communities:
    Upcoming article on Perl.com

  similar:
    None that I could find

  rationale:
    Most search implementations use some kind of reverse index or
    keyword/document lookup table, often tied to a data file or RDBMS.
    Vector searches are neat in that they represent each document as a
    feature vector in a high-dimensional space, and calculate
    similarity based on linear algebra. On a practical level, they make
    it easy to keep the search engine in RAM (no disk or DB access
    except to retrieve the actual results), and they allow for
    sophisticated 'find similar' searches on one or more existing
    results. The module itself will be simple (to accompany a Perl.com
    tutorial), but allow for subclassing to enable sophisticated
    parsing, term weighting, etc.

  enteredby:   MACIEJ (Maciej Ceglowski)
  enteredon:   Wed Jan 22 04:19:15 2003 GMT

The resulting entry would be:

Search::
::VectorSpace     adpOg Text search using vector-space model     MACIEJ

Thanks for registering,
The Pause Team

PS: The following links are only valid for module list maintainers:

Registration form with editing capabilities:
  https://pause.perl.org/pause/authenquery?ACTION=add_mod&USERID=ef100000_7017acab3155e111&SUBMIT_pause99_add_mod_preview=1
Immediate (one click) registration:
  https://pause.perl.org/pause/authenquery?ACTION=add_mod&USERID=ef100000_7017acab3155e111&SUBMIT_pause99_add_mod_insertit=1
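
Illustrative note on the rationale: the vector-space idea described there (each document as a feature vector, similarity computed with linear algebra, everything held in RAM) can be sketched roughly as below. This is a minimal Python sketch and not the Search::VectorSpace API; the names tokenize, to_vector, cosine and VectorIndex are hypothetical, and raw term counts with cosine similarity are an assumed weighting choice that the proposal does not specify.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens (no stemming)."""
    return re.findall(r"[a-z]+", text.lower())


def to_vector(text, vocabulary):
    """Return raw term counts for the text, one slot per vocabulary term."""
    counts = Counter(tokenize(text))
    return [counts[term] for term in vocabulary]


def cosine(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class VectorIndex:
    """Hypothetical in-memory index: one term-count vector per document."""

    def __init__(self, documents):
        self.documents = list(documents)
        # The vocabulary fixes the dimensions of the vector space.
        self.vocabulary = sorted(
            {term for doc in self.documents for term in tokenize(doc)}
        )
        self.vectors = [to_vector(doc, self.vocabulary) for doc in self.documents]

    def _rank(self, vector, skip=None):
        """Score every document against the vector; keep positive scores, best first."""
        scored = [
            (cosine(vector, doc_vector), i)
            for i, doc_vector in enumerate(self.vectors)
            if i != skip
        ]
        hits = [(score, i) for score, i in scored if score > 0.0]
        hits.sort(key=lambda hit: (-hit[0], hit[1]))
        return [(self.documents[i], score) for score, i in hits]

    def search(self, query):
        """Rank documents by cosine similarity to a free-text query."""
        return self._rank(to_vector(query, self.vocabulary))

    def similar(self, doc_index):
        """'Find similar': rank the other documents against an indexed document."""
        return self._rank(self.vectors[doc_index], skip=doc_index)
```

In this sketch a 'find similar' lookup reuses an already indexed document's vector as the query, which is why it needs no extra machinery beyond the ranking step. Subclass-style extensions such as different parsing or term weighting would replace tokenize or to_vector.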