The following module was proposed for inclusion in the Module List:

  modid:       LSI
  DSLIP:       adpOg
  description: Latent semantic indexing toolkit
  userid:      MACIEJ (Maciej Ceglowski)
  chapterid:   11 (String_Lang_Text_Proc)
  communities:
    Will introduce at O'Reilly bioinformatics conference 2003. General
    writeup at http://javelina.cet.middlebury.edu/lsa/out/lsa_intro.htm

  similar:
    none

  rationale:

    Latent semantic indexing is a vector-based technique for indexing
    large document collections. LSI uses a dimensionality reduction
    technique called 'singular value decomposition' to vastly improve
    recall in searching such collections. LSI search engines can return
    relevant results even when a document does not contain an exact
    keyword match to the query. "Document" and "keyword" here can be
    defined very loosely - although traditionally LSI has been applied
    to natural language text, the technique is purely algebraic, and
    there are potential applications to DNA sequences, image files, and
    pretty much anything you can shoehorn into a vector model.

    While LSI has been a theoretical curiosity for many years, this is
    the first open-source implementation (to my knowledge), and the
    first practical toolkit usable by people outside the computational
    linguistics community. We hope to make the task of building
    vector-based search engines, visualization tools, and archive
    management tools much easier for the casual programmer.

    Because the toolkit includes visualization, clustering and other
    components that go beyond searching data, I felt the Search::
    namespace was overly specific. I do understand that the CPAN team
    may feel very strongly about creating a root-level LSI namespace.

  enteredby:   MACIEJ (Maciej Ceglowski)
  enteredon:   Sun Oct 13 21:28:44 2002 GMT

The resulting entry would be:

LSI               adpOg Latent semantic indexing toolkit             MACIEJ

Thanks for registering,
The Pause Team

PS: The following links are only valid for module list maintainers:

Registration form with editing capabilities:
  https://pause.perl.org/pause/authenquery?ACTION=add_mod&USERID=31300000_210131c3f58a35db&SUBMIT_pause99_add_mod_preview=1
Immediate (one click) registration:
  https://pause.perl.org/pause/authenquery?ACTION=add_mod&USERID=31300000_210131c3f58a35db&SUBMIT_pause99_add_mod_insertit=1
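
As a rough illustration of the rationale above, here is a minimal Python sketch of the idea behind latent semantic indexing. It is not code from the proposed LSI module, whose interface is not described in this message. The toy corpus, the function names, and the choice of k=2 latent dimensions are all assumptions made for the example, and the truncated singular value decomposition is approximated with plain power iteration rather than a production SVD routine. Documents and the query are compared after projecting them onto the top-k left singular vectors of the term-document matrix.

```python
import math

# Hypothetical toy corpus, chosen only to illustrate the idea.
docs = [
    "car engine",
    "automobile engine",
    "engine repair",
    "flower petal",
    "flower garden",
]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def term_vector(text, vocab):
    """Raw term counts for `text`, one entry per vocabulary term."""
    words = text.split()
    return [float(words.count(term)) for term in vocab]

def top_left_singular_vectors(cols, k, iters=300):
    """Approximate the top-k left singular vectors of the term-document
    matrix A whose columns are `cols`, using power iteration on A A^T and
    Gram-Schmidt orthogonalization against the vectors already found."""
    n = len(cols[0])
    basis = []
    for _component in range(k):
        v = normalize([float(i + 1) for i in range(n)])  # fixed start vector
        for _step in range(iters):
            # w = A (A^T v), accumulated one document column at a time
            w = [0.0] * n
            for col in cols:
                weight = dot(col, v)
                w = [wi + weight * ci for wi, ci in zip(w, col)]
            # remove the components along earlier singular vectors
            for b in basis:
                p = dot(w, b)
                w = [wi - p * bi for wi, bi in zip(w, b)]
            v = normalize(w)
        basis.append(v)
    return basis

def project(vec, basis):
    """Coordinates of a term-space vector in the reduced latent space."""
    return [dot(vec, b) for b in basis]

def cosine(u, v):
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))

vocab = sorted({word for d in docs for word in d.split()})
columns = [term_vector(d, vocab) for d in docs]
basis = top_left_singular_vectors(columns, k=2)

query = term_vector("car", vocab)
latent_query = project(query, basis)
latent_docs = [project(col, basis) for col in columns]

# Exact keyword overlap versus similarity in the latent space.
keyword_hits = [dot(query, col) for col in columns]
latent_scores = [cosine(latent_query, d) for d in latent_docs]

for text, hits, score in zip(docs, keyword_hits, latent_scores):
    print(f"{text!r}: keyword overlap={hits:.0f}, latent cosine={score:.2f}")
```

In this toy setting the query "car" shares no keyword with "automobile engine", yet the two land close together in the reduced space because both co-occur with "engine", while the flower documents stay far away. A real collection would need term weighting, a proper sparse SVD, and a far larger vocabulary.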