Hi,

while playing with the SynonymAnalyzer samples (pylucene-3.4) I discovered that the WordNet example is broken because its WordNet index is outdated. The SynonymAnalyzerTest works fine, but the SynonymAnalyzerViewer fails with:

  ...lucene.JavaError: org.apache.lucene.index.IndexFormatTooOldException: Format version is not supported in file 'segments': 44132 (needs to be between -1 and -11). This version of Lucene only supports indexes created with release 3.0 and later.
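
For anyone who wants to check an index before opening it with Lucene, here is a minimal sketch in plain Python (no PyLucene needed). It assumes that the number in the error message is simply the first four bytes of the 'segments' file read as a signed big-endian integer, and it takes the accepted range (-1 to -11) straight from the message above. The function names are my own and not part of the samples.

```python
import struct

# Range quoted in the error message; an assumption about what this
# Lucene release accepts, not taken from any specification.
SUPPORTED_MIN, SUPPORTED_MAX = -11, -1


def read_format_version(segments_path):
    """Return the first four bytes of a segments file as a signed big-endian int."""
    with open(segments_path, "rb") as f:
        header = f.read(4)
    if len(header) < 4:
        raise ValueError("file too short to contain a format version")
    return struct.unpack(">i", header)[0]


def is_supported(version):
    """True if the version lies in the range named in the error message."""
    return SUPPORTED_MIN <= version <= SUPPORTED_MAX
```

Calling read_format_version() on the 'segments' file of the unpacked index and passing the result to is_supported() should tell whether the index falls outside the accepted range, without going through the JVM.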

The WordNetSynonymEngine uses an index from the indexes.tgz archive, which it looks up in indexes\wordnet. This archive (dated 2004) seems to contain an index in an old Lucene format. I managed to find the files required to build the index for lucene-3.4 and adjusted the WordNetSynonymEngine to work with Lucene 3.4, and everything seems to be working again.

I've created an archive with the relevant changes and uploaded it to the pylucene-extra project, just in case anyone is interested:
http://code.google.com/a/apache-extras.org/p/pylucene-extra/downloads/list

BTW, who is maintaining/updating the samples that are included in the distribution?

It should be noted that the SynonymAnalyzer examples are based on the Lucene in Action (LIA) book and implement their own synonym support, whereas java-lucene-3.4 already ships synonym support in contrib (package org.apache.lucene.analysis.synonym). See the CHANGELOG:

  LUCENE-3233, LUCENE-3375: Added SynonymFilter for applying multi-word synonyms during indexing or querying (with parsers for wordnet and solr formats). Removed contrib/wordnet.

It's already included in the PyLucene core as lucene.SynonymFilter. However, I couldn't find any samples or tests for this new feature, so I will have to play with this one as well...

Let me know if anyone has experience with the new lucene.SynonymFilter and its possible advantages over the Python-based implementation (in pylucene-3.4\samples\LuceneInAction\lia\analysis\synonym).

regards
Thomas
--
OrbiTeam Software GmbH & Co. KG
Endenicher Allee 35
53121 Bonn - Germany
http://www.orbiteam.de