> One thing I was thinking of doing was checking the character frequency
An alternative is to do the fuzzification at index time rather than at query time. This is documented in one of the case studies in LIA (Lucene in Action). The principle is that you don't index or search for whole words; instead you use an n-gram analyzer to break them up at index time, so "Kylie" becomes multiple terms:

[ k] [ ky] [ kyl] [ky] [kyl] [kyli] [yl] [yli] [ylie] [ kylie ]

Obviously you use the same analyzer to process queries. Lucene will automatically look after the relevancy of partial matches for you, but your indexes will be bigger and your queries will generate many more Boolean clauses.
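To make the n-gram idea concrete, here is a rough sketch in plain Python. It is not Lucene's analyzer API; the function names `ngrams` and `overlap` and the gram sizes (2 to 4) are assumptions chosen for illustration, and it omits the leading-space padding used in the term list above to mark the start of a word.

```python
def ngrams(word, min_n=2, max_n=4):
    """Return every substring of length min_n..max_n, ordered by start position."""
    word = word.lower()
    grams = []
    for start in range(len(word)):
        for n in range(min_n, max_n + 1):
            if start + n <= len(word):
                grams.append(word[start:start + n])
    return grams


def overlap(indexed_word, query_word):
    """Count the distinct n-grams two words share (a crude stand-in for scoring)."""
    return len(set(ngrams(indexed_word)) & set(ngrams(query_word)))


if __name__ == "__main__":
    print(ngrams("Kylie"))
    # A misspelling still shares some grams with the indexed word,
    # while an unrelated word shares none.
    print(overlap("kylie", "kyle"))
    print(overlap("kylie", "madonna"))
```

Each gram is indexed as its own term, which is why the index grows, and a query run through the same splitter turns into one clause per gram, which is where the extra Boolean clauses come from.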